CjangCjengh / MoeGoe

Executable file for VITS inference
MIT License

Japanese phonemes issue? #24

Closed: Norgus closed this issue 1 year ago

Norgus commented 1 year ago

I noticed an issue with all three models I tested (with at least two voices in each): every one of them had trouble pronouncing yōon (拗音). The test text, which is non-exhaustive but extensive, is:

キャッチャー・ちょうきょう・ちゅうきゅう・カチューシャ・ショート・シュート・ひょうじ・ヒューズ・ファイア
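For anyone comparing their own samples against this text, a listening checklist of the yōon it contains can be sketched as below. This is only an illustration and not part of MoeGoe: the helper names `find_yoon` and `yoon_by_word` are hypothetical, and "yōon" is assumed here to mean a kana followed by a small ya/yu/yo. Under that narrow definition, ファ (small ァ) is not counted, although it appears in the test text.

```python
import re

# Test text quoted from the issue above.
TEST_TEXT = "キャッチャー・ちょうきょう・ちゅうきゅう・カチューシャ・ショート・シュート・ひょうじ・ヒューズ・ファイア"

# Assumption: a yōon is any hiragana or katakana character followed by
# a small ya/yu/yo. The separator "・" and the long-vowel mark "ー"
# fall outside both character ranges, so they never start a match.
YOON = re.compile(r"[ぁ-ゖァ-ヺ][ゃゅょャュョ]")


def find_yoon(text):
    """Return every yōon cluster in text, in order of appearance."""
    return YOON.findall(text)


def yoon_by_word(text):
    """Map each "・"-separated word to the yōon clusters it contains."""
    return {word: find_yoon(word) for word in text.split("・")}


if __name__ == "__main__":
    for word, clusters in yoon_by_word(TEST_TEXT).items():
        print(word, clusters)
```

Each printed cluster is a sound to listen for in the generated audio; a word with an empty list (here ファイア) contains no yōon under this definition.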

Samples can be found here: MoeGoe_voices_test.zip

In order, the models used were Bishoujo Mangekyou, HamidashiCreative, and Zero_no_Tukaima.

This rather essential part of Japanese speech does not sound like it is being reproduced correctly in any of the models I tested.

CjangCjengh commented 1 year ago

I tested the Zerotsukai and Hamikuri models with the same text. The output seemed quite normal (though with a strange accent). As for Bishoujo Mangekyou, I am not its author. demo.zip

Norgus commented 1 year ago

Was the output comparable to what I got? I'd like to know if it's a local problem or they really sound like that.

On Sat, 19 Nov 2022, 07:47 CjangCjengh wrote:

I tested the Zerotsukai and Hamikuri models with the same text. The output seemed quite normal (though with a strange accent). As for Bishoujo Mangekyou, I am not its author. demo.zip https://github.com/CjangCjengh/MoeGoe/files/10046429/demo.zip

Norgus commented 1 year ago

I just noticed I haven't tried your models yet, I'll report back after trying them.

Edit: I didn't read properly on the phone, the two you mentioned were two of my test cases!

The question in my previous post remains though: was your output different from what I got?

CjangCjengh commented 1 year ago

Was your output different from what I got?

Yes. The two wav files included in demo.zip are my outputs.

Norgus commented 1 year ago

Thanks, I finally worked out how to open the zip on my phone, and your output does sound right to me. I'll try a fresh download when I get back to my desktop and see whether that fixes it.

Norgus commented 1 year ago

Sorry, I'm not sure what I broke before, but it works correctly on a fresh download.