suno-ai / bark

🔊 Text-Prompted Generative Audio Model
MIT License

weak ability of chinese #222

Open gnimyang opened 1 year ago

gnimyang commented 1 year ago

I can make out some Chinese words, but the output has a heavy English accent; most words sound more like English than Chinese. The model should be trained on sample data from different languages.

cliooox commented 1 year ago

watch more American dramas

gnimyang commented 1 year ago

Sorry for raising this issue. To use a different language you need to change the speaker preset: for Chinese, use one of zh_speaker_0 through zh_speaker_9, and likewise for other languages. I apologize for not reading the tips carefully. I suggest the authors highlight this information in the documentation.

bghira commented 1 year ago

try using hi_speaker_3 to read Chinese. it is cool.

shawhu commented 1 year ago

I'm curious, from speaker library: https://suno-ai.notion.site/8b8e8749ed514b0cbf3f699013548683?v=bc67cff786b04b50b3ceb756fd05f68c

If I select english, there's only one female voice/speaker(speaker9) available?

And all the Chinese ones are not great (I tried most of them); they all have a mild to severe mid-west accent (not talking about the mid-west accent in the US). Or maybe I missed something; in that case, which one do you recommend?

And lastly, it took me a day (actually a solid 2 hours) to troubleshoot and get it working on Windows. I've read the code you released. One thing I can be sure of is that you don't use Windows. The notebook examples can't run on Windows. In the code, when you create ALLOWED_PROMPTS, os.path.sep on Windows is "\", not "/", which caused a lot of problems and wasted my precious 2 hours. You could have written a very short comment in your notebook example, saying SPEAKER = "v2\en_speaker_6" #windows user should do this
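A minimal sketch of the workaround described above, assuming (as this comment reports) that the Bark version in use builds its allowed prompt names with `os.path.sep`. The helper `native_speaker_name` is hypothetical and not part of Bark; later Bark versions may not need it.

```python
import os


def native_speaker_name(name: str, sep: str = os.path.sep) -> str:
    """Rewrite a preset name so it uses the given path separator.

    Hypothetical helper, not part of Bark. Accepts a forward slash or a
    backslash in the input and emits `sep`, which defaults to the
    separator of the current platform.
    """
    parts = name.replace("\\", "/").split("/")
    return sep.join(parts)


# On Windows this uses a backslash after "v2"; on Linux and macOS
# the name is returned unchanged.
SPEAKER = native_speaker_name("v2/en_speaker_6")
```

Passing `sep` explicitly makes it possible to check the Windows behaviour from any platform.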

Also, I read that it loads all models into memory and takes about 12 GB of VRAM, but mine only used 5,6 GB. What's wrong? I'm pretty sure I didn't set the environment variables that load the small models.

And for a 3090, what generation speed can I expect? How do I get near-real-time generation? I read that with a nightly build and a fancy GPU you can get near-real-time speed, but how fancy? Which one exactly?

That being said, when everything works, it really works fine. The quality of the generated audio is amazing.

mcamac commented 1 year ago

thank you for the feedback

More speaker prompts are definitely planned, and Chinese quality is on our list of improvements. Re Windows - We will add this to the notebook - great point.

What generation speeds are you getting on a 3090?
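One way to report a comparable number is the real-time factor: seconds of compute per second of generated audio. The sketch below is an illustration, not something from the Bark codebase; `real_time_factor` and `time_generation` are hypothetical names, and the 24 kHz constant mirrors the `SAMPLE_RATE` value that Bark exports.

```python
import time

SAMPLE_RATE = 24_000  # assumed to match bark.SAMPLE_RATE


def real_time_factor(num_samples: int, elapsed_s: float,
                     sample_rate: int = SAMPLE_RATE) -> float:
    """Seconds of compute per second of audio; below 1.0 beats real time."""
    audio_s = num_samples / sample_rate
    return elapsed_s / audio_s


def time_generation(generate, text: str) -> float:
    """Time one call to `generate(text)`, which must return a 1-D audio array."""
    start = time.perf_counter()
    audio = generate(text)
    elapsed = time.perf_counter() - start
    return real_time_factor(len(audio), elapsed)
```

With Bark installed, `time_generation(generate_audio, "some text")` would give the figure for one prompt. Calling `preload_models()` first keeps model loading out of the measurement.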

LiuGuanXin commented 1 year ago

Hello, how do I switch to the zh_speaker_0 through zh_speaker_9 voices?

realcarlos commented 1 year ago

I guess, like this: generate_audio(text, history_prompt="v2/en_speaker_3")
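A slightly fuller sketch of the same idea for Chinese, assuming the presets follow the `v2/<lang>_speaker_<n>` pattern with `zh` as the Chinese language code. `speaker_preset` and `speak_chinese` are hypothetical helpers; only `generate_audio`, `preload_models`, and `SAMPLE_RATE` come from Bark, and `write_wav` from SciPy.

```python
def speaker_preset(lang: str, index: int, version: str = "v2") -> str:
    """Build a preset name such as "v2/zh_speaker_0" (hypothetical helper)."""
    if not 0 <= index <= 9:
        raise ValueError("expected a speaker index from 0 to 9")
    return f"{version}/{lang}_speaker_{index}"


def speak_chinese(text: str, index: int = 0,
                  out_path: str = "zh_sample.wav") -> None:
    """Generate `text` with a Chinese preset and write it to a WAV file."""
    # Imported lazily so the helper above works without Bark installed.
    from bark import SAMPLE_RATE, generate_audio, preload_models
    from scipy.io.wavfile import write as write_wav

    preload_models()
    audio = generate_audio(text, history_prompt=speaker_preset("zh", index))
    write_wav(out_path, SAMPLE_RATE, audio)
```

On Windows, the separator in the preset name may need adjusting, as discussed earlier in this thread.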

YANGYANGZXLQ commented 1 year ago

Why can't you understand the generated Chinese? It's not read according to Chinese. What can't you understand and read

YiQiu1984 commented 1 year ago

Even though I used the zh_speaker_0 through zh_speaker_9 voices, most of the words still sound a bit weird.

jeffzhengye commented 1 year ago

Just curious how the Chinese models were trained: 1. by hiring Westerners who can speak Chinese to get training data, or 2. by transfer learning from English? BTW, TTS for English is amazing.

RickyWang111 commented 1 year ago

I think so too. This Chinese pronunciation is difficult for native Chinese speakers to understand.

yiv commented 1 year ago

Excuse me, is there any progress on this issue, or a schedule?

zhengjiedna commented 1 year ago

It's been four months and nothing seems to be happening

JonathanFly commented 1 year ago

If a native Chinese speaker has some time, you can improve Bark output for the community overall by simply using your ears. Give Bark lots of Chinese text prompts, long prompts 3 sentences or more. Use random voices. Save all the bark outputs as new voices with the save_as_prompt() function. (Don't use the same text prompt every time, try a lot of different ones.)

Listen to the outputs. If every single Bark Chinese sample with a random voice has a bad accent, maybe it's a problem in Bark. If even a small number of the voices have a good accent, then post those samples here or somewhere else as new Chinese Bark voices. Post these good-accent samples in .npz format, not as .wav files. Those can be used as new Bark voices.
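The collection loop described above might look like the following sketch. It assumes that calling `generate_audio` without a `history_prompt` yields an arbitrary voice and that `output_full=True` returns the full generation alongside the audio, which is how `save_as_prompt` appears to be meant to be used. The file naming scheme and both function names are hypothetical.

```python
from pathlib import Path


def candidate_path(out_dir: str, index: int) -> str:
    """Hypothetical naming scheme: one .npz voice file per generated sample."""
    return str(Path(out_dir) / f"zh_candidate_{index:03d}.npz")


def collect_candidates(texts, out_dir: str = "candidates") -> None:
    """Generate each text with no voice prompt; save a voice file and a WAV."""
    # Imported lazily so the naming helper works without Bark installed.
    from bark import SAMPLE_RATE, generate_audio, preload_models
    from bark.api import save_as_prompt
    from scipy.io.wavfile import write as write_wav

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    preload_models()
    for i, text in enumerate(texts):
        # No history_prompt: each sample gets whatever voice Bark picks.
        full_generation, audio = generate_audio(
            text, history_prompt=None, output_full=True
        )
        npz_path = candidate_path(out_dir, i)
        save_as_prompt(npz_path, full_generation)  # reusable voice file
        write_wav(npz_path[:-4] + ".wav", SAMPLE_RATE, audio)  # for listening
```

After listening to the WAV files, the matching .npz files for the good-sounding samples are the ones to share.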

shirubei commented 11 months ago

After trying 40 sentences (in the attached file), all of them failed to produce the correct accent, and some were only noise. Please notify me of any progress. I would like to help. prompts_checked.txt

bghira commented 10 months ago

from what i have been told, Suno will no longer be updating weights or releasing models. i don't think they are interested in fixing these issues.