Open gnimyang opened 1 year ago
watch more American dramas
Sorry about this issue. To use a different language you need to switch to a speaker model for that language: for Chinese, use one of zh_speaker_0 through zh_speaker_9, and likewise for other languages. I am sorry I did not read the tips carefully. I suggest the author highlight this information in the documentation.
Try using hi_speaker_3 to read Chinese. It is cool.
I'm curious about the speaker library: https://suno-ai.notion.site/8b8e8749ed514b0cbf3f699013548683?v=bc67cff786b04b50b3ceb756fd05f68c
If I select English, is there only one female voice/speaker (speaker 9) available?
Also, the Chinese ones are not great (I tried most of them): they all have a mild to severe "Midwest" accent (and I don't mean the Midwest accent in the US). Or maybe I missed something; in that case, which one do you recommend?
Lastly, it took me a day (actually a solid 2 hours) to troubleshoot and get it working on Windows. I've read the code you released, and one thing I can be sure of is that you don't use Windows. The notebook examples can't run on Windows as written. In the code, when you create ALLOWED_PROMPTS, os.path.sep on Windows is "\", not "/", which caused a lot of problems and cost me my precious 2 hours. You could have written a very short comment in the notebook example, such as: SPEAKER = "v2\en_speaker_6" #windows user should do this
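The separator problem described above can be sidestepped by building the speaker name from os.path.sep instead of hard-coding a slash. This is only a sketch based on the description in this comment (that the allowed prompt names are built with os.path.sep); speaker_prompt and normalize_prompt are hypothetical helper names, not part of Bark.

```python
import os


def speaker_prompt(lang: str, n: int, sep: str = os.path.sep) -> str:
    """Build a v2 speaker name using the given path separator.

    Hypothetical helper: assumes the library expects "v2<sep><lang>_speaker_<n>".
    """
    return f"v2{sep}{lang}_speaker_{n}"


def normalize_prompt(name: str, sep: str = os.path.sep) -> str:
    """Rewrite a speaker name written with either slash style to use `sep`."""
    return name.replace("\\", "/").replace("/", sep)


if __name__ == "__main__":
    # On Windows this prints v2\en_speaker_6, elsewhere v2/en_speaker_6.
    print(speaker_prompt("en", 6))
    print(normalize_prompt("v2/en_speaker_6"))
```

With this, a notebook could set SPEAKER = speaker_prompt("en", 6) and behave the same on Windows and Linux.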
Also, I read that it loads all models into memory and takes about 12 GB of VRAM, but mine only took 5 to 6 GB. What's wrong? I'm pretty sure I didn't set the environment variables that load the small models.
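One way to rule the environment variables out is to print them from the same Python process that runs Bark. The sketch below assumes the variable names SUNO_USE_SMALL_MODELS and SUNO_OFFLOAD_CPU from the Bark README; the set of "truthy" strings is a guess at how the library parses them, and flag_enabled and report_memory_flags are hypothetical helpers.

```python
import os

# Assumption: values treated as "on"; Bark's own parsing may differ.
TRUTHY = {"true", "1", "t"}


def flag_enabled(name, env=os.environ):
    """Return True if the named environment variable looks switched on."""
    return env.get(name, "False").strip().lower() in TRUTHY


def report_memory_flags(env=os.environ):
    """Report the memory-related flags (names assumed from the README)."""
    names = ("SUNO_USE_SMALL_MODELS", "SUNO_OFFLOAD_CPU")
    return {name: flag_enabled(name, env) for name in names}


if __name__ == "__main__":
    print(report_memory_flags())
```

If both flags report False, the lower memory use has some other cause that this check does not identify.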
And for a 3090, what generation speed can I expect? How do I get near real-time generation? I read that with a nightly build and a fancy GPU you can get near real-time speed, but how fancy? Which one exactly?
That being said, when everything works, it really works well. The quality of the generated audio is amazing.
Thank you for the feedback.
More speaker prompts are definitely planned, and Chinese quality is on our list of improvements. Re Windows - We will add this to the notebook - great point.
What generation speeds are you getting on a 3090?
Hello, how do I switch to the zh_speaker_0 through zh_speaker_9 models?
I guess, like this: generate_audio(text, history_prompt="v2/en_speaker_3")
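For Chinese specifically, the same call would take a Chinese speaker name. A minimal sketch, assuming the bark package is installed and that the Chinese presets are named v2/zh_speaker_0 through v2/zh_speaker_9; chinese_speaker and speak_chinese are hypothetical helpers, and on Windows the separator may need to be a backslash, as noted earlier in this thread.

```python
def chinese_speaker(n: int) -> str:
    """Return the assumed preset name for Chinese speaker n (0 to 9)."""
    if not 0 <= n <= 9:
        raise ValueError("speaker index must be between 0 and 9")
    return f"v2/zh_speaker_{n}"


def speak_chinese(text: str, n: int = 0):
    """Generate audio for Chinese text with the chosen speaker preset."""
    # Imported lazily so the helper above works without bark installed.
    from bark import generate_audio, preload_models

    preload_models()  # downloads and loads the models on first use
    return generate_audio(text, history_prompt=chinese_speaker(n))
```

Calling speak_chinese("你好,很高兴认识你。", n=3) would return the audio array for speaker 3.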
Why can't you understand the generated Chinese? Is it not read the way Chinese is spoken? What exactly can't you understand?
Even though I used the zh_speaker_0 through zh_speaker_9 models, most words still sound a bit odd.
Just curious: how were the Chinese models trained? 1. By hiring Western speakers of Chinese to produce training data? Or 2. by transfer learning from English? BTW, the TTS for English is amazing.
I think so too. This Chinese pronunciation is difficult for native Chinese speakers to understand.
Pardon me, is there any progress on this issue, or a schedule?
It's been four months and nothing seems to be happening.
If a native Chinese speaker has some time, you can improve Bark output for the whole community simply by using your ears. Give Bark lots of Chinese text prompts, long ones of 3 sentences or more. Use random voices. Save all the Bark outputs as new voices with the save_as_prompt() function. (Don't use the same text prompt every time; try a lot of different ones.)
Listen to the outputs. If every single Chinese sample from a random voice has a bad accent, maybe it's a problem in Bark. If even a small number of the voices have a good accent, post those samples here or somewhere else as new Chinese Bark voices. Post the good-accent samples in .npz format, not as .wav files, so they can be used as new Bark voices.
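The procedure above could be scripted roughly as follows. This is a sketch, assuming the bark package is installed, that generate_audio accepts output_full=True and then returns the full generation alongside the audio array, and that save_as_prompt lives in bark.api; voice_path and generate_candidates are hypothetical helpers, and the file naming is arbitrary.

```python
import os


def voice_path(out_dir: str, index: int) -> str:
    """Build the .npz path for candidate voice number `index`."""
    return os.path.join(out_dir, f"zh_candidate_{index:03d}.npz")


def generate_candidates(texts, out_dir="zh_voices"):
    """Generate each text with no history prompt and save the result as a voice.

    Returns a list of (npz_path, audio_array) pairs for listening tests.
    """
    # Imported lazily so voice_path works without bark installed.
    from bark import generate_audio, preload_models
    from bark.api import save_as_prompt

    os.makedirs(out_dir, exist_ok=True)
    preload_models()
    results = []
    for i, text in enumerate(texts):
        # history_prompt=None lets the model pick an arbitrary voice.
        full_generation, audio = generate_audio(
            text, history_prompt=None, output_full=True
        )
        path = voice_path(out_dir, i)
        save_as_prompt(path, full_generation)
        results.append((path, audio))
    return results
```

Any candidate that sounds right can later be reused by passing its .npz path as the history prompt.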
After trying 40 sentences (in the attached file), none produced the correct accent, and some outputs were only noise. Please notify me of any progress. I would like to help. prompts_checked.txt
From what I have been told, Suno will no longer be updating weights or releasing models. I don't think they are interested in fixing these issues.
I can make out some Chinese words, but the speech is full of English-style pronunciation; most words sound more like English than Chinese. The model should be trained on sample data from the different languages.