Chinese audio for less than 15 seconds

C0untFloyd / bark-gui

🔊 Text-Prompted Generative Audio Model with Gradio

MIT License

674 stars 63 forks

Closed: HaSaKiYasuooo closed this issue 1 year ago

HaSaKiYasuooo commented 1 year ago

I used a Chinese speech model to generate audio, and the generated audio can never exceed 15 seconds. How can I solve this problem?

anacondabitch commented 1 year ago

Can you tell me how good the quality of the audio generated by the model was?

C0untFloyd commented 1 year ago

By "Chinese model" you probably mean a Chinese speaker, and you used Swap Voice for generation? Otherwise there is no 15 s limitation in this fork.

HaSaKiYasuooo commented 1 year ago

Can you tell me how good the quality of the audio generated by the model was?

Bark is good at English but terrible at other languages.

HaSaKiYasuooo commented 1 year ago

By "Chinese model" you probably mean a Chinese speaker, and you used Swap Voice for generation? Otherwise there is no 15 s limitation in this fork.

So how do I fix the text that goes missing from the audio when generating Chinese speech?

C0untFloyd commented 1 year ago

Are you sure you're using this fork? As I wrote previously, I don't have an audio limit.

Bark is good at English but terrible at other languages.

I disagree; you just have to find a good speaker voice. At least that holds for the languages I know; it might be different for Chinese 😏

HaSaKiYasuooo commented 1 year ago

I have tried many times to generate Chinese audio, but the results are not satisfactory. Could you please suggest some solutions?
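For reference, one workaround that is often tried when text goes missing from long inputs is to split the text into sentence-sized chunks before generation, so that each pass handles only a short piece. The sketch below only covers the splitting step for Chinese text. It assumes, without confirmation in this thread, that the missing text comes from over-long input and that the splitter in use does not break on full-width punctuation such as `。`. The function name `split_for_tts` and the `max_chars` default are hypothetical and not part of bark-gui.

```python
import re

# Sentence-ending punctuation: full-width (Chinese) and ASCII variants.
_ENDINGS = "。！？；.!?;"
# One sentence = a run of non-ending characters plus any trailing endings.
_SENTENCE = re.compile(rf"[^{_ENDINGS}]+[{_ENDINGS}]*")


def split_for_tts(text, max_chars=50):
    """Group whole sentences into chunks of at most max_chars characters.

    Hypothetical helper. A single sentence longer than max_chars is kept
    whole rather than cut in the middle.
    """
    sentences = [s for s in _SENTENCE.findall(text) if s.strip()]
    chunks, current = [], ""
    for sentence in sentences:
        # Start a new chunk when adding this sentence would exceed the limit.
        if current and len(current) + len(sentence) > max_chars:
            chunks.append(current.strip())
            current = sentence
        else:
            current += sentence
    if current.strip():
        chunks.append(current.strip())
    return chunks


if __name__ == "__main__":
    for chunk in split_for_tts("你好。今天天气很好！我们去公园吧？", max_chars=10):
        print(chunk)
```

Each chunk would then be generated on its own and the resulting audio segments joined in order. The `max_chars` value is a guess that needs tuning per speaker and language.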

C0untFloyd commented 1 year ago