C0untFloyd / bark-gui

🔊 Text-Prompted Generative Audio Model with Gradio
MIT License
674 stars 63 forks

Chinese audio for less than 15 seconds #36

Closed HaSaKiYasuooo closed 1 year ago

HaSaKiYasuooo commented 1 year ago

I used a Chinese speech model to generate audio, and the generated audio can never exceed 15 seconds. How can I solve this problem?

anacondabitch commented 1 year ago

Can you tell me how the quality of the audio generated by the model was?

C0untFloyd commented 1 year ago

By "Chinese model" you probably mean a Chinese speaker, and you used Swap Voice for generation? Otherwise there is no 15s limitation in this fork?!?

HaSaKiYasuooo commented 1 year ago

Can you tell me how the quality of the audio generated by the model was?

Bark is good at English but terrible at other languages.

HaSaKiYasuooo commented 1 year ago

By "Chinese model" you probably mean a Chinese speaker, and you used Swap Voice for generation? Otherwise there is no 15s limitation in this fork?!?

So how do I fix parts of the text going missing from the audio when generating Chinese speech?

C0untFloyd commented 1 year ago

Are you sure you're using this fork? As I wrote previously, I don't have an audio limit.

Bark is good at English but terrible at other languages.

I disagree, you just have to find a good speaker voice. At least for the languages I know; it might be different for Chinese.

HaSaKiYasuooo commented 1 year ago

I have tried many times to generate Chinese audio, but the results are not satisfactory. Could you please suggest some solutions?

C0untFloyd commented 1 year ago

Is it this problem? https://github.com/suno-ai/bark/issues/324
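
For context on the length question above: upstream Bark produces only a short clip per prompt (on the order of 13 to 15 seconds), so longer text is usually split into short pieces that are generated one by one and then concatenated. Below is a minimal sketch of the splitting step only, assuming the input is Chinese text that ends its sentences with full-width punctuation. The function name `split_into_chunks` and the `max_chars` parameter are hypothetical and are not part of bark-gui or Bark. The right chunk size for Chinese is also an assumption that would need tuning by ear.

```python
import re

# A "sentence" is a run of non-terminator characters followed by any
# terminators. The terminator set (Chinese full-width plus ASCII ! and ?)
# is an assumption about the input text.
_SENTENCE = re.compile(r"[^。！？；!?]+[。！？；!?]*")


def split_into_chunks(text: str, max_chars: int = 40) -> list[str]:
    """Split text at sentence boundaries and pack sentences into chunks.

    Each chunk holds whole sentences and stays within max_chars when
    possible. A single sentence longer than max_chars is kept intact
    as its own chunk rather than being cut mid-sentence.
    """
    sentences = [s.strip() for s in _SENTENCE.findall(text) if s.strip()]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Start a new chunk if adding this sentence would exceed the budget.
        if current and len(current) + len(sentence) > max_chars:
            chunks.append(current)
            current = ""
        current += sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = "今天天气很好。我们去公园散步吧！你觉得怎么样？"
    for i, chunk in enumerate(split_into_chunks(sample, max_chars=16)):
        print(i, chunk)
```

Each chunk would then be passed to the generator separately and the resulting audio arrays joined in order. Keeping chunks short also makes it easier to spot which sentence loses text, which is the symptom described earlier in this thread.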