fishaudio / fish-speech

Brand new TTS solution
https://speech.fish.audio

Does v1.4 model respect punctuation? #542

Open hoveychen opened 2 months ago

hoveychen commented 2 months ago

1. Is this request related to a challenge you're experiencing? Tell me about your story.

We've tried synthesizing voices from both English and Chinese dialog scripts, and the model seems to ignore punctuation such as question marks.

Example:

2. Additional context or comments

No response

3. Can you help us with this feature?

czkoko commented 2 months ago

I found that if there is a semicolon (;) in the sentence, the entire output turns into noise.

Stardust-minus commented 2 months ago

The training data was processed by a punctuation normalizer, so you should apply the same normalizer to the input text at inference.

leng-yue commented 2 months ago

Not sure why, but technically punctuation is normalized here: https://github.com/fishaudio/fish-speech/blob/c7c8c943c966a03a85ce4a61bca605f1d9bf7567/fish_speech/text/clean.py#L28 We are fine-tuning the model so that it understands different punctuation marks.
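
For anyone who wants to experiment with pre-normalizing input text before inference, here is a minimal sketch of what a punctuation normalizer can look like. It is not copied from `fish_speech/text/clean.py`: the `PUNCT_MAP` table and the `normalize_punctuation` function are hypothetical names, and the choice to map semicolons and colons to commas is an assumption made for illustration, not confirmed project behavior.

```python
import re

# Hypothetical mapping for illustration only; NOT the table used by fish-speech.
# Full-width (CJK) marks are mapped to ASCII, and semicolons/colons are
# replaced with commas on the assumption that the model handles commas better.
PUNCT_MAP = {
    "；": ",",
    ";": ",",
    "：": ",",
    ":": ",",
    "，": ",",
    "、": ",",
    "。": ".",
    "！": "!",
    "？": "?",
}

# One alternation pattern that matches any key of PUNCT_MAP.
_PUNCT_RE = re.compile("|".join(re.escape(p) for p in PUNCT_MAP))


def normalize_punctuation(text: str) -> str:
    """Replace mapped punctuation and collapse repeated whitespace."""
    text = _PUNCT_RE.sub(lambda m: PUNCT_MAP[m.group(0)], text)
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    print(normalize_punctuation("Wait; really?"))  # Wait, really?
    print(normalize_punctuation("你好；世界？"))  # 你好,世界?
```

Running the text through a function like this before synthesis keeps the input closer to what the model saw in training, which may help with the semicolon case reported above. Whether it also restores question intonation depends on the model, not on the normalizer.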