Closed Manamama closed 2 years ago
IMHO this isn't a bug. You could increase max_decoder_steps in config.json, but this might lead to problems when synthesizing shorter phrases. Alternatively, split your text and synthesize it as multiple shorter phrases.
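To illustrate the "split your text" suggestion, here is a minimal sketch that cuts the input at sentence-ending punctuation and prepares one tts call per sentence. It only uses the --text and --out_path options shown in this thread; the helper names (split_sentences, build_commands) and the part_NNN.wav naming are made up for the example, and the regex split is naive (it will also break after abbreviations such as "e.g.").

```python
import re
import shlex


def split_sentences(text):
    """Naively split text after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def build_commands(text, prefix="part"):
    """Return one `tts` argument list per sentence (hypothetical file naming)."""
    commands = []
    for i, sentence in enumerate(split_sentences(text)):
        out_path = f"{prefix}_{i:03d}.wav"
        commands.append(["tts", "--text", sentence, "--out_path", out_path])
    return commands


if __name__ == "__main__":
    sample = (
        "Nowhere in those kerosene years could she find a soft-headed match. "
        "The wife crossed over an ocean, red-faced and cheerless."
    )
    # Print the commands instead of running them, so nothing is synthesized here.
    for cmd in build_commands(sample):
        print(shlex.join(cmd))
```

Each printed line can be run in a shell, or passed to subprocess.run(cmd, check=True) once the output looks right. The resulting WAV files then still have to be played or joined in order.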
Thanks. However, I could not find max_decoder_steps in the config.json located in the installed package:
~/.local/lib/python3.8/site-packages/TTS/server/conf.json
nor under
~/.local/lib/python3.8/site-packages/TTS/tts
On a quick look, I could not find it in the JSON files of the originally cloned repository, either.
It also seems to be missing from the tts command-line parameters. Where is that option located? (Do bear with me here, as I am not a developer or programmer.)
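One way to hunt for the option without reading every file by hand is to scan a directory tree for JSON files that contain the key. The sketch below is only an illustration: the function names are invented here, and it assumes that some config files may not be strict JSON (for instance if they contain comments), in which case it falls back to a plain text match.

```python
import json
import os


def find_key(node, key, path=""):
    """Yield dotted paths at which `key` occurs inside nested JSON data."""
    if isinstance(node, dict):
        for k, v in node.items():
            here = f"{path}.{k}" if path else k
            if k == key:
                yield here
            yield from find_key(v, key, here)
    elif isinstance(node, list):
        for i, v in enumerate(node):
            yield from find_key(v, key, f"{path}[{i}]")


def search_json_files(root, key):
    """Map every .json file under `root` that mentions `key` to its locations."""
    hits = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(".json"):
                continue
            full = os.path.join(dirpath, name)
            try:
                with open(full, encoding="utf-8", errors="replace") as fh:
                    raw = fh.read()
            except OSError:
                continue  # unreadable file, skip it
            try:
                found = list(find_key(json.loads(raw), key))
            except ValueError:
                # Not strict JSON (assumption: e.g. comments); fall back to text search.
                found = ["<raw text match>"] if f'"{key}"' in raw else []
            if found:
                hits[full] = found
    return hits
```

For example, search_json_files(os.path.expanduser("~/.local/lib/python3.8/site-packages/TTS"), "max_decoder_steps") would list any matching files under the installed package. If nothing turns up there, the same call can be pointed at the cloned repository or at wherever the downloaded model files are stored on your system.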
If "it's not a bug but a feature", shouldn't the input get split automatically to accommodate this? The log does say:
> Text splitted to sentences.
FYI, the garbled output also happens with other models, though less often with other longer sample texts; I have tested only 5 samples so far.
Another bug, though most probably due to the paucity of the models themselves: the "Dearest Creature in Creation" poem is mispronounced, while e.g. Google's and Nuance's TTS handle it fine.
I'd suggest trying other English models. The default model can sometimes garble.
Comparing this open source project with Google and Nuance is flattering but hard to match :)
I'm moving this to discussions as it is not a dev issue.
Describe the bug
Generation of longer phrases seems to garble the output after about the 30-second mark. The resulting files sound random: one was 60 seconds long, while the one attached lasts only 40 seconds.
My box:
To Reproduce
tts --text "Nowhere in those kerosene years could she find a soft-headed match. The wife crossed over an ocean, red-faced and cheerless. She traded the flat pad of a stethoscope for a dining hall spatula. Life is two choices, she thinks: you hatch a life, or you pass through one. Photographs of a child swaddled in layers arrived by post. Money didn’t, to her embarrassment." --out_path test.wav && mplayer test.wav
test1.wav.tar.gz test3.wav.tar.gz
Expected behavior
Non-garbled WAV file
Logs
Additional context
The generated WAV files have random length.
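To put numbers on the varying lengths, the duration of each generated file can be read with Python's standard wave module. This is a small sketch; the function name is made up for the example, and it assumes the output is an uncompressed PCM WAV file that the wave module can open.

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        # Number of audio frames divided by frames per second.
        return wav.getnframes() / wav.getframerate()
```

Calling wav_duration_seconds("test.wav") after each run makes it easy to compare runs of the same text.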