mrgloom closed 5 years ago
As a general rule of thumb, if tacotron (without gst) does not work, then tacotron_gst probably won't work either. It is also possible that the style wav does not work well with the gst model. Other than that, I am not too sure. Conditional speech synthesis is a more difficult problem than unconditional speech synthesis.
It looks like tacotron_gst produces bad results for short sentences:
time python run.py --config_file=example_configs/text2speech/tacotron_gst.py --mode=infer --infer_output_file=unused
I have tried:
Why does tacotron2 gst have this limitation? How can it be fixed?
samples.zip