Hi,
I have been experiencing this problem for a long time, across many different models I have trained.
At inference time, if the sentence contains 3 or more words, it works correctly most of the time. If the sentence is only 1-2 words long ("Hi", "Hello"), however, it generates unintelligible speech 90% of the time.
Any idea what might be causing this?