When using a recording from the LibriSpeech downloaded dataset, a good ratio of the generated audio pieces sound good and accurate. However, whenever I record some audio and use that, no matter who the speaker is, all the generated audio pieces sound the same. Is there any way I can fix this, or am I not understanding how to use this tool correctly? I've seen others on YouTube using the tool the same way I am and the resulting audio clips sound far better than my own,
When using a recording from the LibriSpeech downloaded dataset, a good ratio of the generated audio pieces sound good and accurate. However, whenever I record some audio and use that, no matter who the speaker is, all the generated audio pieces sound the same. Is there any way I can fix this, or am I not understanding how to use this tool correctly? I've seen others on YouTube using the tool the same way I am and the resulting audio clips sound far better than my own,