I had a 44100 Hz dataset, which I downsampled to 40000 Hz prior to training. (I don't understand why 44100 Hz datasets aren't supported, but that's another issue.)
Training completed successfully.
When I tested the model using the "Inference" tab with audio recorded in-browser, the converted speech was much too fast and its pitch was much too high.
Suspecting the sampling rate was an issue, I recorded my test audio in Audacity at 40000Hz, uploaded it, converted it, and found that the output was normal.
I suggest that mic audio recorded in the Inference tab be resampled to the model's sampling rate prior to voice conversion.
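
To illustrate the kind of step I mean, here is a minimal, dependency-free sketch of resampling recorded audio to the model's rate before conversion. The function names `resample_linear` and `prepare_mic_audio` are hypothetical and not taken from the project's code, and the 48000 Hz mic rate in the usage note is only an example, since I have not checked what rate the browser actually records at. Linear interpolation is used only to keep the example short; a real fix would probably use a band-limited resampler such as `scipy.signal.resample_poly` or `librosa.resample`.

```python
from fractions import Fraction


def resample_linear(samples, src_rate, dst_rate):
    """Resample a mono signal from src_rate to dst_rate by linear interpolation.

    Illustrative only: there is no anti-aliasing filter here.
    """
    if src_rate <= 0 or dst_rate <= 0:
        raise ValueError("sample rates must be positive")
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)

    # Output length that keeps the clip's duration unchanged.
    n_out = int(round(len(samples) * dst_rate / src_rate))
    # How far we advance in the input for each output sample (exact ratio).
    step = Fraction(src_rate, dst_rate)

    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * step
        idx = int(pos)
        if idx >= last:
            # Past the final input sample: hold the last value.
            out.append(float(samples[last]))
            continue
        frac = float(pos - idx)
        out.append(samples[idx] * (1.0 - frac) + samples[idx + 1] * frac)
    return out


def prepare_mic_audio(samples, mic_rate, model_rate):
    """Hypothetical pre-conversion hook: return audio at the model's rate."""
    return resample_linear(samples, mic_rate, model_rate), model_rate
```

For example, `prepare_mic_audio(recording, 48000, 40000)` would turn one second of audio (48000 samples) into 40000 samples labelled as 40000 Hz, so the duration and pitch stay the same when the model treats the input as 40000 Hz audio.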