IAHispano / Applio

VITS-based Voice Conversion focused on simplicity, quality and performance.
https://applio.org
MIT License

Sampling rate mismatch when recording audio with mic in browser in Inference tab #414

Closed. domesticatedviking closed this issue 2 months ago.

domesticatedviking commented 2 months ago

I had a 44100 Hz dataset, which I downsampled to 40000 Hz prior to training. (I don't understand why 44100 Hz datasets aren't supported, but that's another issue.)

Training completed successfully.

When I tested the model in the "Inference" tab with audio recorded in-browser, the converted speech was much too fast and its pitch much too high. Suspecting a sampling rate mismatch, I recorded my test audio in Audacity at 40000 Hz, uploaded it, converted it, and found that the output was normal.

I suggest that mic audio recorded in the Inference tab be resampled to the model's sampling rate prior to voice conversion.
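To illustrate the suggestion, here is a minimal sketch of resampling a mic recording to the model's rate before conversion. It is not Applio's code: the function name `resample_to_model_rate` and its parameters are hypothetical, and it assumes the recording arrives as a sample rate plus a 1-D NumPy array, which is the shape Gradio's `Audio` component delivers when `type="numpy"` is used with mono input. It uses plain linear interpolation to stay dependency-light; a band-limited resampler such as `scipy.signal.resample_poly` or `librosa.resample` would likely give better quality.

```python
import numpy as np


def resample_to_model_rate(audio, source_sr, target_sr):
    """Resample mono audio from source_sr to target_sr (hypothetical helper).

    Uses linear interpolation, which is only a rough stand-in for a
    proper band-limited resampler.
    """
    audio = np.asarray(audio)
    if audio.ndim != 1:
        raise ValueError("expected mono (1-D) audio")

    # Integer PCM (e.g. int16 from a browser mic) is scaled to [-1, 1].
    if np.issubdtype(audio.dtype, np.integer):
        audio = audio.astype(np.float32) / np.iinfo(audio.dtype).max
    else:
        audio = audio.astype(np.float32)

    # Nothing to do if the rates already match or the clip is empty.
    if source_sr == target_sr or audio.size == 0:
        return audio

    # Keep the duration constant and change only the number of samples.
    n_out = int(round(audio.size * target_sr / source_sr))
    t_in = np.arange(audio.size) / source_sr   # input sample times (s)
    t_out = np.arange(n_out) / target_sr       # output sample times (s)
    return np.interp(t_out, t_in, audio).astype(np.float32)
```

With this sketch, a one-second recording captured at 48000 Hz and passed as `resample_to_model_rate(recording, 48000, 40000)` comes back with 40000 samples, so its duration is unchanged when it is treated as 40000 Hz audio. The 48000 Hz figure is only an example; the rate the browser actually records at is not stated in this thread.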

blaisewf commented 2 months ago

We can't control that; it's what Gradio offers us.