Sampling Rates (SR) above 16 kHz

shahules786 / mayavoz

Pytorch based speech enhancement toolkit.

MIT License

Sampling Rates (SR) above 16 kHz #33

Closed by cweaver-logitech 1 year ago

cweaver-logitech commented 1 year ago

Thanks for this nice package. I'm curious whether there are any models that have been trained with a sampling rate above 16 kHz?

Exception has occurred: ParameterError Mono data must have shape (samples,). Received shape=(1, 320000)

shahules786 commented 1 year ago

Hi @cweaver-logitech, thanks for using mayavoz. All the currently available pretrained models are trained at 16 kHz, for two reasons:

1. 16 kHz is the recommended SR in the corresponding model architecture paper.
2. Training at a higher SR requires more GPU resources than I have.

This does not prevent inference on input at other sampling rates, though: mayavoz automatically resamples the input to the sampling rate the model requires.
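To illustrate what resampling to the model rate involves, here is a minimal standard-library sketch. It is not mayavoz's actual implementation: the function name `resample_linear` is hypothetical, and plain linear interpolation is swapped in for the band-limited resampler a real toolkit would most likely use.

```python
def resample_linear(samples, orig_sr, target_sr):
    """Resample a mono signal by linear interpolation (illustrative only).

    Assumption: `samples` is a flat sequence of numbers. No low-pass filter
    is applied, so downsampling this way can alias; a production resampler
    would filter first.
    """
    if orig_sr <= 0 or target_sr <= 0:
        raise ValueError("sampling rates must be positive")
    if orig_sr == target_sr or len(samples) == 0:
        return list(samples)

    # The duration stays the same, so the sample count scales with the rate.
    n_out = max(1, round(len(samples) * target_sr / orig_sr))
    step = orig_sr / target_sr  # input samples advanced per output sample
    last = len(samples) - 1

    out = []
    for i in range(n_out):
        pos = i * step              # fractional position in the input
        lo = min(int(pos), last)    # neighbour at or before pos
        hi = min(lo + 1, last)      # neighbour after pos, clamped at the end
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


# One second of silence at 48 kHz, brought down to a 16 kHz model rate.
one_second_48k = [0.0] * 48000
model_input = resample_linear(one_second_48k, 48000, 16000)
print(len(model_input))
```

The same call with the rates swapped would bring a 16 kHz result back up to 48 kHz, although upsampling cannot restore content above 8 kHz that a 16 kHz signal never carried.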

Can you share some more details about the error?

cweaver-logitech commented 1 year ago

Thanks for the quick reply. No need to look any further: I want the input sample rate (44.1 and 48 kHz) to be preserved when the inference audio is written.

shahules786 commented 1 year ago

Thanks for the clarification, @cweaver-logitech. I see you have a point there. Currently, when writing the output to a file, mayavoz uses the model sampling rate, but if you choose to have the output returned instead, mayavoz returns it at the input sampling rate. I have opened a separate issue for this: https://github.com/shahules786/mayavoz/issues/34
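Until that is addressed, one possible workaround is to take the returned output and write it yourself at the input rate. The sketch below uses only Python's standard `wave` module. The helper `write_wav` and the generated test tone are hypothetical stand-ins; the tone merely takes the place of audio returned by the toolkit, and no mayavoz call is shown because its exact signature is not given in this thread.

```python
import io
import math
import struct
import wave


def write_wav(target, samples, sample_rate):
    """Write mono float samples in [-1.0, 1.0] as 16-bit PCM.

    `target` may be a file path or a writable, seekable binary file object.
    """
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)            # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)  # rate stored in the WAV header
        frames = bytearray()
        for s in samples:
            clipped = max(-1.0, min(1.0, s))  # guard against overflow
            frames += struct.pack("<h", int(clipped * 32767))
        wav.writeframes(bytes(frames))


# Placeholder for audio returned at the input rate (assumption):
# one second of a 440 Hz tone at 44.1 kHz.
input_sr = 44100
enhanced = [0.5 * math.sin(2 * math.pi * 440 * n / input_sr)
            for n in range(input_sr)]

# Written to memory here; pass a path such as "enhanced.wav" to write a file.
buffer = io.BytesIO()
write_wav(buffer, enhanced, input_sr)
```

The key point is that the rate passed to `setframerate` must match the rate of the samples being written. Writing input-rate samples with the model rate in the header (or the reverse) would change playback speed and pitch.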