Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
Aishell3 vc2 example does not use a 16k sample rate when extracting the speaker embedding
The code checks the sample rate of the input audio, but I don't see any step that actually resamples it. https://github.com/PaddlePaddle/PaddleSpeech/blob/9cf8c1985a98bb380c183116123672976bdfe5c9/paddlespeech/cli/vector/infer.py#L492-L497
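To make the missing step concrete, here is a minimal sketch of what resampling before feature extraction could look like. It is not PaddleSpeech code: the function name `resample_linear` is hypothetical, and linear interpolation is used only to keep the example dependency-free. A real fix would more likely use a band-limited resampler with anti-aliasing (for example `librosa.resample`).

```python
def resample_linear(samples, orig_sr, target_sr):
    """Resample a mono signal by linear interpolation (illustrative only).

    This has no anti-aliasing filter, so it is a stand-in for a proper
    resampler, not a recommendation for production use.
    """
    if orig_sr <= 0 or target_sr <= 0:
        raise ValueError("sample rates must be positive")
    if orig_sr == target_sr or not samples:
        return list(samples)

    n_out = int(round(len(samples) * target_sr / orig_sr))
    step = orig_sr / target_sr  # input samples advanced per output sample
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * step
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo if hi != lo else 0.0
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


# One second of 44100 Hz audio becomes one second of 16000 Hz audio.
one_second = [0.0] * 44100
resampled = resample_linear(one_second, orig_sr=44100, target_sr=16000)
print(len(resampled))
```

With a step like this in place, the waveform handed to the feature extractor would really be at the rate the config declares, instead of only being checked against it.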
If you print sr when execution reaches this line, you can see that it equals 44100. https://github.com/PaddlePaddle/PaddleSpeech/blob/9cf8c1985a98bb380c183116123672976bdfe5c9/paddlespeech/cli/vector/infer.py#L415
Thus, melspectrogram receives a waveform loaded at a 44100 sample rate together with a mismatched sample rate argument, self.config.sr, which equals 16000. https://github.com/PaddlePaddle/PaddleSpeech/blob/9cf8c1985a98bb380c183116123672976bdfe5c9/paddlespeech/cli/vector/infer.py#L422-L427
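To illustrate why this mismatch matters, the sketch below works out how 44100 Hz audio is interpreted when it is labelled as 16000 Hz. The helper names are hypothetical and the arithmetic is generic, not taken from PaddleSpeech: any feature extractor that maps FFT bins to Hz using the declared rate will place every frequency at `assumed_sr / actual_sr` of its true value.

```python
def apparent_frequency(true_hz, actual_sr, assumed_sr):
    """Frequency a tone appears to have when the sample rate is mislabelled."""
    return true_hz * assumed_sr / actual_sr


def apparent_duration(num_samples, assumed_sr):
    """Duration in seconds implied by the (possibly wrong) declared rate."""
    return num_samples / assumed_sr


actual_sr = 44100   # rate the waveform was actually loaded at
assumed_sr = 16000  # rate passed to the feature extractor (self.config.sr)

# A 1000 Hz component is treated as if it were at roughly 362.8 Hz.
print(round(apparent_frequency(1000.0, actual_sr, assumed_sr), 1))  # 362.8

# Three seconds of real audio is treated as about 8.27 seconds.
print(apparent_duration(3 * actual_sr, assumed_sr))  # 8.26875
```

So the mel features the speaker model receives would describe audio that looks slowed down and shifted to lower frequencies compared with the 16k data the model expects, which is likely to degrade the extracted speaker embedding.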