Can I feed the 22050 sr wav to the pre-trained rawnet3 model ?

Jungjee / RawNet

Official repository for RawNet, RawNet2, and RawNet3

MIT License

Can I feed the 22050 sr wav to the pre-trained rawnet3 model ? #31

Closed predawnang closed 1 year ago

predawnang commented 1 year ago

Hi, I want to use the RawNet3 model in my project to compute the speaker similarity of a pair of wavs. All the audio in my dataset is 22050 Hz, and for some reason I could not downsample it to 16000 Hz. I wonder whether the pretrained model is suitable for 22050 Hz audio. Thanks.

Jungjee commented 1 year ago

Hi @predawnang, yes, I believe RawNet3 should work on waveforms downsampled to 16 kHz. First downsample your waveforms to 16 kHz, then feed them to the model.
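For readers who want to follow this advice, here is a minimal sketch of the downsampling step. It is not taken from the RawNet repository: the helper names `resample_ratio` and `downsample` are hypothetical, and the use of SciPy's polyphase resampler is an assumption about tooling (other resamplers would work too). The 22050 Hz to 16000 Hz conversion reduces to the integer ratio 320/441.

```python
from math import gcd


def resample_ratio(src_sr: int, dst_sr: int) -> tuple[int, int]:
    """Return the integer (up, down) factors that convert src_sr to dst_sr."""
    g = gcd(src_sr, dst_sr)
    return dst_sr // g, src_sr // g


def downsample(wave, src_sr: int = 22050, dst_sr: int = 16000):
    """Resample a 1-D waveform with a polyphase (anti-aliasing) filter.

    Hypothetical helper; `wave` is assumed to be a 1-D NumPy array of samples.
    """
    # Imported lazily so resample_ratio stays usable without SciPy installed.
    from scipy.signal import resample_poly

    up, down = resample_ratio(src_sr, dst_sr)
    return resample_poly(wave, up, down)


print(resample_ratio(22050, 16000))  # (320, 441)
```

With these factors, one second of 22050 Hz audio (22050 samples) comes out as 16000 samples, which is the rate recommended above for the pretrained model.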

predawnang commented 1 year ago

Dear author, can I feed the 22 kHz waveforms to the model directly? Would that degrade the model's performance?

Jungjee commented 1 year ago

I haven't tested that case, but the output representations would likely not be reliable.

In short, I recommend against it.
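One way to avoid feeding audio at the wrong rate by accident is to check each file before inference. The sketch below is an illustration only, not part of the RawNet code: `check_sample_rate` and `EXPECTED_SR` are hypothetical names, and it assumes PCM WAV files readable by Python's standard `wave` module.

```python
import wave

# Assumption: 16 kHz is the rate the pretrained model expects, per the thread above.
EXPECTED_SR = 16000


def check_sample_rate(path: str, expected_sr: int = EXPECTED_SR) -> int:
    """Return the file's sample rate, raising ValueError if it differs."""
    with wave.open(path, "rb") as f:
        sr = f.getframerate()
    if sr != expected_sr:
        raise ValueError(
            f"{path}: got {sr} Hz, expected {expected_sr} Hz; resample first"
        )
    return sr
```

Calling this on a 22050 Hz file raises an error, so a mismatched file is caught before any embeddings are computed.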