Re-training for alternate samplerate

marl / crepe

CREPE: A Convolutional REpresentation for Pitch Estimation -- pre-trained model (ICASSP 2018)

https://marl.github.io/crepe/

MIT License


Re-training for alternate samplerate #83

Open mrdeveloperdude opened 2 years ago

mrdeveloperdude commented 2 years ago

I noticed that CREPE seems to be trained on audio sampled at 16 kHz. I want to use it in open source karaoke software I am working on, which is designed to run on limited embedded hardware, so on-the-fly sample rate conversion is an expense I would like to avoid. I thought it would be easy to fix by simply retraining CREPE at my native sample rate (44.1 kHz), but I could not find any code to train the models.

Was this left out on purpose? Is it available someplace?

Thanks!

jongwook commented 2 years ago

Hi, the training code is available at: https://github.com/jongwook/crepe but it is even less maintained than this one.

One hack you could do is to keep only every 3rd (or 2nd) sample, effectively making 14.7 kHz (or 22.05 kHz) audio, run it through one of the pretrained CREPE models as if it were 16 kHz, and scale the frequency estimate by 14.7/16 (or 22.05/16) to get the actual frequencies. This assumes that there is negligible energy above the new Nyquist frequency (which is generally true and worked okay for the web demo), and also that the accuracy is not terribly impacted by the frequency scaling.
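
A minimal sketch of the subsampling hack, assuming 44.1 kHz mono input in a NumPy array. The helper names (`decimate_naive`, `correct_frequency`, `fft_peak_hz`) are hypothetical, and `fft_peak_hz` is only a stand-in for a pitch estimator that believes its input is 16 kHz; it is not CREPE. On the direction of the correction: a model that assumes 16 kHz but receives audio whose effective rate is lower reads every pitch too high by 16000 / effective_rate, so the estimate is multiplied by effective_rate / 16000 to recover Hz.

```python
import numpy as np

MODEL_SR = 16000  # sample rate the pretrained CREPE models assume


def decimate_naive(audio, factor):
    """Keep every `factor`-th sample. No anti-aliasing filter is applied,
    so energy above the new Nyquist frequency will alias."""
    return np.asarray(audio)[::factor]


def correct_frequency(f_est, native_sr, factor, model_sr=MODEL_SR):
    """Convert an estimate made under the `model_sr` assumption back to Hz."""
    effective_sr = native_sr / factor
    return np.asarray(f_est, dtype=float) * effective_sr / model_sr


def fft_peak_hz(audio, assumed_sr):
    """Stand-in estimator (NOT CREPE): frequency of the strongest FFT bin,
    computed as if the audio were sampled at `assumed_sr`."""
    spectrum = np.abs(np.fft.rfft(audio * np.hanning(len(audio))))
    return np.argmax(spectrum) * assumed_sr / len(audio)


if __name__ == "__main__":
    native_sr = 44100
    t = np.arange(native_sr) / native_sr
    tone = np.sin(2 * np.pi * 441.0 * t)          # 441 Hz test tone

    low = decimate_naive(tone, 3)                  # effectively 14.7 kHz
    f_raw = fft_peak_hz(low, MODEL_SR)             # read as if 16 kHz
    f_hz = correct_frequency(f_raw, native_sr, 3)  # back to real Hz
    print(f_raw, f_hz)
```

With the `crepe` package, the stand-in would presumably be replaced by `crepe.predict(low, 16000)`, passing 16000 as the claimed rate so that the library does not resample the input itself; that usage is an assumption and has not been checked against this thread.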

PratikStar commented 2 years ago

Isn't the original model trained for 16k sample rate and NOT 44.1kHz?

Laubeee commented 1 month ago

Thanks for sharing your training code. Could you say something about:

The usage of the NSynth dataset: was the full dataset used? Only the notes in the range? Only parts of the files, or some other filtering method (towards the end, the files are often just silence)?
I see there is an option to use noise and pitch-shift augmentations. Were they used in the final model?