guozixunnicolas closed this issue 3 years ago
You have to set the arguments to be the same as the original CREPE implementation. The following produces the desired result:
python -m torchcrepe --audio_files bass_synthetic_009-025-127.wav --output_files bass.pt --decoder argmax --fmin 0 --gpu 0
Best, Max
@maxrmorrison The readme says 'weighted argmax (as in the original implementation)', not argmax. Can you clarify? Thank you.
See the README section on decoding that precedes that line, as well as Sections II-A and IV of this paper, for clarification.
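To make the argmax versus weighted argmax distinction concrete, here is a minimal pure-Python sketch, not torchcrepe's actual code. It assumes the 360-bin, 20-cents-per-bin pitch grid and the bin-to-cents offset used by the original CREPE code, and a weighted average over a window of 4 bins on either side of the peak. The function names and the synthetic Gaussian activation are hypothetical and only serve the illustration.

```python
import math

CENTS_PER_BIN = 20.0
CENTS_OFFSET = 1997.3794084376191  # cents assigned to bin 0 (assumed, as in original CREPE)
N_BINS = 360


def bin_to_cents(b):
    """Map a (possibly fractional) bin index to cents."""
    return CENTS_PER_BIN * b + CENTS_OFFSET


def cents_to_hz(cents):
    """Cents are measured relative to 10 Hz."""
    return 10.0 * 2.0 ** (cents / 1200.0)


def decode_argmax(activation):
    """Pick the single most active bin and convert it to Hz."""
    best = max(range(len(activation)), key=activation.__getitem__)
    return cents_to_hz(bin_to_cents(best))


def decode_weighted_argmax(activation, radius=4):
    """Average the cents of the bins around the peak, weighted by activation."""
    center = max(range(len(activation)), key=activation.__getitem__)
    start = max(0, center - radius)
    end = min(len(activation), center + radius + 1)
    weights = activation[start:end]
    total = sum(w * bin_to_cents(b) for b, w in zip(range(start, end), weights))
    return cents_to_hz(total / sum(weights))


def synthetic_activation(target_hz, width_bins=1.5):
    """Hypothetical stand-in for a network output: a Gaussian bump at target_hz."""
    target_bin = (1200.0 * math.log2(target_hz / 10.0) - CENTS_OFFSET) / CENTS_PER_BIN
    return [math.exp(-0.5 * ((b - target_bin) / width_bins) ** 2) for b in range(N_BINS)]


activation = synthetic_activation(34.65)
print("argmax:         ", decode_argmax(activation))
print("weighted argmax:", decode_weighted_argmax(activation))
```

On this synthetic input, plain argmax can only land on a bin centre, so it is quantized to 20-cent steps, while the weighted variant interpolates between neighbouring bins. Whether either matches a given CREPE release depends on the options that release uses by default.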
Hi there,
Thanks for the re-implementation! It's really well-formatted.
I have encountered an issue with the predicted pitch values. I used one sample from the NSynth dataset as the input file (bass_synthetic_009-025-127.wav). The file is here: https://drive.google.com/file/d/1_Ltj9Pbezx_5Ve-MLVrkF924vAfJ6j2C/view?usp=sharing
The file's label shows MIDI pitch 25, which works out to around 34.6 Hz.
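For reference, the MIDI-to-frequency conversion can be sketched as below. This assumes standard 12-tone equal temperament with A4 (MIDI note 69) tuned to 440 Hz; the helper name `midi_to_hz` is illustrative and not part of torchcrepe or CREPE.

```python
def midi_to_hz(midi_note, a4_hz=440.0):
    """Convert a MIDI note number to Hz, assuming equal temperament and A4 = 440 Hz."""
    return a4_hz * 2.0 ** ((midi_note - 69) / 12.0)


print(round(midi_to_hz(25), 2))  # 34.65
```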
However, when I run the algorithm, it returns
which seems incorrect.
I ran the original TensorFlow version of CREPE and it returns around 34 or 35 Hz.
May I know what causes the error? Or is it possible that the data you trained the model on didn't include music?
Best,
Nic