change frame rate of estimated pitch

farmaker47 / Pitch_Estimator

Music Pitch detection using Tensorflow SPICE model.


change frame rate of estimated pitch #2

Open james20141606 opened 4 years ago

james20141606 commented 4 years ago

Hey, I am also trying to use SPICE to estimate f0, but I found that the model on TF Hub does not let me change the frame rate of the pitch output. The frame step seems to be fixed at 32 ms. Do you know if we can change it?

farmaker47 commented 4 years ago

After communicating with the author of the blog post, Luiz Gustavo Martins, I can pass along his answer:

I don't think this parameter can be changed. They might need to group or interpolate the values after inference.

Cheers
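To illustrate the "interpolate the values after inference" suggestion, here is a minimal sketch, assuming the pitch output has already been converted to Hz and that the model emits one value every 32 ms. The names `resample_track` and `SPICE_HOP_MS` are made up for this example and are not part of SPICE or TF Hub. It uses plain linear interpolation and does not handle unvoiced frames.

```python
SPICE_HOP_MS = 32  # assumption: one SPICE output value every 32 ms


def resample_track(values, src_hop_ms, dst_hop_ms):
    """Linearly interpolate a uniformly sampled track onto a new hop size.

    values: per-frame numbers (e.g. f0 in Hz), one every src_hop_ms.
    Returns a list with one value every dst_hop_ms over the same time span.
    """
    if not values:
        return []
    last = len(values) - 1
    duration_ms = last * src_hop_ms
    # Number of output frames that fit inside the original time span.
    n_out = int(duration_ms / dst_hop_ms + 1e-9) + 1
    out = []
    for k in range(n_out):
        pos = k * dst_hop_ms / src_hop_ms  # fractional index into `values`
        i = min(int(pos), last)            # left neighbour
        j = min(i + 1, last)               # right neighbour (clamped at the end)
        frac = pos - i
        out.append(values[i] * (1.0 - frac) + values[j] * frac)
    return out


# Example: go from the 32 ms step to a 16 ms step.
f0_32ms = [100.0, 200.0, 300.0]
f0_16ms = resample_track(f0_32ms, SPICE_HOP_MS, 16)
```

Interpolating in Hz is the simplest choice. Interpolating log-frequency instead may behave better across large pitch jumps, and frames with low confidence should probably be masked before interpolating.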

james20141606 commented 4 years ago

Thanks for the reply! So the question is whether we can trust the interpolated result.
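One rough way to check how far interpolation can be trusted on a given recording is a hold-out test: drop every other frame, predict it from its two neighbours, and measure the error in cents. This is only a sketch under assumptions: `holdout_interp_error` is a hypothetical helper, all frames are assumed voiced (f0 > 0), and the test runs at twice the original hop, so it is likely a pessimistic proxy for interpolating below 32 ms.

```python
import math


def cents(f_a, f_b):
    """Pitch distance between two frequencies in cents (100 cents = 1 semitone)."""
    return 1200.0 * math.log2(f_a / f_b)


def holdout_interp_error(f0_hz):
    """Mean absolute error in cents when odd frames are predicted from even ones.

    Assumes every value in f0_hz is a voiced frame (strictly positive Hz).
    """
    errors = []
    for i in range(1, len(f0_hz) - 1, 2):
        # Linear interpolation at the midpoint of the two neighbouring frames.
        predicted = 0.5 * (f0_hz[i - 1] + f0_hz[i + 1])
        errors.append(abs(cents(predicted, f0_hz[i])))
    return sum(errors) / len(errors) if errors else 0.0


# A smooth glide is predicted well; a fast jump is not.
smooth_error = holdout_interp_error([100.0, 110.0, 120.0, 130.0, 140.0])
jumpy_error = holdout_interp_error([220.0, 440.0, 220.0])
```

If the error stays within a few cents on typical material, interpolated values are probably fine for synthesis experiments. Large errors point to fast pitch movement that a 32 ms step cannot capture.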

farmaker47 commented 4 years ago

I do not know about the interpolation. Why do you want to change the value (32 ms)? If you run the Colab notebook on an a cappella song, the output is extraordinary!

james20141606 commented 4 years ago

I really like the estimation results SPICE gives; they are amazing. The reason I'd like to change the frame rate is that I want to use f0 to synthesize audio (together with loudness and other components), and I'd like to experiment with different frame rates. I have tried CREPE (also a DNN model), RAPT, and SWIPE, and they all let me change the frame rate, so I wonder if SPICE could offer this option too.

farmaker47 commented 4 years ago

Nice! You have worked a lot with sound! If I get stuck somewhere, can I ask you some questions in the future?

james20141606 commented 4 years ago

Actually, I am also a beginner; I just started working on this a year ago. I like discussing audio signal processing problems! The work I am following is this: https://github.com/magenta/ddsp. It is really amazing.

farmaker47 commented 4 years ago

ok thanks!

james20141606 commented 4 years ago

I am trying to swap out the f0 estimator in their framework. They previously used CREPE, which I found not robust, and they also tried to learn f0 with a ResNet, which led to some f0 collapse issues. They have a follow-up paper called DDSP-inv that solves the f0 learning issue through self-supervision.