xavriley / crepe_notes

Post-processing for CREPE to turn f0 pitch estimates into discrete notes e.g. MIDI
GNU General Public License v3.0

Questions regarding the input audio and output MIDI. #13

Closed by paranoid2droid 1 year ago

paranoid2droid commented 1 year ago

Thanks for creating this awesome package.

I have a few questions from my trials:

  1. I tried transcribing two separated tracks from one song (bass/vocal) and I found that the tempo of predicted MIDI is different. The bass transcription seems to give a better tempo estimate, with the beats falling on the first note of each bar. The transcribed vocal track has a tempo of 120 bpm, which looks like a default setting. How is the tempo estimated? Can I manually designate the tempo in output MIDIs with the same absolute time?
  2. Are there any effects if I use the same input track with different amplitudes, e.g., an unnormalized one and a normalized one?

Thanks!

xavriley commented 1 year ago

> Thanks for creating this awesome package.

You're welcome! It's nice to see people using it.

> I found that the tempo of predicted MIDI is different

You can see here in the code that we don't make any predictions about MIDI tempo, so it will default to 120 bpm, as you are seeing with the vocal. It sounds like the program you are opening the files with is trying to guess the tempo from the notes. Which program are you using?

> Can I manually designate the tempo in output MIDIs with the same absolute time?

There's a slightly hacky way to do this using the pretty_midi library - I've uploaded a gist here https://gist.github.com/xavriley/c9cdd7bb910246a730a3dab6109237ef

It uses the private _tick_scales property to set downbeats arbitrarily without changing the existing note placements. The syncpoints file is something you can create with www.soundslice.com, or you can just replace it with a list of downbeat times in seconds.
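The arithmetic behind that idea can be sketched in plain Python. This is not the code from the gist, only an illustration under stated assumptions: a fixed number of beats per bar, a resolution of 220 ticks per quarter note, and, if the first downbeat is later than 0 seconds, a lead-in that is treated as one full bar. The function names `downbeats_to_tick_scales` and `tick_scale_to_bpm` are hypothetical. As far as I understand it, pretty_midi keeps its tempo map as a list of (tick, seconds-per-tick) pairs in this shape, but check that against the library source before relying on it.

```python
def downbeats_to_tick_scales(downbeats, beats_per_bar=4, resolution=220):
    """Turn downbeat times (seconds) into (tick, seconds_per_tick) pairs.

    Each bar gets the same number of ticks, so a bar that lasts longer
    in seconds simply gets a larger seconds-per-tick value (slower tempo).
    Note times in seconds are never touched.
    """
    times = list(downbeats)
    if times and times[0] > 0.0:
        # Assumption: the audio before the first downbeat counts as one bar.
        times.insert(0, 0.0)
    if len(times) < 2:
        raise ValueError("need at least two bar boundaries")

    ticks_per_bar = beats_per_bar * resolution
    scales = []
    for bar, (start, end) in enumerate(zip(times, times[1:])):
        if end <= start:
            raise ValueError("downbeat times must be strictly increasing")
        scales.append((bar * ticks_per_bar, (end - start) / ticks_per_bar))
    return scales


def tick_scale_to_bpm(seconds_per_tick, resolution=220):
    """Convert seconds-per-tick back into quarter-note beats per minute."""
    return 60.0 / (seconds_per_tick * resolution)


if __name__ == "__main__":
    # Two 2-second bars followed by one 1-second bar, in 4/4.
    for tick, scale in downbeats_to_tick_scales([0.0, 2.0, 4.0, 5.0]):
        print(tick, round(tick_scale_to_bpm(scale), 2))
```

Because only the tick grid changes, a note that starts at 3.2 seconds still starts at 3.2 seconds; it just lands on a different tick once the file is written. The tempo of the final bar carries on after the last downbeat.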

> Are there any effects if I use the same input track with different amplitudes, e.g., an unnormalized one and a normalized one?

To be honest I'm not 100% sure. CREPE is fairly robust to different noise levels, but madmom (which we use to separate repeated notes at the same pitch) might be affected more.
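One way to check this on a given file is to transcribe the original and a normalized copy, then compare the notes. The question does not say which kind of normalization is meant, so the sketch below assumes simple peak normalization on samples already loaded as floats in the range -1 to 1. The function name `peak_normalize` is hypothetical and is not part of crepe_notes, CREPE, or madmom.

```python
def peak_normalize(samples, target_peak=1.0):
    """Scale samples so the largest absolute value equals target_peak.

    Silent input (all zeros, or empty) is returned unchanged, since
    there is no peak to scale against.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)
    gain = target_peak / peak
    return [s * gain for s in samples]


if __name__ == "__main__":
    quiet = [0.1, -0.5, 0.25]
    print(peak_normalize(quiet))
```

Peak normalization applies one constant gain to the whole signal, so it changes level without changing the waveform's shape.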

paranoid2droid commented 1 year ago

Thanks for your quick and detailed reply!

> You can see here in the code that we don't make any predictions about MIDI tempo, so it will default to 120 bpm, as you are seeing with the vocal. It sounds like the program you are opening the files with is trying to guess the tempo from the notes. Which program are you using?

I think you are right. I was using MuseScore to view the MIDI as sheet music, and it might be guessing the tempo as you mentioned.

> To be honest I'm not 100% sure. CREPE is fairly robust to different noise levels, but madmom (which we use to separate repeated notes at the same pitch) might be affected more.

I tried transcribing both unnormalized and normalized audio, and the effect seems to be very small, at least for the file I tested.

Thanks again for sharing, and for the nice work!