Inaccurate pitch_tolerance in transcription.precision_recall_f1_overlap

ax-le commented 5 years ago

Computing transcription.precision_recall_f1_overlap([...], pitch_tolerance = 30) gives lower scores in my examples than the default tolerance, transcription.precision_recall_f1_overlap([...], pitch_tolerance = 50). (For example, the F-measure is 0.379 with pitch_tolerance set to 30 and 0.396 with pitch_tolerance set to 50.) However, my estimated pitches are midi-scale integers, and so is my ground truth. The smallest nonzero gap between an estimated pitch and the ground truth is therefore one semitone, or 100 cents, so a tolerance smaller than 100 cents shouldn't affect the scores. I strongly suspect that a rounding operation is distorting the pitch comparison.

craffel commented 5 years ago

Thanks for reporting this. Can you provide a MWE or example files which reproduce the issue?

ax-le commented 5 years ago

Sure, here are two .txt files, reference.txt and estimation.txt, that reproduce the issue. In each file, the first and second columns contain the onset and offset times, respectively, and the third column contains the pitches. In my tests, offset times were ignored.
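For readers without the attachments, here is a minimal sketch of how a file in this three-column layout could be split into note intervals and pitches. The helper name parse_notes, the whitespace separator, and the sample rows are assumptions made for illustration; they are not taken from the attached files.

```python
def parse_notes(text):
    """Split 'onset offset pitch' rows into (intervals, pitches)."""
    intervals, pitches = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumes whitespace-separated columns: onset, offset, pitch.
        onset, offset, pitch = (float(x) for x in line.split())
        intervals.append((onset, offset))
        pitches.append(pitch)
    return intervals, pitches


# Made-up rows for illustration only, not the contents of reference.txt.
sample = """\
0.50 1.00 60
1.00 1.50 62
"""

intervals, pitches = parse_notes(sample)
print(intervals)  # list of (onset, offset) pairs in seconds
print(pitches)    # third column, still in whatever unit the file uses
```

The pitch column is returned untouched, so any unit conversion still has to happen before the values are passed to the evaluation function.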

craffel commented 5 years ago

Sorry, I missed an important detail in your first message. You wrote

my estimated pitches are midi-scale integers

That's the wrong format for transcription.precision_recall_f1_overlap. The docstring clearly says

Array of estimated pitch values in Hertz

You can convert from your MIDI pitches to Hz using either pretty_midi.note_number_to_hz or librosa.midi_to_hz or just 440.0*(2.0**((note_number - 69)/12.0)).
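A small sketch of that last formula, together with a check of why a tolerance below 100 cents can still change the scores when MIDI numbers are passed where Hz is expected. The helper names midi_to_hz and cents_between are made up for this illustration, and the sketch assumes the tolerance is measured in cents as 1200 * |log2(f_est / f_ref)|, which is the standard definition of a cent.

```python
import math


def midi_to_hz(note_number):
    """Convert a MIDI note number to Hz (A4 = note 69 = 440 Hz)."""
    return 440.0 * (2.0 ** ((note_number - 69) / 12.0))


def cents_between(f_ref, f_est):
    """Absolute distance in cents between two frequencies in Hz."""
    return 1200.0 * abs(math.log2(f_est / f_ref))


# Adjacent MIDI notes, correctly converted to Hz: one semitone apart.
semitone = cents_between(midi_to_hz(60), midi_to_hz(61))

# The same kind of pair passed as raw MIDI numbers and read as Hz.
raw_60_61 = cents_between(60, 61)
raw_40_41 = cents_between(40, 41)

print(round(semitone, 2), round(raw_60_61, 2), round(raw_40_41, 2))
```

Under that assumption, MIDI 40 and 41 read as Hz are only about 43 cents apart, so they would count as a pitch match at a 50-cent tolerance but not at 30 cents. After conversion to Hz, adjacent notes are 100 cents apart and neither tolerance would match them.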

craffel / mir_eval

Inaccurate pitch_tolerance in transcription.precision_recall_f1_overlap #313