Open maxpv opened 3 years ago
@rabitt
Hey @maxpv
In the first paper from 2007 there are two sections in the Transcription Results section: Frame-level transcription (5.1) and Note onset detection (5.2).
In the documentation, the paper you mentioned and a second one are cited. The equations for all the metrics are there: Equations 3-6 in the first paper, and Equations 1-8 in the second. Both papers give pretty lengthy explanations of the metrics if you want more details.
Due to the format of the input for multipitch.evaluate (frequencies associated with an onset) I suppose the 5.2 was mentioned.
There's no notion of onsets in mir_eval.multipitch.evaluate. Note-level metrics are implemented separately in mir_eval.transcription.
Hope that helps clarify things.
Thanks for your detailed answer.
I think this should be added to the documentation. It is not obvious that 5.1 Frame-Level Transcription is related to the metrics we try to compute with multipitch or transcription, partly because of the input format: frequencies and timestamps, as opposed to an NxT matrix.
Example from the 5.1 section: TP (“true positives”) is the number of correctly transcribed voiced frames (over all notes).
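To make the quoted definition concrete, here is a minimal sketch of frame-level counting. It assumes each frame is represented as a set of active pitches (MIDI note numbers here) and that accuracy is taken as TP / (TP + FP + FN). The function name `frame_level_scores` is hypothetical, and this is not mir_eval's implementation: it compares pitches by exact equality, whereas a real evaluation would match frequencies within some tolerance.

```python
def frame_level_scores(ref_frames, est_frames):
    """Count TP/FP/FN over aligned frames and derive summary scores.

    ref_frames, est_frames: lists of sets, one set of active pitches per frame.
    Both lists are assumed to share the same time grid (an assumption of
    this sketch, not something the thread specifies).
    """
    if len(ref_frames) != len(est_frames):
        raise ValueError("reference and estimate need the same number of frames")

    tp = fp = fn = 0
    for ref, est in zip(ref_frames, est_frames):
        tp += len(ref & est)   # pitches present in both
        fp += len(est - ref)   # estimated but not in the reference
        fn += len(ref - est)   # in the reference but not estimated

    # Guard against empty denominators (e.g. all-silent input).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"TP": tp, "FP": fp, "FN": fn,
            "Precision": precision, "Recall": recall, "Accuracy": accuracy}


# Four frames of made-up data, purely for illustration.
ref = [{60, 64}, {60, 64}, {62}, set()]
est = [{60}, {60, 64, 67}, {62}, {62}]
scores = frame_level_scores(ref, est)
print(scores["TP"], scores["FP"], scores["FN"])  # 4 2 1
```

With this toy data, precision is 4/6, recall is 4/5, and accuracy is 4/7.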
OK, but what is a frame when the input is a list of intervals and pitches? To do that we need to set the offset_min_tolerance, but this parameter isn't exposed in the doc for mir_eval.transcription.evaluate.
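One way to read "frame" for interval-based input is to sample the notes on a regular time grid. The sketch below is only an illustration under stated assumptions: the helper `notes_to_frames`, the `hop` parameter, and the half-open `[onset, offset)` convention are all hypothetical choices, not something mir_eval does for you.

```python
def notes_to_frames(intervals, pitches, hop=0.01):
    """Sample note events on a regular time grid.

    intervals: list of (onset, offset) pairs in seconds.
    pitches:   list with one frequency per note, same length as intervals.
    hop:       grid spacing in seconds (an arbitrary choice for this sketch).

    Returns (times, frames), where frames[i] is the sorted list of
    frequencies sounding at times[i].
    """
    if len(intervals) != len(pitches):
        raise ValueError("need exactly one pitch per interval")

    end = max((offset for _, offset in intervals), default=0.0)
    n_frames = int(round(end / hop)) + 1
    times = [i * hop for i in range(n_frames)]

    frames = []
    for t in times:
        # A note counts as active from its onset up to, but excluding, its offset.
        active = sorted(p for (onset, offset), p in zip(intervals, pitches)
                        if onset <= t < offset)
        frames.append(active)
    return times, frames


# Two overlapping notes, sampled every half second for readability.
times, frames = notes_to_frames([(0.0, 1.0), (0.5, 1.5)], [440.0, 660.0], hop=0.5)
print(times)   # [0.0, 0.5, 1.0, 1.5]
print(frames)  # [[440.0], [440.0, 660.0], [660.0], []]
```

The result is a list of timestamps plus a list of frequencies per timestamp, which is the "frequencies and timestamps" shape discussed above; the lists would still need converting to NumPy arrays before being passed to a frame-level evaluator.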
I'm trying to understand multipitch. The documentation redirects to two papers, but I couldn't find anything that explains the metrics. In the first paper from 2007 there are two sections in the Transcription Results section: Frame-level transcription (5.1) and Note onset detection (5.2). Due to the format of the input for multipitch.evaluate (frequencies associated with an onset) I suppose the 5.2 was mentioned. There's literally nothing in it that explains the metrics. What am I missing? It seems unnecessarily obscure to me.