craffel / mir_eval

Evaluation functions for music/audio information retrieval/signal processing algorithms.
MIT License
604 stars 112 forks source link

[question] chord encoding with inversions #237

Open bmcfee opened 7 years ago

bmcfee commented 7 years ago

I ran into this issue while playing around with mir_eval.chord.encode. The interpretation of inversions isn't totally obvious, or well-documented.

Example:

In [2]: mir_eval.chord.encode('G:min')
Out[2]: (7, array([1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0]), 0)

In [3]: mir_eval.chord.encode('G:min/3')
Out[3]: (7, array([1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0]), 4)

In [4]: mir_eval.chord.encode('G:min/b3')
Out[4]: (7, array([1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0]), 3)

I would expect G:min/3 to put the (minor) third in the bass, but the encoder is interpreting /3 as natural (second example has both a 4 and 5 in the pitch class encoding). Inversion with a flat-3 does the right thing (last row).

Does this seem counter-intuitive to anyone else? It's not exactly clear in Harte's paper, but since they don't provide notation for distinguishing natural from sharp/flat, I think the only reasonable interpretation is that inversions must be encoded as absolute pitch class. This should probably be indicated in the documentation somewhere.

As a side note, this might lead to some transcription errors in existing datasets. For example, rev#9 is annotated with a min/3 in the isophonics corpus, and billboard has several such examples. It's not obvious to me that these are errors, but it seems likely.

bmcfee commented 7 years ago

tagging @ejhumphrey because of course

mattmcvicar commented 7 years ago

@bmcfee this is the expected behaviour given the Harte notation - although I agree it could be clearer.

I think an absolute pitch class would will work fine, except that you lose some enharmonic information. For example, I can imagine some scenarios in which I really want to know that I have C:maj/#4 and not C:maj/b5. Probably not many scenarios, but some.

ejhumphrey commented 7 years ago

Does this seem counter-intuitive to anyone else?

haha, yep.

As a side note, this might lead to some transcription errors in existing datasets. For example, rev#9 is annotated with a min/3 in the isophonics corpus, and billboard has several such examples. It's not obvious to me that these are errors, but it seems likely.

I've found it's best practice to discard data labeled as such rather than try to map it. If it's got an uncommon label, it's because something weird is going on, and probably requires closer inspection.

Probably not many scenarios, but some.

haha plenty to be frustrating.... @mattmcvicar you remember the D:maj/#1 or whatever.

bmcfee commented 7 years ago

I think an absolute pitch class would will work fine, except that you lose some enharmonic information.

Sure -- my question was more along the lines of "is this how annotators interpret the notation?"

I've found it's best practice to discard data labeled as such rather than try to map it. If it's got an uncommon label, it's because something weird is going on, and probably requires closer inspection.

Agreed. Nobody ever includes rev#9 anyway. OTOH, it's a small number of cases we're talking about, and probably less than an hour's worth of work to verify and/or correct the annotations.

This problem came up for me because I want to implement a chord normalization function. Given an input chord and a target vocab complexity (min/maj, +dim/aug, +sevenths, etc), it would standardize the root (sharps only) and eliminate inversions and unnecessary extensions. I figured the natural way to do it would be to go through the r/p/b encoding, but this minor-inversion thing is throwing a wrench in the works.

craffel commented 7 years ago

Is this resolved?

bmcfee commented 7 years ago

Is this resolved?

Conceptually, yes. It would still be good to have a PR to improve the relevant documentation for this issue though.

craffel commented 7 years ago

Ok, I will leave it to you all to decide how best to improve the docs (since I don't understand the issue and probably won't).