Last try on ML experiments for ISMIR 2021

Yesterday I was able to start the "annotation quality" assessment, after going through a "measure alignment" correction of the pieces that had the worst alignment scores.

I found many different reasons why a score might be misaligned with its annotations. These errors were possibly left unresolved by other researchers who have worked with Roman Numeral Analysis, and they will probably affect the training of a machine learning model.

I have also discovered that annotations may be wrong in terms of chord-label qualities. So far, I have seen:

The score is transposed (in a different key in comparison to the annotation)
The score has a section in a different key than the annotation file
The translation of a certain chord is wrong (e.g., #viio when it should have been viio)

I am dumping a csv with all the sorted scores of annotation quality and working my way through the files before training a model.

For this csv, I omitted all the files that have measure misalignment and that I don't plan to fix for the time being, because I ran out of time.

The filtered list includes 354 files. All of these should, in theory, have few or no alignment problems.
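
As a rough illustration of the filter-and-sort step described above, here is a minimal sketch. It is not the actual script: the record fields (`file`, `alignment_ok`, `quality`), the sample values, and the helper `dump_quality_csv` are all hypothetical, and sorting from worst to best is an assumption made so that the most suspicious files come up first.

```python
import csv
import io

# Hypothetical per-file records; field names and values are illustrative only.
records = [
    {"file": "piece_a", "alignment_ok": True, "quality": 0.91},
    {"file": "piece_b", "alignment_ok": False, "quality": 0.40},
    {"file": "piece_c", "alignment_ok": True, "quality": 0.55},
]


def dump_quality_csv(records, out):
    """Write only the aligned files to `out`, sorted from worst to best quality."""
    # Drop files with known measure misalignment (assumed to be flagged upstream).
    kept = [r for r in records if r["alignment_ok"]]
    # Lowest annotation-quality score first, so problem files are reviewed first.
    kept.sort(key=lambda r: r["quality"])
    writer = csv.DictWriter(
        out,
        fieldnames=["file", "quality"],
        extrasaction="ignore",  # skip fields that are not written to the csv
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(kept)
    return kept


buffer = io.StringIO()
kept = dump_quality_csv(records, buffer)
print(buffer.getvalue())
```

Writing to an in-memory buffer keeps the sketch self-contained; in practice `out` would be a file opened with `newline=""`.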

