JamesOwers / midi_degradation_toolkit

A toolkit for generating datasets of midi files which have been degraded to be 'un-musical'.
MIT License

Some csvs have duplicate notes. #122

Closed by apmcleod 4 years ago

apmcleod commented 4 years ago

The duplicates may come from PPDD. For example: 00113f0b-d51e-4f14-9e89-eff29c257425.csv

(Previously #121)

apmcleod commented 4 years ago

Another example: acme/clean/PPDDSep2018Polyphonic/8098645f-61de-4363-ac0e-df6f58a67cba.csv

         onset  track  pitch  dur
    0      125      0     33   83
    1      125      0     33   83
    2      125      0     33   83
    3      125      0     71  125
    4      125      0     83  125
    ...    ...    ...    ...  ...
    132   3375      0     76   83
    133   3500      0     67  125
    134   3500      0     79  125
    135   3500      0     79  125
    136   3500      0     79  125

I can also confirm that they all come from PPDD-Polyphonic. I recommend recreating acme without PPDD-Polyphonic.
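A quick way to check a file for this problem is to count identical rows. The snippet below is a minimal sketch, not part of the toolkit: `find_duplicate_notes` is a hypothetical helper, and the rows are the first five from the example above, written as `(onset, track, pitch, dur)` tuples.

```python
from collections import Counter


def find_duplicate_notes(notes):
    """Return {note: count} for every note that appears more than once.

    `notes` is an iterable of (onset, track, pitch, dur) tuples.
    Hypothetical helper for illustration only.
    """
    counts = Counter(notes)
    return {note: n for note, n in counts.items() if n > 1}


# First five rows of the example csv shown above.
rows = [
    (125, 0, 33, 83),
    (125, 0, 33, 83),
    (125, 0, 33, 83),
    (125, 0, 71, 125),
    (125, 0, 83, 125),
]

duplicates = find_duplicate_notes(rows)
print(duplicates)  # {(125, 0, 33, 83): 3}
```

Only exact duplicates are reported here; notes that share a pitch and merely overlap in time would need a separate check.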

apmcleod commented 4 years ago

I can confirm this is a bug in data_structures.fix_overlaps.

apmcleod commented 4 years ago

The issue was that fix_overlaps only found notes that overlap with the first note of each pitch (or notes that overlap with it), so overlaps elsewhere at the same pitch were missed. We had no tests covering this case, but now we do (in amt PR).
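To illustrate the failure mode described above, here is a rough sketch of an overlap fix that compares every note with its predecessor at the same pitch, not just with the first note. This is not the toolkit's implementation: `fix_overlaps_sketch` is a hypothetical name, notes are plain `(onset, track, pitch, dur)` tuples, and the policy (merge notes with the same onset, trim an earlier note when a later one starts before it ends) is an assumption.

```python
from collections import defaultdict


def fix_overlaps_sketch(notes):
    """Remove duplicates and overlaps among notes sharing a track and pitch.

    `notes` is an iterable of (onset, track, pitch, dur) tuples.
    Assumed policy (illustrative only):
      * notes with the same onset are merged, keeping the longest duration;
      * a note still sounding when the next one starts is cut at that onset.
    """
    groups = defaultdict(list)
    for onset, track, pitch, dur in notes:
        groups[(track, pitch)].append((onset, dur))

    fixed = []
    for (track, pitch), group in groups.items():
        group.sort()
        kept = []  # [onset, dur] pairs, non-overlapping by construction
        for onset, dur in group:
            if kept and kept[-1][0] == onset:
                # Same onset as the previous note: merge into one note.
                kept[-1][1] = max(kept[-1][1], dur)
                continue
            if kept and kept[-1][0] + kept[-1][1] > onset:
                # Previous note overlaps this one: trim it.
                kept[-1][1] = onset - kept[-1][0]
            kept.append([onset, dur])
        fixed.extend((onset, track, pitch, dur) for onset, dur in kept)
    return sorted(fixed)


notes = [
    (125, 0, 33, 83),
    (125, 0, 33, 83),
    (125, 0, 33, 83),
    (125, 0, 71, 125),
    (200, 0, 71, 50),  # overlaps the note at onset 125
    (220, 0, 71, 10),  # overlaps the note at onset 200, not the first one
]
result = fix_overlaps_sketch(notes)
```

The last input note is the kind of case that a check against only the first note of each pitch would miss: it overlaps the note at onset 200 but starts well after the first note has been trimmed.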