napulen / phd_thesis

Automatic Roman numeral analysis in symbolic music representations.

Explain why converting corpus manually is necessary #38

Closed napulen closed 2 years ago

napulen commented 2 years ago

While correcting the corpus manually, I recorded a few videos showcasing scores that break one or more of the assumptions made by encoders. Cases like those are why I need to inspect and curate the scores by hand, or develop automated tools to do it. I still need to find those videos, but in the meantime, here is an example from DCML's Mozart Piano Sonatas:

[Image: K333-3, measure 200]

What is the length of that measure? How does the MusicXML file encode it? How does music21 read it? How do the CSV annotation files provided by DCML encode it? Instances like these are usually what break and misalign everything, and then the performance of the network goes down. That's why I curate the data by hand.
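To illustrate the misalignment problem, here is a minimal sketch in plain Python. It assumes two lists of measure lengths in quarter notes: one implied by the time signature and one as stored in an encoding. The function names and the toy data are hypothetical and are not taken from the DCML files or from music21.

```python
from fractions import Fraction


def measure_onsets(lengths):
    """Return the onset (in quarter notes) of each measure, given the measure lengths."""
    onsets, position = [], Fraction(0)
    for length in lengths:
        onsets.append(position)
        position += Fraction(length)
    return onsets


def first_misalignment(nominal_lengths, encoded_lengths):
    """Return the 1-based index of the first measure whose onset differs, or None."""
    nominal = measure_onsets(nominal_lengths)
    encoded = measure_onsets(encoded_lengths)
    for index, (a, b) in enumerate(zip(nominal, encoded), start=1):
        if a != b:
            return index
    return None


# Hypothetical toy data: four measures of nominal length 2 (quarter notes),
# where the encoding stores measure 2 with 5 quarter notes of content.
nominal = [2, 2, 2, 2]
encoded = [2, 5, 2, 2]
print(first_misalignment(nominal, encoded))  # 3
```

The over-long measure itself still starts at the right place. Every measure after it is shifted, so any annotation indexed by nominal measure position lands on the wrong notes from that point on.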

napulen commented 2 years ago

I'm sure someone will ask, "what's the deal with just using the data/tools provided in the paper?" I need to make my case by showing examples like this. This kind of thorough revision, more than the architecture etc., is what makes my algorithm work better than other models.