Explain why converting corpus manually is necessary

While correcting the corpus manually, I recorded a few videos showcasing scores that break one or more assumptions made by encoders, which is why I need to inspect and curate them by hand or develop automated tools. I still need to find those videos; in the meantime, here is an example from DCML's Mozart Piano Sonatas:

K333-3 - measure 200

How long is that measure? How does the MusicXML encode it? How does music21 read it? How do the CSV annotation files provided by DCML encode it? Instances like this are usually what break the alignment between these representations, and the performance of the network then goes down. That's why I curate the data by hand.
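As a rough illustration of the kind of automated check that could sit alongside manual curation, here is a minimal sketch that flags measures whose notated length differs from what the time signature implies. The function names and the toy data are hypothetical and are not taken from the thesis code or from the DCML corpus. In practice the two lengths could probably be read from music21 (for example `Measure.duration.quarterLength` and `Measure.barDuration.quarterLength`), but that wiring is left out here.

```python
from fractions import Fraction


def nominal_length(time_signature: str) -> Fraction:
    """Length in quarter notes implied by a simple time signature such as '3/4'."""
    numerator, denominator = (int(x) for x in time_signature.split("/"))
    return Fraction(numerator * 4, denominator)


def find_irregular_measures(measures):
    """Return (number, nominal, actual) for every measure whose length is unexpected.

    `measures` is an iterable of (number, time_signature, actual_length) tuples,
    with actual_length given in quarter notes. This input format is an assumption
    made for the sketch, not a format used by any of the tools mentioned above.
    """
    irregular = []
    for number, time_signature, actual in measures:
        nominal = nominal_length(time_signature)
        if Fraction(actual) != nominal:
            irregular.append((number, nominal, Fraction(actual)))
    return irregular


# Toy data (hypothetical, not real measures from any score).
toy_measures = [
    (1, "4/4", 4),
    (2, "4/4", Fraction(7, 2)),  # half a beat short of a full 4/4 bar
    (3, "6/8", 3),
]

flagged = find_irregular_measures(toy_measures)
for number, nominal, actual in flagged:
    print(f"measure {number}: expected {nominal} quarter notes, found {actual}")
```

A check like this would also flag legitimate cases such as pickup measures, so it only narrows down the candidates; deciding what each flagged measure should look like in the aligned data still needs a human.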

napulen / phd_thesis
