liebharc / homr

homr is an Optical Music Recognition (OMR) software designed to transform camera pictures of sheet music into machine-readable MusicXML format.
GNU Affero General Public License v3.0

Check the data sets #1

Open liebharc opened 1 month ago

liebharc commented 1 month ago

Introduction

The efficacy of a transformer model is significantly influenced by the quality of its training data. However, the original training dataset utilized by https://github.com/NetEase/Polyphonic-TrOMR/tree/master remains unpublished. Consequently, this repository relies on https://github.com/liebharc/Polyphonic-TrOMR/tree/master, which trains the transformer on datasets sourced from https://grfia.dlsi.ua.es/primus/, https://sites.google.com/view/multiscore-project/datasets, and https://github.com/itec-hust/CPMS. Notably, the grandstaff dataset requires extensive preprocessing, including the segmentation of each grand staff into individual staves. In the past, significant improvements in performance have been achieved by rectifying errors in the datasets, for example in stave segmentation, in accidental placement, and in the conversion of Humdrum files into the TrOMR semantic format.

The Task Itself

It would be helpful to have another set of eyes go through all the datasets, especially the grandstaff one. Just take a peek at some random staff images and their corresponding semantic representations. If you spot any issues, we should either tweak our preprocessing methods to fix them or just kick those problematic cases out of the datasets. That way, we won't confuse the transformer during training.
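A spot check like this could be scripted roughly as follows. This is only a sketch, not code from this repository: it assumes each staff image sits next to a label file with the same file stem, and the extensions `.png` and `.semantic` as well as the helper names `find_pairs`, `sample_for_review`, and `drop_rejected` are illustrative assumptions.

```python
import random
from pathlib import Path


def find_pairs(root, image_ext=".png", label_ext=".semantic"):
    """Return (image, label) path pairs that share a file stem under root.

    The directory layout and both extensions are assumptions; adjust them
    to whatever the preprocessed dataset actually looks like.
    """
    pairs = []
    for image in sorted(Path(root).rglob(f"*{image_ext}")):
        label = image.with_suffix(label_ext)
        if label.exists():  # images without a label file are skipped
            pairs.append((image, label))
    return pairs


def sample_for_review(pairs, k=20, seed=0):
    """Pick up to k pairs at random; a fixed seed makes the sample repeatable."""
    rng = random.Random(seed)
    return rng.sample(pairs, min(k, len(pairs)))


def drop_rejected(pairs, rejected_stems):
    """Remove pairs whose image stem a reviewer flagged as faulty."""
    rejected = set(rejected_stems)
    return [(img, lab) for img, lab in pairs if img.stem not in rejected]
```

A reviewer would call `sample_for_review(find_pairs(dataset_dir))`, open each image next to the text of its label file, and collect the stems of any mismatches. Passing those stems to `drop_rejected` is one way to keep the problematic cases out of training until the preprocessing is fixed.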

Update

The CPMS dataset has been removed for now, and the "Lieder" dataset has been added. The task itself remains important.

liebharc commented 1 week ago

With the changes that led to v0.2.0, the most severe issues in the datasets should be fixed. The fixes led to a significant improvement in performance. I'll leave this issue open, as a second pair of eyes would still be really useful.