sachindae / polyphonic-omr

Code used in research that led to the paper "An Empirical Evaluation of End-to-End Polyphonic Optical Music Recognition" (ISMIR 2021)

Having some problems with data generation #1

Open Shayebh opened 2 years ago

Shayebh commented 2 years ago

Hi, thanks for sharing your code. I'm having some trouble with data generation. In the README of the "label_gen" folder, how do I generate multiple PNGs in the fourth step (Run "Batch Convert Orig" in MuseScore on the new .mscz to .musicxml and .png)? Looking forward to your answer, thanks.

sachindae commented 2 years ago

@Shayebh With MuseScore, multiple pngs should automatically be generated using that script. There should be 1 png created for each page of the MuseScore file. What's the behavior that you are seeing?

Shayebh commented 2 years ago

That was a fast reply! I understand what you mean. My confusion is mainly that I don't see the part of the pipeline that generates the training data. I ran "genlabels.py" and got several segmented '.semantic' files, but I couldn't find how the PNGs are supposed to be segmented.

sachindae commented 2 years ago

@Shayebh Thanks for pointing that out, I just realized I forgot to include the last step of generating data -- running "genlabels.py". For the PNG files, please check the Instructions section of the README in the "label_gen" folder for setting up the MuseScore plugins. In short, to generate the PNG files, you'll need to use MuseScore, and run a plugin for that (provided in this repository) to generate PNG (image files) from .mscz (MuseScore files). Please let me know if there's anything else that's unclear!

Shayebh commented 2 years ago

Haha, it's trivial! Yes, I have used MuseScore and its plugins. Currently I'm using only one file to test the entire data generation process.

job.json:
[ { "in": "100001.mscz", "out": ["100001.xml", "100001.png"], "plugin": "batch_convert_og.qml" } ]

I run it from the command line with: mscore3 -j job.json
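
To extend the single-file job above to a whole folder of scores, the job file could be generated with a short script. This is only a minimal sketch, assuming the same job format and plugin name as in the example above; `build_jobs` and `write_job_file` are hypothetical helper names and are not part of this repository.

```python
import json
from pathlib import Path


def build_jobs(mscz_dir, plugin="batch_convert_og.qml"):
    """Build one job entry per .mscz file found in mscz_dir (hypothetical helper)."""
    jobs = []
    for path in sorted(Path(mscz_dir).glob("*.mscz")):
        stem = path.stem
        jobs.append({
            "in": path.name,
            # Same output pair as the single-file example: MusicXML plus PNG.
            "out": [f"{stem}.xml", f"{stem}.png"],
            "plugin": plugin,
        })
    return jobs


def write_job_file(mscz_dir, job_path="job.json"):
    """Write the job list as JSON and return it (hypothetical helper)."""
    jobs = build_jobs(mscz_dir)
    Path(job_path).write_text(json.dumps(jobs, indent=2), encoding="utf-8")
    return jobs
```

The resulting file would then be passed to MuseScore in the same way as before (mscore3 -j job.json), run from the folder that contains the .mscz files, since the entries use bare file names.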

If I only get one image per page, how do I segment the image by line so that it matches the segmented '.semantic' files? Maybe I didn't read the code carefully enough ^_^

sachindae commented 2 years ago

@Shayebh So the purpose of resizing the MuseScore file (one of the earlier steps in the process of generating labels) is to ensure that there is only 1 line per page. This makes it straightforward to match the .semantic files to the .png files, since the label generation can look for page breaks in the .musicxml to know when it needs to start outputting a new .semantic file for the next page/image.
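
The page-break idea can be illustrated with a small sketch. It is not the repository's actual "genlabels.py"; it assumes a partwise MusicXML file in which a page break is marked by a `<print new-page="yes"/>` element inside the first measure of the new page, and `measures_per_page` is a hypothetical name. Only the first part is inspected.

```python
import xml.etree.ElementTree as ET


def measures_per_page(musicxml_text):
    """Group measure numbers of the first part by page (hypothetical helper)."""
    root = ET.fromstring(musicxml_text)
    part = root.find("part")
    if part is None:
        return []

    pages = []
    current = []
    for measure in part.findall("measure"):
        print_el = measure.find("print")
        starts_new_page = (
            print_el is not None and print_el.get("new-page") == "yes"
        )
        # Close the current page when a break is found, unless it is still empty.
        if starts_new_page and current:
            pages.append(current)
            current = []
        current.append(measure.get("number"))
    if current:
        pages.append(current)
    return pages
```

With one line per page, each inner list would correspond to one output image and one .semantic file.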

Shayebh commented 2 years ago

Thanks, I found the error, but I haven't found a solution yet. When I run Pipeline 1, the score is not paginated by line, and the generated output is still a single complete MusicXML file. However, I can paginate it in the MuseScore software by adding a plugin.