Audiveris / omr-dataset-tools

Reference of OMR data
GNU Affero General Public License v3.0

Extensive list of classes #34

Open kwon-young opened 6 years ago

kwon-young commented 6 years ago

Hello, I'm a PhD student in Rennes working on OMR, combining deep learning with syntactical methods. I have started using MuseScore with @nasehim7's imeta branch to produce annotations from MuseScore files. The end goal is to train a deep learning detector such as Faster R-CNN to detect music symbols in printed scores. One of the problems I have encountered is that there is no extensive list of the class names MuseScore can produce. The only thing I have found is this file: https://github.com/Audiveris/omr-dataset-tools/blob/master/src/main/java/org/audiveris/omrdataset/api/OmrShape.java

One of my objectives here is to come up with a logical set of class names for the task of music symbol detection. I have already done previous work in collaboration with @apacha (https://github.com/apacha/MusicObjectDetector-TF), but on handwritten music scores, with the class set defined by the MUSCIMA++ dataset. While I know I won't be able to use the same class set for printed scores, I would like a similar one so that I can relate the results to my previous work. I have already generated a batch of annotations from public domain MuseScore files and compared the generated classes against the classes listed in OmrShape.java and the class set of the MUSCIMA++ dataset. I found some class names missing from OmrShape.java and a typo in the MuseScore generation code: flag64thDown is generated as flag64tjDown. If anyone is interested, I will be very happy to discuss this and contribute to this project!
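The comparison described above can be sketched roughly as follows. This is only an illustration, not the script actually used: the function name `compare_class_sets` and the sample class names are hypothetical (only the flag64thDown / flag64tjDown pair comes from this thread), and the 0.9 similarity cutoff is an arbitrary assumption.

```python
import difflib


def compare_class_sets(generated, reference, cutoff=0.9):
    """Compare class names seen in annotations against a reference list.

    Returns (missing, unused, likely_typos):
      missing      - names generated but absent from the reference
      unused       - reference names that never appear in the annotations
      likely_typos - missing names that nearly match a reference name
    """
    generated, reference = set(generated), set(reference)
    missing = sorted(generated - reference)
    unused = sorted(reference - generated)

    likely_typos = {}
    for name in missing:
        # A near-identical reference name suggests a typo, not a new class.
        close = difflib.get_close_matches(
            name, sorted(reference), n=1, cutoff=cutoff
        )
        if close:
            likely_typos[name] = close[0]
    return missing, unused, likely_typos


# Illustrative sample only; not the real content of OmrShape.java.
generated_names = {"noteheadBlack", "gClef", "flag64tjDown", "ornamentTrill"}
reference_names = {"noteheadBlack", "gClef", "fClef", "flag64thDown"}

missing, unused, likely_typos = compare_class_sets(
    generated_names, reference_names
)
print("missing from reference:", missing)
print("never generated:", unused)
print("likely typos:", likely_typos)
```

Names flagged as likely typos are probably bugs on the generation side, while the remaining missing names are the candidates to discuss for inclusion in the reference list.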

My last question is about data. What data does https://github.com/Audiveris/omr-dataset-tools test its implementation on? Does anyone have more information about the license of MuseScore files in the context of research work and training deep learning models?

I think this is an awesome project and I would like to thank everyone working on this!

nasehim7 commented 6 years ago

Hi @kwon-young, we are happy that you are excited about this project; we are too. The basic infrastructure for the OMR task is almost done. We are currently in the testing phase and are refining the codebase based on the results, which will take time. Our next task is the one you describe above: adding more shapes to Audiveris (OmrShape.java) so that the shapes Audiveris understands are in line with the shapes the MuseScore XML presents it with. Thanks also for pointing out the typo; we are working on it. Several new pieces of implementation were needed for the OMR work and they are almost finished, so we are now checking and editing them as required. @lasconic, @hbitteur, and I are discussing and testing things to work out the best structure for our work. Special thanks to @wschweer for initiating the implementation and guiding me through the codebase. For the rest, the core contributors are better placed to give a more complete answer. Good luck with your research work, @kwon-young. :)

kwon-young commented 6 years ago

Thanks for your reply! I would like to contribute back my analysis of the class names produced by MuseScore. (To be honest, it was really boring work and I hope nobody has to do it again ...) Should I post a list of all the missing class names here so that we can discuss the relevance of each one in the context of OMR? I can also help implement some of the missing classes in MuseScore.

lasconic commented 6 years ago

Hi @kwon-young, sure, do share your findings and let's see if we can improve on this point. So far we have tried to follow SMuFL.

lasconic commented 6 years ago

Regarding the license, using CC0 scores is safe for any use case, including OMR.

nasehim7 commented 6 years ago

@kwon-young Yes, it would be great if you could share your work with us.

kwon-young commented 6 years ago

Okay, I'll try to compile this into a table shortly. About the license: I have an API key to search for MuseScore files, but the search field for license only allows these values: to_modify_commercially, to_use_commercially, to_share, to_play. Which ones are relevant to the CC0 license? Does CC0 include the public domain license, or is it something different?

lasconic commented 6 years ago

to_modify_commercially should contain CC0 content. See https://en.wikipedia.org/wiki/Creative_Commons_license#Zero_/_public_domain