Get the symbol localization ground-truth

kwon-young commented 7 years ago

Hello. First, I would like to thank you for porting this project to GitHub. I'm a PhD student at the IRISA lab in Rennes, France, and I'm currently working on optical music recognition using deep learning methods. However, these methods require a large amount of data, and I'm looking for tools that will help me construct a dataset containing the ground-truth localization of the music symbols in a music score. From my experiments with Audiveris, it seems to give fairly good results and could help me grow this dataset even faster. But I need to extract the physical position (or bounding box) of the music symbols in a score from the internal data structures of the Java code. Could you give me some advice on where to start in order to extract this information from the code?

Thank you in advance for your help!

jlpoolen commented 7 years ago

I'm not sure what you mean by "bounding box." I am guessing that once a glyph is matched, then you would have a bounding box. But getting to the point of matching glyphs is further into the process and occurs after the marshalling of staff lines and bars. And I do not know what you mean by "ground-truth localization."

I found there are problems with simply identifying staves on a page. That involved understanding the "runs" tables and the models and logic to determine what are staff lines and bar lines. I think there were some shortcomings in the way both tables of runs were assessed, insofar as once a black pixel was identified as part of a bar, it no longer qualified for a staff line. Or vice versa; it's been several months since I've been immersed in this complicated code.

I'd start with mastering an understanding of the "runs" analysis. The Java node: omr.run
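For readers new to the term, a run here is a maximal sequence of consecutive black pixels along a row or a column. The following is a minimal Python sketch of building horizontal and vertical run tables from a small binary image. It is an illustration only, not the Audiveris implementation, and the names `black_runs`, `horizontal_runs`, and `vertical_runs` are hypothetical.

```python
from typing import List, Sequence, Tuple

Run = Tuple[int, int]  # (start index, length) of consecutive black pixels


def black_runs(line: Sequence[int]) -> List[Run]:
    """Return (start, length) for each maximal run of black (1) pixels."""
    runs: List[Run] = []
    start = None
    for i, pixel in enumerate(line):
        if pixel and start is None:
            start = i  # a new run begins here
        elif not pixel and start is not None:
            runs.append((start, i - start))  # the run ended just before i
            start = None
    if start is not None:
        runs.append((start, len(line) - start))  # run reaches the line end
    return runs


def horizontal_runs(image: Sequence[Sequence[int]]) -> List[List[Run]]:
    """One list of runs per image row."""
    return [black_runs(row) for row in image]


def vertical_runs(image: Sequence[Sequence[int]]) -> List[List[Run]]:
    """One list of runs per image column."""
    return [black_runs(column) for column in zip(*image)]


# Toy image: a horizontal line (row 1) crossed by a vertical line (column 2).
image = [
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
]
print(horizontal_runs(image))
print(vertical_runs(image))
```

In this toy image the pixel at row 1, column 2 belongs both to the long horizontal run and to the long vertical run, which is the kind of shared pixel where a staff line and a bar cross.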

I did a dump of my working notes (from KeepNote) here: http://editionspoole.com/Audiveris-2016-12-23/

You may want to look at the ProjectGlossary page first.

kwon-young commented 7 years ago

By bounding box, I mean the rectangle that contains a music symbol, or what you call a glyph. I saw in your project glossary that it is called Absolute Contour in the code.
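To make the definition concrete, here is a minimal Python sketch that computes the smallest axis-aligned rectangle enclosing a set of pixel coordinates. It assumes a glyph is available as a collection of `(x, y)` pixels; the `BoundingBox` type and the `bounding_box` function are hypothetical names and do not correspond to anything in the Audiveris code.

```python
from typing import Iterable, NamedTuple, Tuple


class BoundingBox(NamedTuple):
    x: int       # left edge (smallest x)
    y: int       # top edge (smallest y)
    width: int   # number of pixel columns covered
    height: int  # number of pixel rows covered


def bounding_box(pixels: Iterable[Tuple[int, int]]) -> BoundingBox:
    """Smallest axis-aligned rectangle containing every (x, y) pixel."""
    points = list(pixels)
    if not points:
        raise ValueError("a glyph needs at least one pixel")
    xs, ys = zip(*points)
    left, top = min(xs), min(ys)
    # +1 because both the first and the last pixel are inside the box.
    return BoundingBox(left, top, max(xs) - left + 1, max(ys) - top + 1)


# Hypothetical glyph pixels in page coordinates.
glyph = [(12, 40), (13, 40), (13, 41), (15, 44)]
print(bounding_box(glyph))
```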

I'm using the term ground truth because I'm doing machine learning, where the ground truth is the reference output stored in the dataset, as opposed to the output predicted by a model.

What I would like to extract from the internal Java structures of the code is the localization (or bounding box) of every glyph recognized in the current music score page. The fact that the system makes mistakes is not a problem, as this is one of my PhD research problems!

In any case, thank you very much for all your notes and advice.

Another thought: I saw that at runtime you can open a debug window containing all the data structures of the recognized music score. Is there a way to dump all that information into a text file? I could then extract all the information I need by parsing it.

jlpoolen commented 7 years ago

I'm not aware of any such dump. My modifications to Audiveris included the ability to dump the stave and measure coordinates in XML. You may want to study that and possibly apply it to whatever accumulator there is for glyphs.
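The XML layout produced by those modifications is not shown in this thread, so the following Python sketch uses a purely hypothetical layout (a `page` element holding `symbol` elements with `shape`, `x`, `y`, `width`, and `height` attributes). It only illustrates the general idea of dumping bounding boxes to XML and reading them back as ground-truth records.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# (shape name, x, y, width, height); all names and values are illustrative.
Symbol = Tuple[str, int, int, int, int]


def symbols_to_xml(symbols: List[Symbol]) -> str:
    """Serialize symbols into the hypothetical <page>/<symbol> layout."""
    root = ET.Element("page")
    for shape, x, y, width, height in symbols:
        ET.SubElement(
            root,
            "symbol",
            shape=shape,
            x=str(x),
            y=str(y),
            width=str(width),
            height=str(height),
        )
    return ET.tostring(root, encoding="unicode")


def xml_to_symbols(text: str) -> List[Symbol]:
    """Parse the same layout back into (shape, x, y, width, height) tuples."""
    root = ET.fromstring(text)
    return [
        (
            el.get("shape"),
            int(el.get("x")),
            int(el.get("y")),
            int(el.get("width")),
            int(el.get("height")),
        )
        for el in root.iter("symbol")
    ]


symbols = [("G_CLEF", 120, 310, 42, 110), ("NOTEHEAD_BLACK", 260, 352, 18, 14)]
dump = symbols_to_xml(symbols)
print(dump)
print(xml_to_symbols(dump) == symbols)
```

Keeping the shape name and the box in attributes keeps each symbol on a single element, which makes the dump easy to parse from a training script.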

kwon-young commented 7 years ago

Yes, that would be very useful! Could you point me to the relevant part of the code?

jlpoolen commented 7 years ago

It would take me as long to do so as it would you. Therefore, you'll have to do it yourself.

kwon-young commented 7 years ago

Sorry, yes, I found it myself.

jlpoolen / libreveris

Get the symbol localization ground-truth #9