Audiveris / omr-dataset-tools

Reference of OMR data
GNU Affero General Public License v3.0

Nested symbols #21

Open hbitteur opened 7 years ago

hbitteur commented 7 years ago

Some symbols can be considered containers for inner symbols. This could apply to time signatures, key signatures, repeat signs, etc. For example, using SMuFL names, a repeatRight is an outer symbol composed of 4 inner symbols: 2 repeatDots + 1 barlineSingle + 1 barlineHeavy.

See the repeatRight image and notice that we have both the bounds of the outer symbol and the bounds of each inner symbol.

Right now, Audiveris needs the inner symbols, but this may evolve in the future, based on full-context results. For the widest usability, the OMR DataSet should contain both "outers" and "inners", while preserving the containment relationship.

The proposal is simple: allow a Symbol element to contain other (sub) Symbol elements. Any reading software will thus be able to pick up the info it is interested in, according to the symbol shape at hand.

Here is the repeatRight example translated into XML:

    <Symbol interline="14" shape="repeatRight">
        <Bounds x="1705" y="2758" w="30" h="62"/>
        <Symbol interline="14" shape="repeatDot">
            <Bounds x="1705" y="2778" w="7" h="8"/>
        </Symbol>
        <Symbol interline="14" shape="repeatDot">
            <Bounds x="1706" y="2794" w="7" h="7"/>
        </Symbol>
        <Symbol interline="14" shape="barlineSingle">
            <Bounds x="1719" y="2758" w="3" h="62"/>
        </Symbol>
        <Symbol interline="14" shape="barlineHeavy">
            <Bounds x="1725" y="2758" w="10" h="62"/>
        </Symbol>
    </Symbol>
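To illustrate, here is a minimal sketch (plain Python, helper names of my own choosing, trimmed to two inner symbols) of how reading software could walk such nested Symbol elements and pick up either the outer bounds or the inner ones:

```python
import xml.etree.ElementTree as ET

# Trimmed version of the repeatRight example above (two inners only).
XML = """\
<Symbol interline="14" shape="repeatRight">
    <Bounds x="1705" y="2758" w="30" h="62"/>
    <Symbol interline="14" shape="repeatDot">
        <Bounds x="1705" y="2778" w="7" h="8"/>
    </Symbol>
    <Symbol interline="14" shape="barlineSingle">
        <Bounds x="1719" y="2758" w="3" h="62"/>
    </Symbol>
</Symbol>"""

def walk(symbol, depth=0):
    """Yield (depth, shape, bounds) for a Symbol and its nested Symbols."""
    box = {k: int(v) for k, v in symbol.find("Bounds").attrib.items()}
    yield depth, symbol.get("shape"), box
    for child in symbol.findall("Symbol"):  # direct children only
        yield from walk(child, depth + 1)

for depth, shape, box in walk(ET.fromstring(XML)):
    print("  " * depth + shape, box)
```

A consumer interested only in composite symbols keeps the depth-0 entries; one training on primitives keeps the leaves.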
hajicj commented 7 years ago

Dear fellow OMRists,

I recently published an OMR dataset (https://ufal.mff.cuni.cz/muscima), and had to solve the same problems when designing the representation.

The nested structure orders the symbols into a forest graph, but at some point, you need to start duplicating objects: e.g., with noteheads that have two stems attached when two voices meet, or with back-to-back repeats (shared barlines in the middle). My solution (https://arxiv.org/pdf/1703.04824.pdf) was to use a general DAG, not a forest, and simply record the dependencies between the objects.
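As a toy illustration of the difference (the object names below are made up): in a forest each object has at most one parent, so the shared barline of back-to-back repeats would have to be duplicated, while in a DAG it stays a single object with two incoming dependencies.

```python
# Symbols as nodes, dependencies as directed edges (container -> part).
# Names are illustrative; the point is that one part may have several containers.
edges = {
    "repeatLeft":  ["barlineHeavy#1", "repeatDots#L"],
    "repeatRight": ["barlineHeavy#1", "repeatDots#R"],  # shared thick barline
}

def containers(part):
    """All containers that depend on a given object."""
    return sorted(p for p, parts in edges.items() if part in parts)

print(containers("barlineHeavy#1"))  # ['repeatLeft', 'repeatRight']
```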

I would be happy to collaborate on creating the ground truth definitions in general.

-Jan Hajič jr.

hbitteur commented 7 years ago

Ahoj Jan,

I'm sorry for such a late answer; I'm on vacation right now, with very, very poor Internet access. Reading and/or writing a mail takes ages.

I will get back to you when I can really look at the URLs you mentioned (when I return within a week or so).

I'm very interested in discussing these points with you, and in collaborating if possible on this ground truth material.

Sorry for the additional delay.

/Hervé


hbitteur commented 7 years ago

Finally, I have been able to access your site, late at night. The main difference is that you are aiming at handwritten music, while here we deal with printed music only. But your user interface for annotating music entities is something we are looking for. Regarding the data model (nesting relationship vs. a more general relationship), there is a difference between the very simple model of the OMR dataset (meant to train a classifier on entity patches) and the more complex model used internally by Audiveris (where we use a graph of entities with suitable relationships). See the description in the wiki part of the audiveris repo, especially https://github.com/Audiveris/audiveris/wiki/sheet%23N-xml

hajicj commented 7 years ago

Don't worry about delays, I have way too much work anyway...

Thanks for the documentation link! It would be good for my data format to be compatible with the Audiveris SIG, so that you can eventually train whatever systems you have for manuscripts as well as printed music.

Re: MUSCIMarker UI - I will be happy to collaborate to make it useful to you. Feel free to create issues in the project repo with the type:question label; if you want a demo, we could perhaps arrange a TeamViewer session or something similar.

Have a great vacation!

hbitteur commented 7 years ago

Hello Jan,

(I'm back to work at home now).

The data model used for the omr-dataset-tools repository was quickly defined during the days that followed the Salzburg musical hackday. It is primarily meant to allow the training of a "patch" classifier on as large a dataset as possible. A "patch" classifier works on fixed-size sub-images (hence the name "patch") centered on musical symbols in a properly scaled image. For the training, we don't remove the staff lines, nor do we try to figure out which precise pixels compose the symbol at hand; it is the job of the classifier's automatic training to pick up relevant "contextual" information from the patch pixels.

I personally have very high expectations for such a patch classifier, precisely because at run-time it needs no prior staff removal or glyph segmentation: simply point the classifier at a given image location, and read which symbols (if any) can be found at that location. Perhaps the classifier will be able to distinguish between, say, a 1-sharp key signature and a sharp alteration based on contextual pixels; at least we hope so, and we defined different symbol names for that purpose.
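As a rough sketch of what "pointing the classifier at a location" involves (pure Python; the patch size and padding value are arbitrary choices for illustration, not what any tool here actually uses):

```python
def patch_at(image, cx, cy, size=21, background=255):
    """Extract a fixed-size patch centered on (cx, cy).

    Pixels falling outside the image are padded with the background
    value, so the patch shape is constant regardless of position.
    """
    half = size // 2
    h, w = len(image), len(image[0])
    return [
        [image[y][x] if 0 <= y < h and 0 <= x < w else background
         for x in range(cx - half, cx + half + 1)]
        for y in range(cy - half, cy + half + 1)
    ]

page = [[0] * 100 for _ in range(50)]    # dummy 100x50 all-ink "page"
p = patch_at(page, cx=10, cy=5)
assert len(p) == 21 and len(p[0]) == 21  # fixed size, even near the border
```

Staff lines and neighboring symbols simply stay in the patch; the network is left to learn which pixels are context and which are the symbol.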

Will a patch classifier work on primitive symbols or on composite symbols? For example, in a repeat sign, should the classifier recognize each dot separately, or directly the pair of dots as a whole? Or the whole sequence of dots + thin barline + thick barline? We have left this option open, and for the widest possible use of the dataset, we encourage the use of nested definitions in that case.

A very different work is the OMR itself. The ".omr" file of an Audiveris project is meant to carry all the relevant information the OMR has been able to collect. Comparing with the 4 major steps in your MUSCIMA++ description, this must be very close to the information you have at the end of step 3 (recovering the logical structure of the score). Strictly speaking, we could consider that your step 4 (final representation as MusicXML, MEI, MIDI, LilyPond, PDF, you name it...) is not part of OMR, but simply a (partial) export to some given format.

The ".omr" file is just the marshalling of in-memory data, and the data model is centered on the notion of a SIG (symbol interpretation graph) for each system. A SIG is a DAG, with interpretations as vertices and relationships as edges. Granted, there are still a few nesting cases here and there, mainly for historical reasons, and right now I'm considering replacing some of them with standard relationships (because that will ease the implementation of undo/redo user actions). So yes, the SIG model is what should be compared with your MUSCIMA model, and the discussion is likely to be very fruitful.
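A toy sketch of the SIG idea, with made-up vertex and relation names (not Audiveris's actual classes): interpretations are vertices, typed relationships are edges, and sharing is allowed because the structure is a DAG rather than a tree.

```python
# Interpretations as vertices, typed relationships as edges.
# (All names here are illustrative only.)
inters = {"head-1", "stem-1", "stem-2", "beam-1"}
relations = [
    ("head-1", "stem-1", "HeadStem"),  # one head linked to two stems:
    ("head-1", "stem-2", "HeadStem"),  # fine in a DAG, impossible in a tree
    ("stem-1", "beam-1", "BeamStem"),
]
assert all(s in inters and t in inters for s, t, _ in relations)

def related(inter, kind):
    """Interpretations linked from `inter` by a relation of the given kind."""
    return [t for s, t, k in relations if s == inter and k == kind]

print(related("head-1", "HeadStem"))  # ['stem-1', 'stem-2']
```

With this shape, undoing a user action presumably reduces to inserting or removing edges, which may be why replacing nesting with explicit relationships eases undo/redo.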

The data model in omr-dataset-tools is very poor compared to the SIG model; the "annotation" is actually no more than a (limited) export of the .omr project file. Also, if we discover that we have overlooked something in this dataset, it should not be a big deal to correct the annotation method and re-run the export on the OMR sources.

This is indeed a point where we diverge, because of your focus on handwritten scores and mine on printed scores: I can always re-run the export, while in your case you cannot ask your human writers to redo their work! :-)

Best regards, /Hervé


hajicj commented 7 years ago

Hello Hervé,

This usage of nesting makes sense, indeed -- basically, it defines what your lowest-level symbol is.

On the subject of Music Hack Days -- will you be at the upcoming Waves Vienna (http://www.wavesvienna.com/hackday/?lang=en)?

I'm currently improving MUSCIMarker for a new round of my own annotations -- I need to start dealing with grayscale images, poor-quality photos, etc., and I'm also extending the data format to deal with temporal relationships. Will you be using MUSCIMarker? If yes, now (=1st half of September) would be a good time to check it out and tell me if you need some changes, since I'm spending time on it anyway.

The offer of a real-time MUSCIMarker demo session still stands, by the way.

Best, -Jan

hbitteur commented 7 years ago

Hi Jan,

We'd better interact by private email now, to avoid polluting the public pages of omr-dataset-tools.

You can join me on: herve.bitteur at audiveris dot org

/Hervé
