Audiveris / audiveris

Latest generation of Audiveris OMR engine
https://audiveris.github.io/audiveris
GNU Affero General Public License v3.0

Rest recognition #481

Closed agorji closed 3 years ago

agorji commented 3 years ago

Hello there,

I have an issue with Audiveris: some rests are not recognized automatically despite a high classifier score. As an example, in the first line here, some rest glyphs are not recognized as an inter, although a similar rest with the same classifier score is (the screenshot was generated with v5.1, but the result is even worse on the latest development branch): image

I tried to figure out why this is happening and played around with some options to solve the issue, but none of them was successful! Here you can find the original pdf: O lux beata Trinitas (Alberti, Johann Friedrich).pdf

hbitteur commented 3 years ago

@agorji There seem to be several problems in the sheet at hand:

  1. As you mentioned, some breve rest symbols, though given a very good score by the glyph classifier, don't make their way up to the final score. Perhaps because(?) there already exists another breve rest in the same measure. I need to check this.
  2. Similar breve rest glyphs aren't correctly classified. I looked in the global samples repository, the one used to train the classifier: there are a few shapes for which we have only one sample, artificially derived from a music font. Breve rest, as well as long rest, is one of these "singleton" samples. I will need additional samples to augment the sample repository. Of course, identical glyphs are not enough, we need true diversity in these samples. Could you provide me with additional ones, if possible from non-synthetic scores?
  3. I noticed a couple of whole rests, correctly recognized by the glyph classifier as HW_REST_set (same glyph shape is used for whole and half rests, only their location with respect to staff line will tell us), do not lead to symbols in the final score. To be investigated.

Thanks for pointing this out. Stay tuned.

hbitteur commented 3 years ago

Answer to point 1:

The glyph classifier correctly recognized the breve rest, but there are additional checks performed on top of the (purely shape-based) glyph classifier. Specifically for breve rest, the symbol must be vertically centered on pitch position -1 (this means the middle between lines 2 and 3, counted top down). Similarly for long rest, expected pitch position is 0, which means middle of staff (line 3).
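To make the positional check concrete, here is a minimal Python sketch. It is not Audiveris code: the function names, the tolerance, and the way the pitch position is derived from pixel ordinates are assumptions made for illustration. Only the expected pitch positions (-1 for a breve rest, 0 for a long rest) are taken from the explanation above.

```python
# Illustrative sketch only; names and tolerance are hypothetical, not Audiveris API.

# Expected pitch positions quoted in the comment above.
EXPECTED_PITCH = {"BREVE_REST": -1.0, "LONG_REST": 0.0}


def pitch_position(center_y, middle_line_y, interline):
    """Pitch position of an ordinate: 0 on the middle line, -4 on the top line,
    +4 on the bottom line (one unit is half an interline, y grows downwards)."""
    return 2.0 * (center_y - middle_line_y) / interline


def passes_pitch_check(shape, center_y, middle_line_y, interline, tolerance=0.5):
    """Accept a candidate only if its center lies near the expected pitch."""
    expected = EXPECTED_PITCH.get(shape)
    if expected is None:
        return True  # no positional constraint modelled for other shapes
    actual = pitch_position(center_y, middle_line_y, interline)
    return abs(actual - expected) <= tolerance


# Staff with an interline of 20 px whose middle line sits at y = 100 px.
print(passes_pitch_check("BREVE_REST", 90, 100, 20))   # centered between lines 2 and 3
print(passes_pitch_check("BREVE_REST", 130, 100, 20))  # far below the expected space
```

With such a check, a glyph can get an excellent shape score and still be discarded when it sits elsewhere in the staff, which matches the behavior reported in this issue.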

I checked on the Internet, for breve rest location. For example wikipedia or ultimate music theory. They confirm the constraint on vertical location within the staff.

I also read that breve rests are meant for a duration of 2 bars (otherwise we would use a whole rest instead) in a 4/2 time signature. In the score at hand, we have a common cut signature, which is worth 2/2. In fact, we have 4 half notes in each measure of this score, so it looks like a 4/2 signature instead of the common cut 2/2.
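The arithmetic behind this observation can be checked with a few lines of Python. This is only a worked example, taking the whole note as the unit of duration; the helper name is made up for the illustration.

```python
from fractions import Fraction

WHOLE = Fraction(1)      # a whole note is taken as the unit of duration
HALF = Fraction(1, 2)


def measure_duration(beats, beat_unit):
    """Nominal duration of one measure for a beats/beat_unit time signature."""
    return Fraction(beats, beat_unit)


observed = 4 * HALF                    # four half notes per measure in the score
common_cut = measure_duration(2, 2)    # what the printed signature implies
four_two = measure_duration(4, 2)      # what the measure content implies

print(observed, common_cut, four_two)  # 2 1 2
```

The measure content is twice as long as a 2/2 measure and exactly as long as a 4/2 measure.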

So, what should we do? Is the example at hand a "legal" score for the common western notation that Audiveris claims to support? When I read the dates (1642-1710), I think the answer is probably no. So I propose to leave the current Audiveris logic as it is, otherwise we may face collateral damage (such as false breve rests). You will have to manually add the missing breve rests on your own, not a big deal after all for an old score.

I still need to address points 2 and 3.

hbitteur commented 3 years ago

Answer to point 2

I confirm that the breve rest glyphs are similar but not identical. You can check this by opening the Glyph board (right-click in the left column of the Audiveris window).

The only breve rest symbol in global sample repository has dimensions width:0.6 height:1.0 for a font interline of 20 pixels (while your example has an interline value of 19 pixels).
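As a rough illustration of what these figures mean, the sketch below converts between pixel sizes and interline-normalized sizes. It assumes that the width and height quoted above are expressed as fractions of the interline; the function names are hypothetical and this is not how Audiveris necessarily stores samples.

```python
# Illustrative sketch; assumes sample dimensions are fractions of the interline.

def to_pixels(fraction, interline):
    """Convert a size expressed in interline fractions to pixels."""
    return fraction * interline


def to_interline_fraction(pixels, interline):
    """Convert a pixel size to interline fractions (scale-independent)."""
    return pixels / interline


sample_w, sample_h = 0.6, 1.0  # the single breve rest sample quoted above
for interline in (20, 19):     # font interline, then the interline of this score
    w = to_pixels(sample_w, interline)
    h = to_pixels(sample_h, interline)
    print(f"interline {interline}: {w:.1f} x {h:.1f} px")
```

Under that assumption, the difference in interline (20 versus 19 pixels) is absorbed by the normalization, so what matters for the classifier is the variation in glyph shape rather than the scan resolution.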

This confirms that the problem lies in the lack of enough available samples to train the classifier. I propose to use representative glyphs from the score at hand, is this OK with you? (if there are copyrights on the score image, I can just extract a few glyphs but not publish the containing image, just tell me). This will fix the classifier recognition for your score, but for better results in future cases, do you have other scores available that we could use for wider diversity?

hbitteur commented 3 years ago

Answer to point 3

Whole rest candidates are not accepted by Audiveris if they are away from the middle staff line and overlap vertically with other notes in the same measure. This is the case for measure 1 (whole rest just above staff 2, followed by a whole note). This is also the case for measure 4 (whole rest in the lower part of staff 1, followed by a whole note).

This stems from the fact that we have "long" measures (instead of sequences of 2 "standard" measures). Symbols within such long measures seem to not comply with the rules within one "standard" measure.

As for point 1, I propose to not modify Audiveris implemented logic, but let the end-user proceed with a few manual corrections.

agorji commented 3 years ago

Thank you very much Hervé for following up on the issue and your detailed analysis.

Your inquiry for more samples: I have looked for some image-based sheets containing long and breve rests. I couldn't yet find long rests but found a couple of sheets that include breve rests. I am afraid I am not allowed to upload them on public, but I can send them to you for glyph extraction if you can please share an email address with me.

Your answer for points 1 and 3: Now it makes more sense why this is happening. I accept that the 4/2 timing is odd given the cut-common time signature. But I would say the issue is still valid because the sign could simply be 4/2 instead. Your point about the position of whole and breve rests seems legitimate to me. But I guess the references that you mentioned only discuss monophonic notation. Intuitively, there should be a way to indicate multiple rests in a polyphonic line. I tried to find some references addressing this issue and found these two; I guess both are from the documentation of notation software: https://steinberg.help/dorico/v1/en/dorico/topics/notation_reference/notation_reference_voices_implicit_rests_c.html https://lilypond.org/doc/v2.18/Documentation/notation/writing-rests (Positioning multi-measure rests)

I am not sure whether their content is aligned with common western notation paradigms. But if it is, the constraint on the vertical position of whole and breve rests is probably too restrictive for polyphonic sheets. Anyway, I am working with quite old sheets in which this kind of notation is frequent, and it actually helps to track multiple voices in a polyphonic piece. So regardless of the alignment, I would greatly appreciate it if we could come up with better constraints for whole and breve rests that let Audiveris support this sort of notation.

hbitteur commented 3 years ago

Thanks for your information. I'll browse the links you've listed.

I understand that current implemented restrictions may impede your work on old sheets. And at the same time, supporting them transparently may impede the work of other users.

So, I think we could define an additional processing flag which would be off by default (there are already a bunch of them in menu "Book > Set book parameters"). Do you have any suggestion for the name of this new flag? Setting this flag "on" would relax the restrictions we discussed.

That way you could define your own way of processing scores (by default, by book, by sheet). I think this would be the safe way to go for all of us.

hbitteur commented 3 years ago

" I can send them to you for glyph extraction if you can please share an email address with me."

My private mail is

hbitteur commented 3 years ago

https://steinberg.help/dorico/v1/en/dorico/topics/notation_reference/notation_reference_voices_implicit_rests_c.html

This page is about showing implicit or explicit rests, but it doesn't say a word, or give an example, about rests longer than a half rest. And our discussion was about longer rests (whole, breve, long).

https://lilypond.org/doc/v2.18/Documentation/notation/writing-rests (Positioning multi-measure rests)

This page is about how to make Lilypond print a (maxima, longa, breve) rest. It says nothing about the legal vertical positioning of these symbols.

But OK, let's assume the (transcribing) user is in the driver's seat and can decide on his own how these symbols must be checked, according to setting of the dedicated flag I proposed above.

agorji commented 3 years ago

I have sent you the samples.

Indeed having an option for separating the modes is a good idea. Thank you very much for adding that. For sure you are the best to come up with a name that is consistent with the rest. But since you have asked, I would propose "Support for unconventional rest positions".

Maybe a dumb question, but can book parameters be set through CLI options? (Like the options that you set using Tools->Options)

hbitteur commented 3 years ago

> Maybe a dumb question, but can book parameters be set through CLI options? (Like the options that you set using Tools->Options)

Not that dumb, because I had to check the code myself! :-) And the answer is:

hbitteur commented 3 years ago

> I have sent you the samples.

You mean that you just sent them by email (I haven't received any mail yet) or you mean that you have already sent the samples (via the score you published in your very first post) ?

agorji commented 3 years ago

Thank you very much for the comprehensive answer. It covered every corner-case question I could come up with about setting the parameters!

Then I look forward to testing the new feature as soon as it has been added.

agorji commented 3 years ago

> > I have sent you the samples.
>
> You mean that you just sent them by email (I haven't received any mail yet) or you mean that you have already sent the samples (via the score you published in your very first post) ?

By email, I meant. Let me check the receiver address again. Thanks for informing me.

Update: You should have received the email by now because the address was correct. I've resent it though.

hbitteur commented 3 years ago

Still no mail, there must be something wrong today with Audiveris domain redirections.

Update: I found your mails this morning, they had ended up in the spam tray...

agorji commented 3 years ago

Another case with eighth and quarter rests (screenshots are from Audiveris v5.1):

Screen Shot 2021-04-05 at 9 08 43 PM
Screen Shot 2021-04-05 at 9 09 04 PM

As you can see above, in measures 42-44 and 190, rests are not assigned. I believe this has nothing to do with the classifier since the scores are high and similar glyphs in the sheet are assigned an inter. Here you can find the original sheet: Trio Sonata in G major, RV 71 (Vivaldi, Antonio)_ Guitar 1 Part.pdf

A side note: the current development version stops at the STEMS step and throws java.lang.NullPointerException on the third page of this sheet. Let me know if this is not the case on your side and you need any extra logs.

hbitteur commented 3 years ago

> in measures 42-44 and 190, rests are not assigned. I believe this has nothing to do with the classifier since the scores are high and similar glyphs in the sheet are assigned an inter.

I think I know where the problem is in Audiveris current logic: a rest can be away from staff middle line only if there are head chord(s) rather aligned horizontally with the rest candidate. In the case at hand, we have no heads at all, but a full line of rests only.

Problem: this will be a chicken and egg problem, because rest candidates will need other rest candidates, and vice versa! (whereas heads are retrieved in HEADS step, and rests in SYMBOLS step later down the pipeline). To be solved...

For measure 190, the quarter rests were far from the staff middle line and the preceding head-chord was too low vertically. By comparison, see measure 184, which is very similar to 190, except that the 2 quarter rests happened to vertically overlap the head-chords above, resulting in the rests being accepted even though they don't belong to the same voice as the vertically overlapping head-chords.

This is getting messy. I feel like I should totally discard this test for the vertical overlap of rests with other chords, be they rest-chords or head-chords. In this score, which is in fact a 4/2 rather than a common_cut 2/2, we can have whole rests side by side with other chords, and belonging to a time slot, rather than staying alone in the middle of the measure width. In other words, the "whole" rest behaves just like another simple rest (like a half rest, for example). Only the breve rest and the long rest would stay alone in the middle of measure width. Does this make sense? And how could we name this "specific mode"? Not "unconventionalRestPositions" because it's no longer a question of vertical position but merely a question related only to whole rest. In this mode, a WHOLE REST is just a rest whose duration is 1, just like the WHOLE(_NOTE). It no longer means the "whole duration of the measure".

Update: I can suggest "partialWholeRests" for the name of this specific mode. It sounds like an oxymoron, which it is in some way!

agorji commented 3 years ago

Sorry for the delay. I hadn't noticed the updates since I wasn't notified by email.

> For measure 190, the quarter rests were far from the staff middle line and the preceding head-chord was too low vertically. By comparison, see measure 184, which is very similar to 190, except that the 2 quarter rests happened to vertically overlap the head-chords above, resulting in the rests being accepted even though they don't belong to the same voice as the vertically overlapping head-chords.

That actually seems to be true. Another observation, in measure 191 which is in fact very similar to measure 190, the rests are assigned probably just because of a vertically aligned head (luckily here with the same voice).

> This is getting messy. I feel like I should totally discard this test for the vertical overlap of rests with other chords, be they rest-chords or head-chords.

To be honest, I would like rests to be treated similarly to heads. I understand the motivation for verifying vertical alignment before accepting a rest that is moved away from the middle line. But it would be great if this check could be removed, or at least relaxed to accept any alignment, including rest-rest. Then all these valid corner cases would be covered.

> In this score, which is in fact a 4/2 rather than a common_cut 2/2, we can have whole rests side by side with other chords, and belonging to a time slot, rather than staying alone in the middle of the measure width. In other words, the "whole" rest behaves just like another simple rest (like a half rest, for example). Only the breve rest and the long rest would stay alone in the middle of measure width. Does this make sense?

Yes, that totally makes sense. I also like the way you constrained breve and long rests only in terms of horizontal position and not vertical position (which translates into the line they lie on).

Plus, about the cut-common, the Wikipedia article that you just shared earlier explains that it was considered as 2:1 instead of modern interpretation as 2:2! (look at Alla breve section)

> And how could we name this "specific mode"? Not "unconventionalRestPositions" because it's no longer a question of vertical position but merely a question related only to whole rest. In this mode, a WHOLE REST is just a rest whose duration is 1, just like the WHOLE(_NOTE). It no longer means the "whole duration of the measure".
>
> Update: I can suggest "partialWholeRests" for the name of this specific mode. It sounds like an oxymoron, which it is in some way!

I suggested "unconventional rest positions" because to me it was not just about whole rests, but also meant to ignore constraints on breve and long rests as well (at least the vertical one).

If it is going to be an exclusive feature for whole rests, "partialWholeRests" is nice for the sense of contradiction that it conveys! I also like "movingWholeRests", which focuses more on the fact that technically it is still just about the positions.

hbitteur commented 3 years ago

I hadn't noticed that updating a message didn't end up in a notification. This means that I should avoid appending information by updating an existing message. Sorry about that.

While modifying some old pieces of code I discovered that (certainly a long time ago) I had implemented the possibility for a whole/half rest to be away from staff middle line. So, in theory all the rest symbols are already considered as "vertically moving". Unfortunately, the code was not consistent for whole and breve/long rests so it couldn't fully work in the end for these shapes. This is now fixed, the question of vertical positions on staff is now correctly managed for all rest shapes.

I can confirm that we simply need to drive the logic about the duration of whole rests specifically. The user will be able to play with the partialWholeRests flag to tell Audiveris if, on the sheet at hand, whole rests must be considered as "measure-long rests" (like the breve and long rests) or mere rests with duration value 1.
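The effect of the flag on whole-rest duration can be sketched in a few lines of Python. The flag name comes from this discussion; the function itself is hypothetical and only illustrates the two interpretations, it is not the Audiveris implementation.

```python
from fractions import Fraction


def whole_rest_duration(measure_duration, partial_whole_rests):
    """Duration assigned to a whole rest (whole note = 1).

    partial_whole_rests=True : the rest is a mere rest of duration 1.
    partial_whole_rests=False: the rest is a measure-long rest.
    """
    if partial_whole_rests:
        return Fraction(1)
    return measure_duration


four_two = Fraction(4, 2)  # duration of a 4/2 measure, in whole notes
print(whole_rest_duration(four_two, partial_whole_rests=True))   # 1
print(whole_rest_duration(four_two, partial_whole_rests=False))  # 2
```

In a 4/2 measure the two interpretations differ by a factor of two, which is why the choice has to be driven by the user for the sheet at hand.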

We are converging, I should be able to push validated modifications shortly. (Do you have samples for long rest symbol? I still need a few of those)

agorji commented 3 years ago

Awesome, thank you very much. Then just a flag for interpreting the duration of whole rests should be enough. Although I myself will probably always set it to have duration 1 instead of the whole measure. I can't think of a situation in which a whole rest means the whole duration of a measure but not 1, but the other way around happens in the 4:2 timing system.

Edit: You probably didn't just mean the duration, but implicitly the horizontal freedom of a "non-measure-long" rest as well. Then the existence of the flag totally makes sense.

> (Do you have samples for long rest symbol? I still need a few of those)

It was extremely hard to find sheets containing long rests. I was about to print a fake sheet and rescan it when I came across a couple of them! I will send them to you by email.

hbitteur commented 3 years ago

@agorji I have included many rest samples from the scores you provided, and retrained the glyph classifier.

I also provided a way to interactively align more than 2 chords in the same time slot: simply use a lasso around the target column of chords, and open the popup Chords menu.

You will notice the new "partialWholeRests" flag is implemented. In your initial example (O lux beata Trinitas), you can set this flag at book level (and unset it just for sheet 3). I had to manually change the common_cut time signatures to explicit 4/2 signatures, and then proceed with a few manual corrections.
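The book-level setting with a per-sheet exception can be pictured as a simple override chain (default, then book, then sheet), as mentioned earlier in this thread. The Python sketch below is only an illustration of that idea; the function and variable names are hypothetical and do not reflect how Audiveris stores its parameters.

```python
def effective_flag(default, book_value=None, sheet_value=None):
    """Resolve a flag: a sheet setting overrides the book, which overrides the default."""
    for value in (sheet_value, book_value):
        if value is not None:
            return value
    return default


DEFAULT = False               # the flag is off by default
book_setting = True           # set at book level
sheet_overrides = {3: False}  # unset just for sheet 3

flags = {
    sheet: effective_flag(DEFAULT, book_setting, sheet_overrides.get(sheet))
    for sheet in range(1, 6)  # the 5 sheets of the book
}
print(flags)  # {1: True, 2: True, 3: False, 4: True, 5: True}
```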

Your whole 5-sheet book is now correctly transcribed.

You can pull again and build from development branch.

agorji commented 3 years ago

Awesome, you did a great job! The transcription of both sheets looks good now.

Thank you very much.

agorji commented 3 years ago

I have noticed a weird bug in the conversion of some measures with breve rests to the xml file. Here, as an example, notice the ignored notes in the second staff of measure 11 in O lux beata Trinitas (Alberti, Johann Friedrich).pdf:

Screen Shot 2021-04-26 at 8 50 39 PM
Screen Shot 2021-04-26 at 9 22 36 PM

Plus, "display-step" and "display-octave" of measure-long rests are not stored in the xml file. This makes sense for monophonic staves but limits the representation of polyphonic staves (e.g. the staves of the sheet above). So I was wondering whether measure-long rests could have those two elements, just like shorter rests.

hbitteur commented 3 years ago

Here is what I could see in MuseScore (after export from the latest Audiveris build): image

Obviously, all the notes are there and correct (just a misalignment in measure 9, second staff, second slot, to be corrected).

For this score, I had set the book parameter "Support for partial whole rests" to true. This might explain our differences, could you check you did that too?

I will check if display-step and display-octave are correctly handled for MusicXML export of measure-long rests. I guess the answer is no, and code is to be fixed accordingly.

hbitteur commented 3 years ago

Code fixed in latest commit 2959eaa55dce3b33b096aab6af7601521087bc63

In Audiveris: image

See rests at end of measure 10 and beginning of measure 11 in MusicXML file:

image

Then in MuseScore (note MuseScore rendering does not really "obey" these rest display parameters...): image

agorji commented 3 years ago

Thanks for adding display-step and display-octave. This is fixed now.

But the MuseScore problem is still there on my side. I was wondering whether you corrected anything manually or changed any other option for your export, because you even have some recognized heads that are not recognized in my transcription (e.g. staff 1 of measures 9 and 12). "Support for partial whole rests" was set to true in my previous export and no other options were set manually.

Screen Shot 2021-04-26 at 8 50 39 PM

hbitteur commented 3 years ago

Hi Ali, Yes, some manual corrections were needed. Here is everything I did:

  1. Force reprocessing to PAGE step. I get all 16 measures with pink background.

  2. Measure 1: Remove the two common cut time sigs.

  3. Replace them by custom 4/2 time sig. 7 measures in pink. Let's focus on system S3 (measures 9-12): image

  4. Measure 9: Insert missing stem + half head. Measure now OK.

  5. Measure 11: (make sure chord IDs are displayed) image
     Right-click + Measure 11 + Dump stack voices gives:

        MeasureStack#11
            |0       |1       |2       |3       |7/2
        --- P1
        V 1 |Ch#2929 |Ch#2931 |Ch#2930 |Ch#2921 |7/2
        V 2 |Ch#2920 |........|........|........|1/2
        V 5 |Ch#3407 ===========================|M
        V 6 |........|........|........|........|0

     Right-click + Measure 11 + Reprocess rhythm gives these warnings:

        No timeOffset for HeadChordInter{#2937(0.790/0.790) stf:6 slot#2 dur:1/2}
        Measure{#11P1} Voice{#1} too long 3/2

  6. The "No timeOffset..." on a first chord (2937) is typical of a chord which should be in the first time slot but is not. So, grab all the 3 chords that should be in first time slot and use Right-click + Chords + Same time slot for all. We now get only:

        Measure{#11P1} Voice{#1} too long 3/2

  7. Voice 1 is too long because it begins with a sequence of 3 whole heads. The second whole head should not be within voice 1 but within voice 2 (like chord 2920, the half head). So, grab chords 2920 (half head) and 2931 (whole head) and use Right-click + Chords + Same voice. Measure OK now.

  8. Measure 12: A whole head has not been recognized. So assign it the right shape or drag n' drop a whole head. The reprocessing gives:

        No timeOffset for HeadChordInter{#3420(1.000) slot#2 dur:1}
        No timeOffset for HeadChordInter{#2922(0.818/0.818) stf:5 slot#2 dur:1/2}

  9. So, grab all 4 chords of first slot, then Right-click + Chords + Same time slot for all. Measure OK now.

  10. To be complete, there is a slur between measures 10 & 11 from chord 2919 (half) to chord 2929 (whole) which should be a tie from chord 2919 (half) to chord 2920 (half). So, drag from slur to chord 2920, and slur turns to a tie with green color of voice 2.

Et voilà !
![image](https://user-images.githubusercontent.com/22053634/116393114-885b3d00-a821-11eb-9d61-36b691a16647.png)
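As a side note, the "too long 3/2" warning for measure 11 can be reproduced by simple arithmetic on the voice dump. The Python sketch below is only a worked example: the helper name is hypothetical, and the 1/2 duration of chord 2921 is inferred from the dump (it starts at offset 3 and the voice ends at 7/2), not stated explicitly.

```python
from fractions import Fraction


def voice_excess(durations, measure_duration):
    """How much a voice exceeds the measure duration (0 if it fits)."""
    total = sum(durations, Fraction(0))
    return max(total - measure_duration, Fraction(0))


four_two = Fraction(4, 2)  # measure duration once the 4/2 time sig is in place

# Voice 1 as dumped: chords 2929, 2931, 2930 (whole heads), then 2921 (inferred half).
voice_1 = [Fraction(1), Fraction(1), Fraction(1), Fraction(1, 2)]

print(sum(voice_1, Fraction(0)))        # 7/2, the end time shown in the dump
print(voice_excess(voice_1, four_two))  # 3/2, the value in the warning
```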

hbitteur commented 3 years ago

Is your input score "O lux beata Trinitas (Alberti, Johann Friedrich).pdf" publicly available? I mean, could I use it in the Audiveris handbook as a typical example that users could exercise on their own?

agorji commented 3 years ago

> Et voilà ! image

Thank you very much Hervé. I really appreciate the time you put into describing each step. It made me notice some features that I hadn't been aware of! I confirm that I got exactly the same fully convertible transcription as you did.

But I still don't understand why the unedited version generates incomplete xml exports. In measure 11, the model was able to recognize all the symbols correctly but didn't add them to the xml export. I thought all the extracted information, regardless of its validity, should be considered for the export. Isn't that so?

I still think there is a bug with measure-long rests that causes this issue, and that it is not the expected behavior.

> Is your input score "O lux beata Trinitas (Alberti, Johann Friedrich).pdf" publicly available? I mean, could I use it in the Audiveris handbook as a typical example that users could exercise on their own?

Here is the link to the file: https://imslp.org/wiki/O_lux_beataTrinitas(Alberti%2C_Johann_Friedrich)

So feel free to use it wherever you need!

hbitteur commented 3 years ago

> But I still don't understand why the unedited version generates incomplete xml exports. In measure 11, the model was able to recognize all the symbols correctly but didn't add them to the xml export. I thought all the extracted information, regardless of its validity, should be considered for the export. Isn't that so?

Let's go back to the initial status (uncorrected) of measure 11:

    MeasureStack#11
        |0       |1       |2       |3       |7/2
    --- P1
    V 1 |Ch#2929 |Ch#2931 |Ch#2930 |Ch#2921 |7/2
    V 2 |Ch#2920 |........|........|........|1/2
    V 5 |Ch#3407 ===========================|M
    V 6 |........|........|........|........|0   <=== See this line?

and processing warnings:

    No timeOffset for HeadChordInter{#2937(0.790/0.790) stf:6 slot#2 dur:1/2}   <=== See this line?
    Measure{#11P1} Voice{#1} too long 3/2

In other words, because the starting time of voice 6 could not be determined, the whole voice could not be processed. Whether the target is to display a measure strip as above or to export a MusicXML stream, we need the starting time of each voice! In MusicXML, notes are written voice by voice, and a voice does not necessarily start at measure time offset 0. Hence, no start time value for a voice, no export for this voice (otherwise, which start time value should we pick?)
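The consequence for export can be pictured with the following Python sketch. It is a simplified illustration under stated assumptions, not the Audiveris exporter: the function name and the data layout are hypothetical, and the start offsets are read from the dump above (voices 1, 2 and 5 start at 0, voice 6 has no known start).

```python
from fractions import Fraction


def exportable_voices(voice_starts):
    """Keep only the voices whose start time within the measure is known."""
    return {voice: start for voice, start in voice_starts.items() if start is not None}


# Measure 11 before any manual correction; None means "start time unknown".
starts = {1: Fraction(0), 2: Fraction(0), 5: Fraction(0), 6: None}

kept = exportable_voices(starts)
print(sorted(kept))                 # voices that can be written
print(sorted(set(starts) - set(kept)))  # voices dropped for lack of a start time
```

Once the chords are assigned to the proper time slot, the voice gets a start time and is exported like the others.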

agorji commented 3 years ago

> In MusicXML, notes are written voice by voice, and a voice does not necessarily start at measure time offset 0. Hence, no start time value for a voice, no export for this voice (otherwise, which start time value should we pick?)

Totally makes sense. Thanks for the explanation.