Audiveris / audiveris

Latest generation of Audiveris OMR engine
https://audiveris.github.io/audiveris
GNU Affero General Public License v3.0

incorrect number of voices recognized in choral music #363

Open PHS-wpg opened 4 years ago

PHS-wpg commented 4 years ago

Hi Folks,

I am converting a score into a MusicXML file. The first problem is that the software sees 3 voices and the piano where there are 4 voices and a piano.

This is probably because the Tenors and basses are sharing the bass clef staff, and the notes are sung in unison at the beginning of the piece, and the voices do not split until later.

There is a simple indication, in two places, of the number of sung parts that might be useful in solving the issue:

1. Under the title of the song are the words "for S.A.T.B."
2. Before the first note is sung, the words SOPRANO ALTO appear at the beginning of the treble clef staff, and the words TENOR BASS appear at the beginning of the bass clef staff.

I noticed that the OCR of words is done after the musical character recognition, which may well be to help the lyrics line up with the notes. However, you may want to run the OCR of words twice: a first pass to see if there is any indication of the number of voices, and a second pass for notations and lyrics.

I am attaching a sample of the page for you to play with: bridge over troubled waters page 1.pdf (https://github.com/Audiveris/audiveris/files/4290895/bridge.over.troubled.waters.page.1.pdf)
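To make the first-pass idea concrete, here is a minimal sketch in Python of how OCR'ed text could be searched for a hint about the number of sung parts. It is not Audiveris code: the function name `implied_voice_count` and both patterns are assumptions made for illustration, and it presumes the OCR output is clean enough to contain strings like "S.A.T.B." or "SOPRANO ALTO".

```python
import re

# Hypothetical helper, not part of Audiveris.
# Abbreviations such as "SATB", "S.A.T.B." or "SSAATTBB": letters must appear
# in S, A, T, B order, each optionally followed by a dot.
ABBREV = re.compile(r"(?:S\.?)*(?:A\.?)*(?:T\.?)*(?:B\.?)*")

# Full part names, singular or plural, in any letter case.
NAMES = re.compile(r"\b(soprano|alto|tenor|bass)(?:es|s)?\b", re.IGNORECASE)


def implied_voice_count(ocr_text):
    """Return the number of sung parts implied by the text, or None."""
    # First look for an abbreviation token like "S.A.T.B."
    for token in ocr_text.split():
        token = token.strip("\"'(),;:")
        letters = token.replace(".", "")
        # Require at least two letters so a lone "A" or "B" is ignored.
        # Short upper-case words such as "AT" can still be false positives.
        if len(letters) >= 2 and ABBREV.fullmatch(token):
            return len(letters)
    # Otherwise count the distinct part names found in the text.
    names = {match.group(1).lower() for match in NAMES.finditer(ocr_text)}
    return len(names) or None


print(implied_voice_count('for S.A.T.B. '))  # prints 4
```

The result could then be compared with the number of voices the recognition step found, and a mismatch reported to the user rather than silently trusted.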

MusicForStringsPercussionCeleste commented 4 years ago

Is there any way to draw in the stems on the T and B unison? If both up and down stems are present will the current build recognize 2 voices in unison?

Best wishes, RG


PHS-wpg commented 4 years ago

Interesting idea, RG. I tried adding the notations with a pencil, but it turned out not to be dark enough, so I could use a calligraphy pen instead. I could also fiddle with images of the music in a graphics program, but that kind of defeats the point of MCR.

Are you familiar with the code itself?

MusicForStringsPercussionCeleste commented 4 years ago

I am not familiar with the code. I am a music teacher and conductor; my limits are Finale, Sibelius, and troubleshooting the XML they export.

If we are really striving for OCR, then handwriting is a reasonable fix for anything as long as the handwritten lines match the contrast and color density of the rest of the score, and are in the limits of the expected shapes, right?

Thank you all for the interesting work on this problem. RG



hbitteur commented 4 years ago

What is a voice?

I'm a poor musician, former guitarist, but when it's time to implement an OMR, you have to make clear decisions and stick to them:

We could argue that in a staff dedicated to vocals, one can only sing one note at a time, whereas some instruments, like piano or guitar, can produce several notes at the same time, and thus several "voices".

In your example, the tenor/bass staff contains only one voice. The fact that this single voice is sung identically by 2 persons, if not more, is beyond the scope of the OMR software. And it cannot seriously rely on the part name text that is often found on the left side of a part, because OCR'ed texts are not reliable enough.

Bacchushlg commented 4 years ago

I agree, Hervé, although I have the same problem rather often. A good notation design should consistently use different stem directions for the 2 voices in a notation line (as is done in the last measure). Additionally, there are some textual indications like "tutti" or "solo for Altos". All these can only be corrected manually, or by a real AI machine ;-)).

PHS-wpg commented 4 years ago

Hi hbitteur,

When I refer to a voice, I do mean that quite literally: a choir voice.

If you look at a typical choral score, it will have staves for the two hands of the piano and a number of sung parts, which fit the description of Soprano, Alto, Tenor, Bass. Frequently, composers will write a score whose parts contain two or more sub-parts. Basses are often written as Baritones and Basses; the other parts are treated the same way, and are sometimes split into more than two parts.

Traditionally, musicians have made an effort to save paper, which is why the Coda exists for example. Another odd paper-saving method is to write two parts on the same staff with tails that point upwards and downwards from a note head of equal duration. Yet another is to place three note heads on one stem, so three singers or groups of them would sing one of the three notes, because no one singer can sing three different notes at the same time to make a chord.

I am attaching a few opening bars from a piece. It was written for sopranos, altos and basses to sing; I quickly added the tails for the tenors and the half rest.

As you can see the tail of the note on the Soprano and Alto parts points in the same direction, and represents 2 different notes for different groups of singers.

The way I see it the point of OMR is to transport those paper scores into an electronic form, which suggests that the OMR has to understand the odd traditions used by musicians who were saving paper.

sample of multiple head music

Bacchushlg commented 4 years ago

Let's think about a compromise: again a new flag, one that enables splitting for a notation line. A user could then mark a notation line to force the separation of chords into 2 voices (more will become really complex, although I have some of this kind, too). In such a case, the notation should consistently be interpreted as 2 voices over the complete line, meaning that a single note will be duplicated for both voices (including the rests!). It should not be a global flag for a complete score, because the piano tracks are mostly perfectly interpreted now, and choir scores sometimes include piano notation, too.
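As a rough illustration of what such a per-line split could do, here is a minimal Python sketch. It is not Audiveris code: the function name `split_into_two_voices` and the data model (a chord as a list of MIDI pitches, an empty list for a rest) are assumptions made for the example, and durations are ignored entirely.

```python
from typing import List, Optional, Tuple

# Assumed data model: a chord is the list of MIDI pitches sharing one stem,
# and an empty list stands for a rest.
Chord = List[int]
Voice = List[Optional[int]]


def split_into_two_voices(line: List[Chord]) -> Tuple[Voice, Voice]:
    """Split one notation line into an upper and a lower voice.

    A single note is duplicated into both voices and a rest becomes a rest
    (None) in both voices, so the two voices stay aligned over the whole line.
    """
    upper: Voice = []
    lower: Voice = []
    for chord in line:
        if not chord:
            # Rest: both voices are silent.
            upper.append(None)
            lower.append(None)
        else:
            # Highest head goes to the upper voice, lowest to the lower one.
            # With a single head, both voices receive the same pitch.
            upper.append(max(chord))
            lower.append(min(chord))
    return upper, lower


upper, lower = split_into_two_voices([[60], [60, 64], []])
print(upper)  # prints [60, 64, None]
print(lower)  # prints [60, 60, None]
```

Chords with three or more heads would need an extra rule for the middle heads, which is where the "really complex" cases begin.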

PHS-wpg commented 4 years ago

That's an interesting idea. I assume you are referring to a software flag set before the OMR starts to interpret the scan of the music, but do correct me if I am wrong.

I am attaching a few bars from a song that show how complicated the timing can get between 4 parts on two staves: unequal timing of parts

PHS-wpg commented 3 years ago

Hi All,

I have been using Audiveris quite a lot for the last month, then making adjustments in MuseScore. It's been some months since I was last here, and with more experience of Audiveris, I now realise why it does some of the things it does. One of the realities is that the OCR of the text is less than perfect, and I have an idea on how to drastically change that. Ask me if you're curious.

While I applaud the OMR for its astonishing accuracy and its ability to get it right most of the time, it occurs to me that maybe we are trying too hard to make it do things that it is ill-suited to do, like my original issue of "recognizing the number of vocal parts". Why not start each OMR session with a set of questions that establish the assumptions instead of forcing the computer to work them out?

I suspect any of us using OMR can easily look at a piece of music and read how many unique parts there are for a staff by observing the largest number of note-heads on a stem. For example, we could fill in a box before the OMR is launched that tells Audiveris some vital things about how the piece is written:

e.g.

- What form of music is this: Orchestral / Choral / etc. (in a drop-down menu)
- number of Soprano vocal lines = 2
- number of Alto vocal lines = 1
- number of Tenor vocal lines = 2
- number of Bass vocal lines = 1
- number of hands on the piano part = 2
- number of accompanying instruments = 1

Once the questions are answered, the OMR does what it does best, and puts notes on a staff, and it knows how many staffs there are going to be before it starts.
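To show the kind of information such a form would capture, here is a minimal Python sketch. It is purely hypothetical: Audiveris has no `ScoreAssumptions` class that I know of, and the field names simply mirror the questions listed above.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ScoreAssumptions:
    """Hypothetical answers a user would give before launching the OMR."""
    form: str = "Choral"
    # Number of vocal lines per part name, e.g. {"Soprano": 2, "Alto": 1}.
    vocal_lines: Dict[str, int] = field(default_factory=dict)
    piano_hands: int = 2
    accompanying_instruments: int = 1

    def expected_vocal_parts(self) -> int:
        """Total number of sung lines the user says the score contains."""
        return sum(self.vocal_lines.values())

    def matches(self, recognized_vocal_parts: int) -> bool:
        """True if the recognition result agrees with the user's answers."""
        return recognized_vocal_parts == self.expected_vocal_parts()


answers = ScoreAssumptions(
    form="Choral",
    vocal_lines={"Soprano": 2, "Alto": 1, "Tenor": 2, "Bass": 1},
)
print(answers.expected_vocal_parts())  # prints 6
```

With the example answers above, a recognition result that reports only 3 vocal parts would fail the `matches` check and could be flagged for review.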

Would such a feature improve the accuracy of the OMR output ?

MusicForStringsPercussionCeleste commented 3 years ago

Yes

Best wishes, Rick

Live slow, sail fast.



Bacchushlg commented 3 years ago

I had a different idea in the past, which ends up as almost the same approach: split the transcription into 2 phases.

After structure analysis, a couple of frames could show the detected systems, note lines, lyrics ranges, chords, etc. The frames might overlap, of course; they just show the ranges where these types of elements will be looked for later on.

Pausing between the steps might be controlled by a flag, or even happen automatically in case the analysis comes to the conclusion that the structure is ambiguous.

hbitteur commented 2 years ago

I don't know what to do...