DDMAL / Rodan

:dragon_face: A web-based workflow engine.
https://rodan2.simssa.ca/

[Text_alignment / OCR] Syllables not being picked up for MS73 #1215

Open JoyfulGen opened 1 month ago

JoyfulGen commented 1 month ago

UPDATE: This might be in part due to a user mistake (here we are again), so please hold!

I've started running some end-to-end (e2e) OMR workflows with MS73 folios, and the text part of the process doesn't seem to be working. Normally, the original image is separated into layers, and the text layer gets sent to the Text_Alignment job, which uses OCR to roughly locate the syllables and then matches them with the correct text that we provide. In Neon, the syllables will look like this (this is a Salzinnes folio):

[Image: Salzie good syllables example]

However, this process doesn't seem to be working for MS73. So far, @kyrieb-ekat got this (enjoy the numbers):

[Image: Kyrie text nuggets]

And I got no syllables at all:

[Image: MS73 054 no syllables]

Because the syllable text is directly related to how the neumes are grouped into syllables, these errors result in the syllable groupings being completely wrong, which lengthens the correction time quite a bit.

I ran an e2e OMR workflow with an Einsie folio and the syllables were perfect, so this seems to be an MS73-specific problem. Could it simply be that the trained model we've been using for Salzinnes and Einsie doesn't work for MS73? In the Text_Alignment job, the model is built directly into the job, so I don't think this is something that I can change.
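
For anyone following along, here is a rough sketch of the general idea described above: OCR gives approximate characters with positions, and those are matched against the text we provide so that each syllable gets a bounding box. This is only an illustration using Python's standard `difflib`, not the actual Text_Alignment implementation. The input format and the names `align_syllables` and `coverage` are assumptions made up for this example.

```python
from difflib import SequenceMatcher


def align_syllables(ocr_chars, transcript_syllables):
    """Give each transcript syllable a box built from matching OCR characters.

    ocr_chars: list of (char, (x0, y0, x1, y1)) in reading order
               (hypothetical format, not Rodan's).
    transcript_syllables: list of syllable strings from the provided text.
    Returns a list of (syllable, box or None).
    """
    ocr_text = "".join(char for char, _ in ocr_chars)
    transcript = "".join(transcript_syllables)
    matcher = SequenceMatcher(None, transcript, ocr_text, autojunk=False)

    # Map transcript character positions to OCR character positions
    # wherever the two strings agree.
    transcript_to_ocr = {}
    for a, b, size in matcher.get_matching_blocks():
        for k in range(size):
            transcript_to_ocr[a + k] = b + k

    aligned = []
    pos = 0
    for syllable in transcript_syllables:
        span = range(pos, pos + len(syllable))
        matched = [transcript_to_ocr[i] for i in span if i in transcript_to_ocr]
        pos += len(syllable)
        if not matched:
            # No OCR character supports this syllable.
            aligned.append((syllable, None))
            continue
        boxes = [ocr_chars[j][1] for j in matched]
        box = (
            min(b[0] for b in boxes),
            min(b[1] for b in boxes),
            max(b[2] for b in boxes),
            max(b[3] for b in boxes),
        )
        aligned.append((syllable, box))
    return aligned


def coverage(aligned):
    """Fraction of syllables that received a box (0.0 for an empty list)."""
    if not aligned:
        return 0.0
    return sum(1 for _, box in aligned if box is not None) / len(aligned)
```

In a sketch like this, a low `coverage` value would mean the OCR output barely resembles the provided text, which could come from an OCR model that does not suit the script or from the job receiving the wrong input layer.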

JoyfulGen commented 1 month ago

ANOTHER UPDATE: This was indeed in part due to user error (you can always count on me). I mistakenly assigned the wrong layer output to the input of the Text_Alignment job, which is why my syllables came up completely empty.

However! Kyrie did not make that mistake, so her result is accurate. I tried running a couple more workflows after fixing my mistake and I'm getting something similar. There are syllables, but they are far too few and those that are there are not correct. I'm not sure at the moment what this is due to; it's possible that as our glyph classification training data improves, the syllable problem will lessen. I'll put this issue on hold for now until we know more.

kyrieb-ekat commented 1 month ago

I'm also going to retrace some of the previous steps done on this and test a few more pages of MS73. I'll also look into OCRopus to see what the reasoning was behind the OCR models used in text_alignment.