Open JoyfulGen opened 1 year ago
The test file used in the CI tests for MEI encoding has divLines within syllables. Here is a snippet from the test file:
<syllable xml:id="m-7f2139be-e50f-42c7-8147-c5bb926657cc">
<syl facs="#m-5d0872d6-06e0-4500-8b4f-303277f6bc6d" xml:id="m-00e7c517-8cd5-45d3-9c67-5e68ec55f54a">do</syl>
<neume xml:id="m-4100596a-55d7-4448-8891-7206115e8897">
<nc facs="#m-d8f5f9f1-663e-4908-9d11-47ec1ffed916" oct="2" pname="g" xml:id="m-20606436-2645-4562-9c14-da6871b29283"/>
<nc facs="#m-79e11d2f-bebc-41e3-9401-60fb03a6958f" oct="2" pname="a" xml:id="m-f15e06f2-e1ea-43c9-bcf7-904335d5ad58"/>
<nc facs="#m-de489579-5e8f-4867-8e78-09f13c0ba5cf" oct="2" pname="a" tilt="s" xml:id="m-95cd5255-52ee-41aa-8869-d3480e18c812"/>
</neume>
<divLine facs="#m-56108990-7810-412f-827d-ba1a654e5ef1" form="maxima" xml:id="m-88514859-c74d-4c2b-bbd1-727a1007f43e"/>
<neume xml:id="m-a0c8cbcf-f5bc-4ac0-91b3-3e48363a46ce">
<nc facs="#m-ed266e5e-68b0-4912-9992-0d98bfc81422" oct="2" pname="a" tilt="s" xml:id="m-ffba86ac-46ea-45fd-8a31-a84dff2ded32"/>
</neume>
</syllable>
I recently updated the test file to account for the multi-column job changes. To make sure I didn't introduce the bug, I looked at the test file from earlier this summer and found this snippet:
<syllable xml:id="m-674a87bf-c478-4b25-b171-b0c59f1085f4">
<syl facs="#m-5e41ad1c-0199-4f9d-a345-00e9fc8ca974" xml:id="m-3893f402-0c84-4270-b8b4-c9d61b354012">do</syl>
<neume xml:id="m-65c26478-a9ad-4426-b66a-74bfe42d0eb4">
<nc facs="#m-89a4b857-9f48-490e-9a61-0e0365c4124b" oct="2" pname="g" xml:id="m-0fc65eac-3f4f-4f6f-8c02-3b189c543724"/>
<nc facs="#m-ff3abeea-620d-4113-9a63-91b0df71fc1b" oct="2" pname="a" xml:id="m-486d0729-7f91-46bf-ab68-cc256170cc0f"/>
<nc facs="#m-3554166e-b043-485a-aec3-5407f8bdf3f2" oct="2" pname="g" tilt="s" xml:id="m-1f7e1204-a33b-4376-9ccd-eabe40ab57f3"/>
</neume>
<divLine facs="#m-37dc8e5a-8abf-4efb-ac5e-93a8aecc289c" form="maxima" xml:id="m-cf7fbfbf-b27e-4476-8700-ef1309c9050a"/>
<neume xml:id="m-8bcf5bd2-d160-414d-9f87-4424c59ad5be">
<nc facs="#m-e543231d-b896-419c-a1bf-6625feebd765" oct="2" pname="g" tilt="s" xml:id="m-c224faff-c713-4251-bb0b-198cad816bd9"/>
</neume>
</syllable>
Going back to before this summer, this behaviour is not present in the test file.
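For comparing test files from different dates, a small script can list every divLine that sits directly inside a syllable. This is only a minimal sketch: the function names are hypothetical, the element names (`syllable`, `divLine`) are taken from the snippets above, and tags are matched by local name on the assumption that the file may or may not declare the MEI namespace.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:id attribute under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def local_name(element):
    """Return the tag name without any '{namespace}' prefix."""
    return element.tag.rsplit("}", 1)[-1]


def div_lines_in_syllables(root):
    """Return (syllable_id, divLine_id) pairs for each divLine that is a direct child of a syllable."""
    found = []
    for element in root.iter():
        if local_name(element) != "syllable":
            continue
        for child in element:
            if local_name(child) == "divLine":
                found.append((element.get(XML_ID), child.get(XML_ID)))
    return found


# Shortened, made-up fragment with the same shape as the snippets above.
sample = """
<layer>
  <syllable xml:id="s1">
    <syl>do</syl>
    <neume><nc oct="2" pname="g"/></neume>
    <divLine form="maxima" xml:id="d1"/>
    <neume><nc oct="2" pname="a"/></neume>
  </syllable>
</layer>
"""
print(div_lines_in_syllables(ET.fromstring(sample)))
```

For a file on disk, `ET.parse(path).getroot()` can be passed in instead of the parsed string.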
I remember issue #634 however we changed this at the beginning of the summer as a result of this comment. Because of this we reverted it to clefs and accidentals being allowed in the syllables. Is this correct?
Also there were some changes made to MEI encoding earlier this summer that are probably the cause of this but I was just wondering, should these be considered empty syllables if they all have the follows tag?
I'm not sure what the rule is in this case. From what I understand, a syllable is empty when it doesn't have a neume in it, but since these syllables are all linked by precedes and follows, are they still considered empty? We can have a system that will take these divLines, but I just wanted to double check.
I remember issue https://github.com/DDMAL/Rodan/issues/634 however we changed this at the beginning of the summer as a result of this https://github.com/DDMAL/Rodan/issues/784#issuecomment-1363275430. Because of this we reverted it to clefs and accidentals being allowed in the syllables. Is this correct?
Clef, accidentals, and divisios can indeed be in syllables, but from a user's perspective it currently takes much less time to insert them myself as needed than to remove the ones already in a syllable in the fresh-outta-OMR file. I think Ich eventually wants a more long-term solution than just never inserting those elements in syllables though, so I'll stop complaining.
Also there were some changes made to MEI encoding earlier this summer that are probably the cause of this but I was just wondering, should these be considered empty syllables if they all have the follows tag?
I would say yes, but @yinanazhou can confirm. From what I know, a syllable has to have at least one neume with a pitch, so a single divisio can't be a syllable. Also, when correcting on Neon, having a syllable that follows or precedes an empty syllable creates great potential for chaos. It'd be great to not have them.
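To see which syllables are linked to an empty one, something like the following could be run on an MEI file. It is a rough sketch under stated assumptions: a syllable counts as empty when it has no `nc` descendant, the `precedes` and `follows` attributes hold a `#`-prefixed `xml:id` reference, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:id attribute under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def local_name(element):
    """Return the tag name without any '{namespace}' prefix."""
    return element.tag.rsplit("}", 1)[-1]


def links_to_empty_syllables(root):
    """Return (syllable_id, attribute, target_id) for each precedes/follows link that points at an empty syllable."""
    syllables = [e for e in root.iter() if local_name(e) == "syllable"]
    # Assumption: "empty" means the syllable has no nc (neume component) descendant.
    empty_ids = {
        s.get(XML_ID)
        for s in syllables
        if not any(local_name(d) == "nc" for d in s.iter())
    }
    links = []
    for s in syllables:
        for attr in ("precedes", "follows"):
            # Assumption: the attribute value looks like "#some-id".
            target = s.get(attr, "").lstrip("#")
            if target in empty_ids:
                links.append((s.get(XML_ID), attr, target))
    return links


# Made-up fragment: s1 has a pitched neume and points at s2, which holds only a divLine.
sample = """
<layer>
  <syllable xml:id="s1" precedes="#s2">
    <syl>do</syl>
    <neume><nc oct="2" pname="g"/></neume>
  </syllable>
  <syllable xml:id="s2" follows="#s1">
    <divLine form="maxima" xml:id="d1"/>
  </syllable>
</layer>
"""
print(links_to_empty_syllables(ET.fromstring(sample)))
```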
@JoyfulGen I'm a little confused now, should I continue to work on this issue?
@MrMondrian my bad! I think part of the problem is there are two issues in one:
Divlines inserted into syllables: From what I understand, this is working as it should. If there's a divline between two neumes that are part of the same syllable, mei_encoding inserts that divline into the syllable. This is correct. It looks incorrect if the syllables have been assigned wrong, but that's not the issue here. I flagged this in the first place in case it was unintentional, but since it was not, then it's all good. This issue doesn't need work.
Empty syllables: This one does need to be fixed, if possible. Basically, syllables should always contain at least one neume component. A syllable shouldn't be only a divline/clef/accidental, even if it precedes or follows another syllable. But I'm finding syllables like this in the new files and they're creating chaos! It would be great if empty syllables (i.e. syllables that don't contain at least one neume component) did not exist.
@yinanazhou @sabrina0822 please tell me if I got something wrong.
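The rule described above (a syllable must contain at least one neume component) can be checked mechanically. The following is a minimal sketch, not the actual mei_encoding logic: `find_empty_syllables` is a hypothetical name, and the `syllable` and `nc` element names are taken from the snippets earlier in the thread.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:id attribute under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def local_name(element):
    """Return the tag name without any '{namespace}' prefix."""
    return element.tag.rsplit("}", 1)[-1]


def find_empty_syllables(root):
    """Return the xml:id of every syllable that has no nc (neume component) descendant."""
    empty = []
    for element in root.iter():
        if local_name(element) != "syllable":
            continue
        if not any(local_name(d) == "nc" for d in element.iter()):
            empty.append(element.get(XML_ID))
    return empty


# Made-up fragment: one normal syllable, one holding only a divLine, one holding only a clef.
sample = """
<layer>
  <syllable xml:id="s1">
    <syl>do</syl>
    <neume><nc oct="2" pname="g"/></neume>
  </syllable>
  <syllable xml:id="s2">
    <divLine form="maxima" xml:id="d1"/>
  </syllable>
  <syllable xml:id="s3">
    <clef shape="C" line="3"/>
  </syllable>
</layer>
"""
print(find_empty_syllables(ET.fromstring(sample)))
```

A check like this could run on the CI test file's output so that empty syllables are caught before the file reaches Neon.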
Okay, I'll set aside the divLine issue for now and start on the empty syllable issue. I don't think it makes sense for the solution to the divLine issue to be removing them from syllables. If mei encoding is given good input, it should produce good output. As such, I believe the best way to approach that issue is to work on improving its performance.
Could you add your hpf and text alignment inputs so I can test my changes? @JoyfulGen
Earlier this week, I processed Einsie folio 129r on rodan_staging to test pitch finding accuracy. Pitch finding was much improved (hurray!), but two old issues that were closed a while ago popped up again, namely:
https://github.com/DDMAL/Rodan/issues/629: the "empty syllable" problem, where a syllable has no neume. Neon now alerts the user to this problem and there's a handy shortcut to deleting the empty syllables in one click, but they still tend to create havoc. I hadn't seen any empty syllables since last summer.
https://github.com/DDMAL/Rodan/issues/634: the conclusion to this issue was that divLines and clefs would always be pulled out of syllables. In the few instances where they should actually be in a syllable, the user would take care of it. However, in the new 129r a bunch of divLines were inserted into syllables when I first opened the file. This also had not happened since last summer.
The file, for reference: 129r new.mei.zip
The empty syllables have the following IDs:
m-1561c062-370b-4fae-979f-9a404bf0af92
m-39864741-54cd-4c78-8aa1-ab576dedb0cb
m-83943032-4652-4e45-92dd-3ad03bee8a03
m-84501ecc-9379-43a8-97ae-bb1e8bcb5b5d
And this is the old 129r file, ie the one with not-so-good pitch finding, but no empty syllables or divLines inserted into syllables: 129r.mei.zip