DDMAL / volpiano-display-utilities

MIT License
0 stars 0 forks source link

Handle sometime consonant group "nc" #43

Closed dchiller closed 9 months ago

dchiller commented 9 months ago

Closes #42.

It turns out that "nc" is only sometimes a consonant group. It is treated as a group in sequences of three or more consonants ("sanctus" -> "sanc-tus") but not when on its own ("principem" -> "prin-ci-pem")

annamorphism commented 9 months ago

this sounds right--I suppose basically it's nasalizing the vowel rather than acting as a true consonant, so sanctus=sãctus=sãc-tus but pri˜-ci-pem per the usual syllable division rules. Are there other groups that work this way (re-demp-tor but im-pe-ra-tor?), and are they already accounted for?

jacobdgm commented 9 months ago

checking this locally,

>>> from volpiano_display_utilities.cantus_text_syllabification import syllabify_text
>>> syllabify_text("redemptor")
[SyllabifiedSection: [['re-', 'demp-', 'tor']]]
>>> syllabify_text("imperator")
[SyllabifiedSection: [['i-', 'mpe-', 'ra-', 'tor']]]

"Imperator" doesn't conform with @annamorphism's syllabification. Suggesting you address this too before merging

jacobdgm commented 9 months ago

Is this a liquid-consonant–plus–consonant thing? Are there l<consonant> clusters that should be syllabified l-<consonant>, or similarly for r<consonant>?

dchiller commented 9 months ago

redemptor vs imperator is a good catch...thanks @annamorphism and @jacobdgm for checking.

dchiller commented 9 months ago

Is this a liquid-consonant–plus–consonant thing? Are there l<consonant> clusters that should be syllabified l-<consonant>, or similarly for r<consonant>?

There's nothing currently in the logic that works with liquid consonant + consonant...I'm also not immediately thinking of any examples. Do you have one?

dchiller commented 9 months ago

I'm also not immediately thinking of any examples.

Ah...so liquid consonant + h I have, but in this case we already don't treat the 'h' as a consonant.

dchiller commented 9 months ago

@annamorphism I'm just noticing one other previous decision I made that I don't know if I stand by...

In the following cases, do you prefer column A or column B or mixed?

Word A B
iusticie iu-sti-ci-e ius-ti-ci-e
fuisti fui-sti fuis-ti
xpistus xpi-stus xpis-tus
maiestatis ma-ie-sta-tis ma-ies-ta-tis
annamorphism commented 9 months ago

I just got out my old linguistics textbook (!) which gives a universal rule: syllables should, all else being equal, divide into open syllables and syllables with onsets (aka "onsets are greedy"). So column A, which creates open syllables followed by the "st" onset, is preferable to column B, which creates closed syllables instead. (Fuisti should also probably be three syllables, but that's a separate issue.)

dchiller commented 9 months ago

So column A, which creates open syllables followed by the "st" onset, is preferable to column B, which creates closed syllables instead.

Ok, thank you! That is what intuitively makes sense to me now...I'm not sure why I did it the column B way earlier. So will add to this PR a fix for that.

Fuisti should also probably be three syllables, but that's a separate issue.

The good news is that this was a typo here and that's already the way it is handled!

dchiller commented 9 months ago

Yes, I was confused by your request. I thought you meant it generally, but then couldn't find anywhere else that I would want to annotate (I added it to the definition of num_ltrs_btw_vow_grps because that's the one where the best case could be made I think).

I just added another one to num_consonants...I could again see one there because it is based off of a defined variable a few lines up, although my natural inclination is that this is uneccessary.

The other two cases you mention are clearly initialized as their appropriate type, so an annotation seems superfluous.

jacobdgm commented 9 months ago

The other two cases you mention are clearly initialized as their appropriate type, so an annotation seems superfluous.

Is the plan that Volpiano Display Utilities be fully mypy compliant? I think I had assumed that this was the plan from the beginning, in which case I don't think that any of the variables can be left unannotated.

Without reference to mypy, I think it's worth annotating all the variables, even the obvious ones, at least within a given function: If I see an unannotated variable in a context where the other variables are annotated, I'm likely to think, "oh, this isn't annotated, so the variable must have been initialized somewhere else", and then go looking for where it was defined. I can see that syl_bound is an integer at the moment it's defined, and I'm pretty confident that it wasn't initialized earlier by scrolling up and finding a docstring a few lines above, but I can't be confident that some string, say, hasn't been assigned to it somewhere later in the function.

dchiller commented 9 months ago

Is the plan that Volpiano Display Utilities be fully mypy compliant? I think I had assumed that this was the plan from the beginning, in which case I don't think that any of the variables can be left unannotated.

It is currently (and was without any of the variable annotations recently added starting with 70bc74e) mypy compliant.

I can't be confident that some string, say, hasn't been assigned to it somewhere later in the function.

I'm not sure I understand this. How would a type annotation ensure that a string hasn't been assigned to it later in the function? Or are you saying if there was a type annotation you would know that a string assignment later in the function (if such an assignment occurred) is wrong.

dchiller commented 9 months ago

I will go ahead and add them...I don't think it is "wrong" if they are there. This is, I think, just different point of personal preference in terms of code readability and clarity (I used mypy to catch lots of errors while actually writing this, but that just requires mypy compliance to be useful). For those purposes, if there is something unclear in the case of those two variables, a comment or better name would be more useful than a type hint. If it clarifies things for you, though, then it's all good.

dchiller commented 9 months ago

Yes, this.

Makes sense. Mypy will alert you to this without the annotation, but if you are looking at it mypy-less, I see how this is useful.

maybe having return annotations for all the functions is sufficient for a less-strict setting, whereas all variables would need to be annotated on a stricter setting.

There are definitely varying levels of strictness (I've started using --strict but haven't taken the time to check whether there are even further "strictness" levels). Again, I don't know that typing all variables would ever be necessary, but there may be some setting that does want that for some reason.