sillabify lines for poetry

francescomambrini commented 7 years ago

I love this library: apart from being awesomely written, it is very useful in plenty of applications and teaching activities! With a slight extension, the syllabify module can be used to teach Greek meter as well. Along with the suggested functionality, I give you the full explanation below. Apologies if the post is very long, but I think that a bit of context might help.

In Greek poetry, syllables are scanned in a continuum called synapheia. In other words, the whole line, not the single word, is the string to syllabify. So for example Eur. Medea 1:

Εἴθ᾽ ὤφελ᾽ Ἀργοῦς μὴ διαπτάσθαι σκάφος

Becomes:

ειθωφελαργοῦςμηδιαπτασθαισκαφος
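A rough sketch of that joining step, in case it helps. The function name `to_synapheia`, the `strip_diacritics` flag and the `ELISION_MARKS` set are all made up for illustration; none of them is part of greek-accentuation. Note that stripping every combining mark also drops diaeresis and iota subscript, and the circumflex that the string above keeps on "γοῦς".

```python
import unicodedata

# Apostrophe-like characters treated as elision marks (an assumption; extend as needed).
ELISION_MARKS = {"'", "\u2019", "\u02BC", "\u1FBD", "\u1FBF"}


def to_synapheia(line, strip_diacritics=True):
    """Join a verse line into one continuous lowercase string (hypothetical helper)."""
    kept = []
    # NFD splits each letter into a base letter followed by its combining marks.
    for ch in unicodedata.normalize("NFD", line):
        if ch in ELISION_MARKS:
            continue  # drop the elision apostrophe
        if unicodedata.combining(ch):
            if not strip_diacritics:
                kept.append(ch)  # keep accents and breathings on request
            continue
        if ch.isalpha():
            kept.append(ch.lower())  # spaces and punctuation are dropped
    return unicodedata.normalize("NFC", "".join(kept))


print(to_synapheia("Εἴθ᾽ ὤφελ᾽ Ἀργοῦς μὴ διαπτάσθαι σκάφος"))
# prints: ειθωφελαργουςμηδιαπτασθαισκαφος
```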

If syllabified with the current method, this long string in synapheia yields:

['ει', 'θω', 'φε', 'λαρ', 'γοῦς', 'μη', 'δι', 'α', 'πτα', 'σθαι', 'σκα', 'φος.']

It should be:

['ει', 'θω', 'φε', 'λαρ', 'γοῦς', 'μη', 'δι', 'απ', 'τασ', 'θαισ', 'κα', 'φος.']

The source of the problem is that the current module works (nicely!) with the ordinary scansion rules for single words: every consonant cluster that can be found at a word's onset (e.g. 'σκα') is grouped together. This doesn't apply in poetic scansion. The following is the list of the valid consonant clusters in poetry:

        "δρ",
        "θλ", "θν", "θρ", "θμ",
        "κλ", "κν", "κρ",
        "πλ", "πν", "πρ",
        "τρ", "τμ", "τν",
        "φλ", "φρ",
        "χλ", "χρ"

By the way, Homer and the Attic tragedians use notoriously different rules for the consonant clusters muta cum liquida ("κλ", "κρ" etc.). Thus, it might be a sound design choice to turn this list into an argument that users may pass to the function... But that's an extra!
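To make the idea concrete, here is a minimal sketch of a syllabifier that takes the cluster list as an argument. It is not the library's implementation: `syllabify_line`, `onset_length` and `base` are hypothetical names, the diphthong list is my own assumption, and the code ignores double consonants (ζ, ξ, ψ), accent-marked hiatus and all the Homeric exceptions. The rule it applies is simply: between two vowel nuclei, the longest suffix of the consonant run that appears in the list opens the next syllable, otherwise only the last consonant does.

```python
import unicodedata

VOWELS = set("αεηιουω")
DIPHTHONGS = {"αι", "ει", "οι", "υι", "αυ", "ευ", "ηυ", "ου"}  # assumed list

# Cluster list as given in this thread.
POETIC = [
    "δρ", "θλ", "θν", "θρ", "θμ", "κλ", "κν", "κρ", "πλ", "πν", "πρ",
    "τρ", "τμ", "τν", "φλ", "φρ", "χλ", "χρ",
]


def base(ch):
    """Lowercase base letter of ch, without diacritics."""
    return unicodedata.normalize("NFD", ch)[0].lower()


def onset_length(run, clusters):
    """How many trailing consonants of run open the next syllable."""
    letters = "".join(base(c) for c in run)
    for k in range(len(letters), 1, -1):
        if letters[-k:] in clusters:
            return k
    return 1


def syllabify_line(text, clusters=POETIC):
    """Syllabify a continuous string (hypothetical sketch)."""
    chars = list(unicodedata.normalize("NFC", text))
    units, i = [], 0  # alternating ("V", nucleus) / ("C", consonant run)
    while i < len(chars):
        ch, nxt = chars[i], chars[i + 1] if i + 1 < len(chars) else ""
        if base(ch) in VOWELS:
            diaeresis = nxt and "\u0308" in unicodedata.normalize("NFD", nxt)
            if nxt and base(ch) + base(nxt) in DIPHTHONGS and not diaeresis:
                units.append(("V", ch + nxt))
                i += 2
                continue
            units.append(("V", ch))
        elif units and units[-1][0] == "C":
            units[-1] = ("C", units[-1][1] + ch)
        else:
            units.append(("C", ch))
        i += 1

    syllables, pending = [], ""  # pending = onset waiting for its vowel
    for idx, (kind, s) in enumerate(units):
        if kind == "V":
            syllables.append(pending + s)
            pending = ""
        elif not syllables:
            pending = s  # consonants before the first vowel
        elif idx == len(units) - 1:
            syllables[-1] += s  # consonants after the last vowel
        else:
            k = onset_length(s, clusters)
            syllables[-1] += s[: len(s) - k]
            pending = s[len(s) - k:]
    if pending:
        syllables.append(pending)
    return syllables


print(syllabify_line("ειθωφελαργοῦςμηδιαπτασθαισκαφος"))
```

With the POETIC list this reproduces the division I gave above for Medea 1 (απ-τασ-θαισ-κα-φος); passing a different list changes only the `onset_length` decision.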

jtauber commented 7 years ago

We could allow passing in the list of consonant clusters as an argument and define a constant for each of the common lists. What should we call each of these sets? BASIC, POETIC, HOMERIC?

jtauber commented 7 years ago

So, in other words, we'd have

BASIC = [
    "βδ", "βλ", "βρ",
    "γλ", "γν", "γρ",
    "δρ",
    "θλ", "θν", "θρ",
    "κλ", "κν", "κρ", "κτ",
    "μν",
    "πλ", "πν", "πρ", "πτ",
    "σβ", "σθ", "σκ", "σμ", "σπ", "στ", "σφ", "σχ", "στρ",
    "τρ",
    "φθ", "φλ", "φρ",
    "χλ", "χρ",
]

POETIC = [
    "δρ",
    "θλ", "θν", "θρ", "θμ",
    "κλ", "κν", "κρ",
    "πλ", "πν", "πρ",
    "τρ", "τμ", "τν",
    "φλ", "φρ",
    "χλ", "χρ",
]

jtauber commented 7 years ago

then syllabify would take an optional arg for consonant clusters, e.g.

from greek_accentuation.syllabify import syllabify, POETIC

syllabify(word)  # same as syllabify(word, BASIC)
syllabify(word, POETIC)

francescomambrini commented 7 years ago

I think that this would work very well! Tomorrow I will double-check the list of clusters allowed in poetry and Homer.

jtauber commented 7 years ago

@francescomambrini were you able to double-check the list of clusters for POETIC and HOMER?

jtauber commented 6 years ago

@francescomambrini any update on this? I think I can proceed just with the POETIC if that list looks right and can add HOMER when you get a chance.

francescomambrini commented 6 years ago

Yes! Homer's prosody is complicated by several historical phenomena, like the preservation of digamma, some anomalous consonant redoubling (e.g. ὥστε λίς > ὥς-τελ-λίς), or etymological syllabification. Those cases are basically impossible to catch in a function. In practice, however, for the most part no consonant clustering is needed in Homer. I think the most straightforward way is to give another option, along with PROSE and POETIC, named NO_CLUSTER (empty list). Then users might deal with the exceptions case by case: this should work with Homer. Maybe there is a more sophisticated way to deal with Homer, but I didn't find it...
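A small sketch of what the empty list would mean in practice. The helper `split_run` is hypothetical (it is not in the library) and it looks only at the consonants between two vowels; NO_CLUSTER is the empty list proposed above and POETIC is the list from earlier in the thread.

```python
POETIC = [
    "δρ", "θλ", "θν", "θρ", "θμ", "κλ", "κν", "κρ", "πλ", "πν", "πρ",
    "τρ", "τμ", "τν", "φλ", "φρ", "χλ", "χρ",
]
NO_CLUSTER = []  # no cluster may open a syllable


def split_run(run, clusters):
    """Split an intervocalic consonant run into (coda, onset)."""
    # Prefer the longest trailing cluster that the list allows.
    for k in range(len(run), 1, -1):
        if run[-k:] in clusters:
            return run[:-k], run[-k:]
    # Otherwise only the last consonant goes with the following vowel.
    return run[:-1], run[-1:]


print(split_run("τρ", POETIC))      # ('', 'τρ')
print(split_run("τρ", NO_CLUSTER))  # ('τ', 'ρ')
```

So with NO_CLUSTER a form like πατρός always divides as πατ-ρος, leaving the first syllable closed, and anything that departs from that would be handled by the user case by case.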

jtauber / greek-accentuation

sillabify lines for poetry #5