I have been reading a bit about phonological theories of onsets in different languages. For lexibank, and in general, we might want a simplified analysis of syllable structure. The key would probably be to use the prosody output, but to couple it with simpler information on consonants and vowels, and to create a branching representation:
- onset
  - pre-initial consonants
  - initial consonants
- offset
  - nucleus (vowel or diphthong)
  - coda
    - primary offset
    - additional offset
- suprasegmental information
It might be useful to create a Syllable class for this purpose. It would be instantiated from a string, which should be a single syllable (not a word).
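A minimal sketch of what such a Syllable class could look like, not an existing lingpy or linse API. Everything here is an assumption made for illustration: the class and attribute names, the space-separated segment input, the tiny `VOWELS` and `TONES` inventories, and the rule that the last onset consonant counts as the initial while everything before it is pre-initial.

```python
from dataclasses import dataclass, field
from typing import List

# Assumption: toy inventories for illustration, far from a complete sound set.
VOWELS = set("aeiouyəɛɔɪʊøœɑæ")
TONES = set("⁰¹²³⁴⁵")


@dataclass
class Syllable:
    """Hypothetical syllable parsed from a string of space-separated segments."""

    text: str
    onset: List[str] = field(init=False)
    nucleus: List[str] = field(init=False)
    coda: List[str] = field(init=False)
    tone: List[str] = field(init=False)

    def __post_init__(self) -> None:
        segments = self.text.split()
        # Suprasegmental information: segments made up only of tone marks.
        self.tone = [s for s in segments if set(s) <= TONES]
        body = [s for s in segments if not set(s) <= TONES]
        vowels = [i for i, s in enumerate(body) if s[0] in VOWELS]
        if not vowels:
            raise ValueError("no vowel found: %r" % self.text)
        first, last = vowels[0], vowels[-1]
        # A single syllable should have one contiguous nucleus.
        if vowels != list(range(first, last + 1)):
            raise ValueError("more than one nucleus: %r" % self.text)
        self.onset = body[:first]
        self.nucleus = body[first:last + 1]
        self.coda = body[last + 1:]

    @property
    def pre_initial(self) -> List[str]:
        # Assumption: all onset consonants except the last one.
        return self.onset[:-1]

    @property
    def initial(self) -> List[str]:
        # Assumption: the last onset consonant.
        return self.onset[-1:]


syllable = Syllable("s p r a ŋ")
print(syllable.onset, syllable.nucleus, syllable.coda)
```

Raising an error when the string contains more than one nucleus enforces the "a syllable, not a word" requirement at construction time.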
A Word class would also be very useful; at least, that is what I realized in experiments with lingpy3 and the new models there.
- a word consists of morphemes; each morpheme is a sequence, like our words in linse (which I'd rename to form or morph), so a word is always a list of lists
- a morpheme can be parsed into syllables; one morpheme can represent one syllable or more
The Word class is in fact lightweight (the Syllable class is not): it provides both the list-of-lists representation and a plain sequence representation, depending on which comparison is done (full cognates or partial cognates).
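The two representations could be sketched as follows. This is only an illustration under assumptions: the class name `Word`, the attribute names, the space-separated segments, and the `+` morpheme separator are all hypothetical and not taken from linse or lingpy.

```python
from typing import List


class Word:
    """Hypothetical word: a list of morphemes, each a list of segments."""

    def __init__(self, text: str, separator: str = "+") -> None:
        # Assumption: segments are space-separated, morphemes are split by "+".
        chunks = (chunk.split() for chunk in text.split(separator))
        # Drop empty morphemes caused by stray separators.
        self.morphemes: List[List[str]] = [chunk for chunk in chunks if chunk]

    @property
    def segments(self) -> List[str]:
        # Flat sequence, e.g. for full-cognate comparison.
        return [seg for morpheme in self.morphemes for seg in morpheme]


word = Word("t a i + j a ŋ")
print(word.morphemes)  # list of lists, e.g. for partial-cognate comparison
print(word.segments)   # plain sequence
```

Keeping the list of lists as the stored form and deriving the flat sequence on demand means the morpheme boundaries are never lost.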
The additional syllable model would allow us to extract CV patterns, i.e., prosodic information, from wordlists in a consistent manner; so far this has only been done by a piece of code I wrote on the fly.
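As a rough sketch of the kind of CV extraction meant here: the function names and the small `VOWELS` set are assumptions, and the consonant/vowel decision is a simple lookup rather than the actual prosody output.

```python
from collections import Counter
from typing import Iterable, List

# Assumption: toy vowel inventory for illustration only.
VOWELS = set("aeiouyəɛɔɪʊøœɑæ")


def cv_pattern(segments: List[str]) -> str:
    """Map each segment to "V" or "C" based on its first character."""
    return "".join("V" if seg[0] in VOWELS else "C" for seg in segments)


def cv_counts(forms: Iterable[str]) -> Counter:
    """Count CV patterns over space-segmented forms of a wordlist."""
    return Counter(cv_pattern(form.split()) for form in forms)


counts = cv_counts(["t a", "m a", "s p r a ŋ", "a ŋ"])
print(counts.most_common())
```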