Open maryewal opened 2 years ago
Okay, we will need to standardize the annotations here: we write pre-nasalization with a superscript n only, since the place of articulation is predictable from the following sound (and the inconsistencies in the data are huge). Also, if there are labiovelars in the data, we'll have to double-check that this works with the orthography profiles, and hope they are properly transcribed here (we write kp, gb, without the tie bar, as we use segmented data). But these features are easy to compute.
That makes 9 features; 10 would be nicer (also for any maps and the like in publications).
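The standardization of pre-nasalization and labial-velars mentioned above could be sketched roughly as follows. This is only a minimal illustration: the function names, the set of superscript nasals, and the choice to strip tie bars at the segment level are all assumptions, not what the orthography profiles actually do.

```python
import unicodedata

# Assumption: these superscript nasals are the variants found in the raw data.
PRENASAL_MARKS = {"\u1d50", "\u1d51", "\u1dae", "\u1daf", "\u207f"}  # ᵐ ᵑ ᶮ ᶯ ⁿ
# Combining tie bars (above and below) that we drop, so k͡p becomes kp.
TIE_BARS = {"\u0361", "\u035c"}
SUPERSCRIPT_N = "\u207f"


def normalize_segment(segment):
    """Normalize one segment (hypothetical helper, not part of any library)."""
    segment = unicodedata.normalize("NFC", segment)
    # Remove tie bars, since segmented data already marks segment boundaries.
    segment = "".join(ch for ch in segment if ch not in TIE_BARS)
    # Write every pre-nasalization mark as superscript n.
    if segment and segment[0] in PRENASAL_MARKS:
        segment = SUPERSCRIPT_N + segment[1:]
    return segment


def normalize_form(segments):
    """Normalize a segmented form given as a list of segments."""
    return [normalize_segment(s) for s in segments]
```

In practice this mapping would more likely live in the orthography profile itself; the sketch just makes the intended convention explicit.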
Now, we need 10 lexical features (colexification, partial colexification, or the like). You could even have features like "3 recurs in 8" (which is an indicator of quinary systems), etc.
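A feature like "3 recurs in 8" could be computed along these lines on segmented forms. This is a sketch under assumptions: the concept labels `THREE` and `EIGHT`, the wordlist layout (a dict from concept label to a list of segments), and the function names are all hypothetical.

```python
def contains_form(smaller, larger):
    """True if the segment list `smaller` occurs contiguously inside `larger`."""
    n = len(smaller)
    if n == 0:
        return False
    return any(larger[i:i + n] == smaller for i in range(len(larger) - n + 1))


def three_in_eight(wordlist):
    """Return True/False, or None if one of the two concepts is missing.

    `wordlist` maps hypothetical concept labels to one segmented form each.
    """
    three = wordlist.get("THREE")
    eight = wordlist.get("EIGHT")
    if not three or not eight:
        return None
    return contains_form(three, eight)


# Invented toy forms, for illustration only (not real language data).
toy = {"THREE": ["t", "o", "l"], "EIGHT": ["l", "i", "m", "t", "o", "l"]}
result = three_in_eight(toy)
```

Returning `None` for missing concepts keeps "no data" apart from "feature absent", which matters when coverage is uneven.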
Most of the colexifications in the Lexibank paper cannot be inferred from the VV data. The following should be possible:
RedAndYellow, ThreeInEight, HairAndFeather, HearAndSmell, CommonSubstringInManAndWoman
We could also do things like: CommonSubstringIn3DUand3PL, CommonSubstringIn1DUINCLand1PLINCL (and other variations on the pronouns), TwoInDU, ThreeInPL
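For the CommonSubstring-style features, one possible approach is to look for the longest contiguous run of segments shared by the two forms and compare it against a threshold. The sketch below assumes segmented forms as lists; the function names and the `min_length=2` threshold are assumptions that would need discussion.

```python
def longest_common_substring(a, b):
    """Longest contiguous run of segments shared by two segmented forms."""
    best = []
    # prev[j] holds the length of the common run ending at a[i-2], b[j-1].
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > len(best):
                    best = a[i - curr[j]:i]
        prev = curr
    return best


def common_substring(form_a, form_b, min_length=2):
    """True if the forms share a run of at least `min_length` segments.

    The threshold is an assumption; a single shared segment is usually
    too weak a signal to count as partial colexification.
    """
    return len(longest_common_substring(form_a, form_b)) >= min_length
```

The same function would cover the pronoun variants (3DU vs. 3PL and so on) by passing the relevant pair of forms.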
Regarding the labial-velars:
Does it make sense to compute other, more general features that are found in the Lexibank paper (Table 4), such as VowelSize, ConsonantSize, CVRatio, etc.?
@tihomirrangelov -- yes.
Yes, agreed.
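As a rough idea of how the inventory-size features could be derived from segmented forms, here is a minimal sketch. The vowel set is passed in by hand, the function name is hypothetical, and CVRatio is taken here to mean consonants divided by vowels, which is an assumption that should be checked against the definition in the Lexibank paper.

```python
def inventory_features(forms, vowels):
    """Compute VowelSize, ConsonantSize and CVRatio for one language.

    `forms` is a list of segmented forms (lists of segments).
    `vowels` is a hand-supplied set of vowel segments (an assumption; a
    real script would classify sounds via a transcription system).
    """
    inventory = {seg for form in forms for seg in form}
    vowel_set = {seg for seg in inventory if seg in vowels}
    consonant_set = inventory - vowel_set
    # Assumed definition: number of consonants per vowel.
    ratio = len(consonant_set) / len(vowel_set) if vowel_set else None
    return {
        "VowelSize": len(vowel_set),
        "ConsonantSize": len(consonant_set),
        "CVRatio": ratio,
    }
```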
@tihomirrangelov - agree on the pronouns, but let's discuss the others. I like @LinguList's idea of incorporating the numeral systems.
@tihomirrangelov and I have discussed the following. There are more than 10, but some may turn out to be dead ends, so this gives us extra candidates to play around with in case others show more interesting results:
RedAndYellow, CommonSubstringInRedAndBlood, CommonSubstringInMoonAndWhite, CommonSubstringInNightAndBlack, CommonSubstringInDirtyAndBlack, CommonSubstringInCloudAndNight, CommonSubstringInCloudAndSky, CommonSubstringInGreenAndNew, OneInSix, TwoInSeven, ThreeInEight, FourInNine, TwoInTen, FiveInTen, PersonInTwenty, ManInTwenty, HairAndFeather, HearAndSmell, CommonSubstringIn3DUAnd3PL, CommonSubstringIn1DUINCLAnd1PLINCL, CommonSubstringIn1DUEXCLAnd1PLEXCL, TwoInDU, ThreeInPL
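The full colexification features in this list (RedAndYellow, HairAndFeather, HearAndSmell) could be driven by a small table of concept pairs. The sketch below assumes a wordlist that maps concept labels to a list of segmented forms; the concept labels and function names are hypothetical.

```python
# Feature name -> pair of (hypothetical) concept labels.
FULL_COLEXIFICATIONS = {
    "RedAndYellow": ("RED", "YELLOW"),
    "HairAndFeather": ("HAIR", "FEATHER"),
    "HearAndSmell": ("HEAR", "SMELL"),
}


def colexifies(forms_a, forms_b):
    """True if some form for concept A is identical to a form for concept B."""
    return bool({tuple(f) for f in forms_a} & {tuple(f) for f in forms_b})


def full_colexification_features(wordlist):
    """Compute all full colexification features for one language.

    `wordlist` maps concept labels to lists of segmented forms, since a
    concept may have several synonyms. Missing concepts yield None.
    """
    features = {}
    for name, (concept_a, concept_b) in FULL_COLEXIFICATIONS.items():
        forms_a = wordlist.get(concept_a)
        forms_b = wordlist.get(concept_b)
        if not forms_a or not forms_b:
            features[name] = None
        else:
            features[name] = colexifies(forms_a, forms_b)
    return features
```

Adding or dropping a feature then only means editing the table, which should help if some of the candidates turn out to be dead ends.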
And we agreed on the following 20 phonology-related features:
VowelSize, ConsonantSize, CVRatio, HasPrenasalization, VoicingDistinctionInFricatives, HasPrenasalizedBilabialTrill, HasPlainBilabialTrill, HasPrenasalizedCoronalTrill, HasLinguolabials, HasLabializedConsonants, HasLabialVelars, HasHandX (both velar and glottal fricative), LacksP, LacksPandC, HasFrontRoundedVowels, HasSchwa, HasTwoAffricates, SyllableStructure, SyllableOnset, SyllableOffset
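Several of these presence/absence features reduce to simple checks on the segment inventory. The sketch below covers only a few of them and rests on assumptions: labial-velars are written kp and gb without a tie bar, linguolabials carry the IPA seagull diacritic (U+033C), and the function name is hypothetical.

```python
LINGUOLABIAL_DIACRITIC = "\u033c"  # IPA combining seagull below


def presence_features(inventory):
    """Compute a few presence/absence features from a set of segments.

    `inventory` is the set of distinct segments attested for one language,
    after the transcriptions have been standardized.
    """
    return {
        # Substring check so that pre-nasalized labial-velars also count.
        "HasLabialVelars": any("kp" in s or "gb" in s for s in inventory),
        "HasLinguolabials": any(LINGUOLABIAL_DIACRITIC in s for s in inventory),
        "LacksP": "p" not in inventory,
        "HasSchwa": "ə" in inventory,
    }
```

Note that `LacksP` as written only tests for plain p, so whether a language with only labialized or pre-nasalized p should count as lacking it would still need to be decided.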
Nice, we will get back to this, once we have processed all data and created orthography profiles. I'll wait until @Bibiko is back from holiday to schedule and divide work (I'll provide the functions, and @Bibiko can help with the overall script in lexibank style, if time allows).
Mary and I discussed the computed features for VV. The following have been identified as interesting features for Vanuatu languages: