lexibank / vanuatuvoices

Sound-Comparisons Vanuatu

computed features need to be discussed by the VV team (inspecting what we used in Lexibank) #25

Open maryewal opened 2 years ago

tihomirrangelov commented 2 years ago

Mary and I discussed the computed features for VV. The following have been identified as interesting features for Vanuatu languages:

LinguList commented 2 years ago

Okay, we will need to standardize the annotations here: we write pre-nasalization only with a superscript n, since the articulation follows from the next sound (and the inconsistencies are huge). Also, if there are labiovelars in the data, we'll have to double-check that this works with the orthography profiles, and hope they are properly transcribed here (we write kp, gb without the bar, as we use segmented data). But these features are easy to compute.

These are 9 features; 10 would make it nicer (also for any maps and the like in any publications).
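A minimal sketch of what the standardization and the two checks could look like on segmented data. The function names and the set of superscript nasal markers are assumptions for illustration; in practice the mapping would live in the orthography profiles.

```python
# Assumption: prenasalization may appear with different superscript nasals
# in the raw data; all of them are rewritten as a superscript n.
PRENASAL_MARKERS = ("ᵐ", "ᵑ", "ⁿ")


def normalize_prenasalized(segment):
    """Rewrite a leading superscript nasal as a superscript n."""
    if len(segment) > 1 and segment[0] in PRENASAL_MARKERS:
        return "ⁿ" + segment[1:]
    return segment


def has_prenasalization(segments):
    """True if any segment carries a superscript nasal prefix."""
    return any(normalize_prenasalized(s).startswith("ⁿ") for s in segments)


def has_labial_velars(segments):
    """True if a labial-velar is present as ONE segment (kp, gb, no bar)."""
    return any(s.startswith(("kp", "gb")) for s in segments)


print(normalize_prenasalized("ᵐb"))
print(has_labial_velars(["kp", "a"]), has_labial_velars(["k", "p", "a"]))
```

Note that the labial-velar check only works if the segmentation keeps kp and gb together as single segments, which is why the orthography profiles need double-checking.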

LinguList commented 2 years ago

Now, we need 10 lexical features (like colexification, partial colexification, or the like). You could even have features like "3 recurs in 8" (which is an indicator of quinary systems, etc.).
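To illustrate a feature of the "3 recurs in 8" kind, here is a rough sketch that tests whether the segments of one form occur as a contiguous run inside another. The helper names, the concept labels THREE and EIGHT, and the example forms are all invented for illustration; they are not taken from the VV data.

```python
def contains_sequence(whole, part):
    """True if the segment list `part` occurs contiguously inside `whole`."""
    n = len(part)
    if n == 0:
        return False
    return any(whole[i:i + n] == part for i in range(len(whole) - n + 1))


def three_in_eight(forms):
    """Return True/False, or None when one of the two concepts is missing."""
    three = forms.get("THREE")
    eight = forms.get("EIGHT")
    if not three or not eight:
        return None
    return contains_sequence(eight, three)


# Invented forms for one hypothetical language, already segmented.
forms = {"THREE": ["t", "o", "l", "u"], "EIGHT": ["l", "a", "t", "o", "l", "u"]}
print(three_in_eight(forms))
```

Returning None for missing data keeps "not attested" apart from "attested but no match", which matters when the feature is later plotted on a map.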

tihomirrangelov commented 2 years ago

Most of the colexifications in the Lexibank paper cannot be inferred from the VV data. The following should be possible:

RedAndYellow, ThreeInEight, HairAndFeather, HearAndSmell, CommonSubstringInManAndWoman

We could also do things like: CommonSubstringIn3DUand3PL, CommonSubstringIn1DUINCLand1PLINCL (and other variations on the pronouns), TwoInDU, ThreeInPL

tihomirrangelov commented 2 years ago

Regarding the labial-velars:

  1. I edited the comment to call them exactly that and not "labiovelars", which has also been used for the labialized consonants (bw, pw etc.)
  2. I know they occur on Efate and Torba, so I am not sure whether we will have to deal with them at this stage, unless @maryewal knows otherwise.

tihomirrangelov commented 2 years ago

Does it make sense to compute other, more general features found in the Lexibank paper, such as VowelSize, ConsonantSize, CVRatio, etc. (Table 4)?
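For orientation, a rough sketch of how inventory-size features of this kind could be derived from segmented forms. The hand-written vowel set and the definition of CVRatio as consonants divided by vowels are assumptions made for this example; a real analysis would classify sounds with a proper transcription system rather than a fixed set.

```python
# Assumption: a small hand-written set of vowel symbols, for illustration only.
VOWELS = set("aeiouəɛɔ")


def inventory_features(wordlist):
    """Compute inventory sizes from a list of segmented forms."""
    sounds = {segment for form in wordlist for segment in form}
    # A segment counts as a vowel if its base (first) symbol is a vowel.
    vowels = {s for s in sounds if s[0] in VOWELS}
    consonants = sounds - vowels
    ratio = len(consonants) / len(vowels) if vowels else None
    return {
        "VowelSize": len(vowels),
        "ConsonantSize": len(consonants),
        "CVRatio": ratio,
    }


# Invented forms, already segmented.
wordlist = [["t", "a", "l", "u"], ["m", "a", "t", "e"]]
print(inventory_features(wordlist))
```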

SimonGreenhill commented 2 years ago

@tihomirrangelov -- yes.

maryewal commented 2 years ago

> Does it make sense to compute other, more general features, that are found in the Lexibank paper, such as: VowelSize ConsonantSize CVRatio etc. in Table 4

Yes, agreed.

maryewal commented 2 years ago

> Most of the colexifications in the Lexibank paper cannot be inferred from the VV data. The following should be possible:
>
> RedAndYellow, ThreeInEight, HairAndFeather, HearAndSmell, CommonSubstringInManAndWoman
>
> We could also do things like: CommonSubstringIn3DUand3PL, CommonSubstringIn1DUINCLand1PLINCL (and other variations on the pronouns), TwoInDU, ThreeInPL

@tihomirrangelov - agree on the pronouns, but let's discuss the others. I like @LinguList's idea of incorporating the numeral systems.

maryewal commented 2 years ago

> Now, we need 10 lexical features (like colexification, partial colexification, or the like). You could even have features like "3 recurs in 8" (which is an indicator of quinary systems, etc.).

@tihomirrangelov and I have discussed the following. There are more than 10, but some may lead to dead ends, so the extras give us room to play around and keep the ones that show more interesting results:

  1. RedAndYellow
  2. CommonSubstringInRedAndBlood
  3. CommonSubstringInMoonAndWhite
  4. CommonSubstringInNightAndBlack
  5. CommonSubstringInDirtyAndBlack
  6. CommonSubstringInCloudAndNight
  7. CommonSubstringInCloudAndSky
  8. CommonSubstringInGreenAndNew
  9. OneInSix
  10. TwoInSeven
  11. ThreeInEight
  12. FourInNine
  13. TwoInTen
  14. FiveInTen
  15. PersonInTwenty
  16. ManInTwenty
  17. HairAndFeather
  18. HearAndSmell
  19. CommonSubstringIn3DUAnd3PL
  20. CommonSubstringIn1DUINCLAnd1PLINCL
  21. CommonSubstringIn1DUEXCLAnd1PLEXCL
  22. TwoInDU
  23. ThreeInPL
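As a rough illustration of the CommonSubstring features, the sketch below finds the longest run of identical segments shared by two forms and compares it to a threshold. The function names, the threshold of two segments, and the example forms are assumptions for this example, not the definition used in the Lexibank paper.

```python
def longest_common_run(a, b):
    """Length of the longest contiguous run of segments shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            if x == y:
                # Extend the run that ended at the previous pair of segments.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def common_substring(a, b, min_length=2):
    """True if the two forms share a run of at least `min_length` segments."""
    return longest_common_run(a, b) >= min_length


# Invented forms, already segmented.
form_a = ["t", "a", "m", "w", "a"]
form_b = ["m", "w", "a", "t", "e"]
print(longest_common_run(form_a, form_b), common_substring(form_a, form_b))
```

Working on segments rather than characters means that a multi-character sound such as kp counts as one unit, so short accidental overlaps are less likely to trigger the feature.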

tihomirrangelov commented 2 years ago

And we agreed on the following 20 phonology-related features:

  1. VowelSize
  2. ConsonantSize
  3. CVRatio
  4. HasPrenasalization
  5. VoicingDistinctionInFricatives
  6. HasPrenasalizedBilabialTrill
  7. HasPlainBilabialTrill
  8. HasPrenasalizedCoronalTrill
  9. HasLinguolabials
  10. HasLabializedConsonants
  11. HasLabialVelars
  12. HasHandX (both velar and glottal fricative)
  13. LacksP
  14. LacksPandC
  15. HasFrontRoundedVowels
  16. HasSchwa
  17. HasTwoAffricates
  18. SyllableStructure
  19. SyllableOnset
  20. SyllableOffset
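Several of these are simple presence or absence checks on a language's sound inventory. A minimal sketch for three of them follows, assuming the inventory is given as a set of segment strings and that the velar fricative is written x; the function name is hypothetical.

```python
def presence_features(inventory):
    """Presence/absence features computed from a set of segments."""
    return {
        "HasSchwa": "ə" in inventory,
        # Both the glottal (h) and the velar (x) fricative must be present.
        "HasHandX": "h" in inventory and "x" in inventory,
        "LacksP": "p" not in inventory,
    }


# Invented inventory for one hypothetical language.
print(presence_features({"h", "x", "t", "a", "ə"}))
```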

LinguList commented 2 years ago

Nice, we will get back to this, once we have processed all data and created orthography profiles. I'll wait until @Bibiko is back from holiday to schedule and divide work (I'll provide the functions, and @Bibiko can help with the overall script in lexibank style, if time allows).