Consonant clusters

FredericBlum commented 4 months ago

The following consonant clusters should be checked:

Language_ID	Length	Cluster	Words
AchuarShiwiar	3	n t r	akántratin // yántratin
AchuarShiwiar	3	∼ h k	jũ ũwé̃jka mash
Andoa	3	j h j	tasayjyá
Candoshi-Shapra	3	ŋ k tʃ	mpotsanoŋktʃi // waʂiitpoŋktʃi // wajaŋktʃi // mpaʂinoŋktʃi
Candoshi-Shapra	3	m p tʃ	patomptʃi
Candoshi-Shapra	3	p ʂ k	pʂkomaama
Canichana	3	t h m	itxmupse
Muniche	3	ʔ s m	ɾɨʔsma
Omurano	3	θ n n	θnn̄́
Omurano	3	t n n	natn̄n
Shiwilu	3	ɾ ʔ tʃ	pi’per’chapalli
Yameo	3	!n̊/n ʔ w	nan̊'wá
Candoshi-Shapra	4	p ʃ f m	kamopʃfmaama
Taushiro	4	ʔ w l tʃ	aʔwltʃa
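A table like the one above can be produced by scanning segmented forms for runs of consonants. The following is only a minimal sketch of that idea, not the script used for this dataset: the vowel set, the function names, and the space-segmented example forms are assumptions made for illustration.

```python
import unicodedata

# Assumption: an illustrative vowel set; the real inventory depends on the data.
VOWELS = set("aeiouɨəɛɔ")


def is_vowel(segment):
    """Check the base character of a segment, ignoring diacritics."""
    base = unicodedata.normalize("NFD", segment)[0]
    return base in VOWELS


def find_clusters(segments, min_length=3):
    """Return every run of at least `min_length` consecutive consonants."""
    clusters, run = [], []
    for segment in list(segments) + [None]:  # None flushes the final run
        if segment is not None and not is_vowel(segment):
            run.append(segment)
            continue
        if len(run) >= min_length:
            clusters.append(" ".join(run))
        run = []
    return clusters


# Hypothetical segmentations of two forms from the table.
print(find_clusters("p ʂ k o m a a m a".split()))        # ['p ʂ k']
print(find_clusters("k a m o p ʃ f m a a m a".split()))  # ['p ʃ f m']
```

Glides count as consonants in this sketch, which is why a sequence such as `j h j` would also be reported.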

MuffinLinwist commented 3 months ago

I'm on this now.

MuffinLinwist commented 3 months ago

Language_ID	Length	Cluster	Words
AchuarShiwiar	3	n t r	akántratin // yántratin

I'm not sure about this one. Perhaps it is an epenthetic voiced oral stop that often appears between the nasal and the rhotic phonemes (Kohlberger 2020, 115-116, 135), which was misrepresented in the source text. Or perhaps it is a misheard voiced allophone of the affricate [dʒ] (97-98). The source text, however, does not mention either possibility, so I'll ask Jaime now.

AchuarShiwiar	3	∼ h k	jũ ũwé̃jka mash

This is already fixed with the last merged PR.

MuffinLinwist commented 3 months ago

Language_ID	Length	Cluster	Words
Andoa	3	j h j	tasayjyá

Also not sure about this one. The source used for this dataset is a short wordlist from Michael et al. (2009). It has a brief but detailed orthographic and phonetic description, which does not say anything about this. I looked it up in EDICTOR and found a correspondence between Andoa /hjV/ and Arabela /hiV/ at the end of the word (COGID 311, or COGIDS 1168 and 1169), where other Zaparoan languages have /hV/ or /hʲV/. I think adopting the following strategy is best: /jyV/ > [h !y/i/ V]

MuffinLinwist commented 3 months ago

Language_ID	Length	Cluster	Words
Candoshi-Shapra	3	ŋ k tʃ	mpotsanoŋktʃi // waʂiitpoŋktʃi // wajaŋktʃi // mpaʂinoŋktʃi
Candoshi-Shapra	3	m p tʃ	patomptʃi

This has to do with the prenasalized consonants that Tuggy postulated but Overall rejected, though without listing these clusters among the possible ones. So perhaps the best strategy here is to unify nC, as you suggested in the last issue.

Candoshi-Shapra	3	p ʂ k	pʂkomaama

This is a permitted cluster of three consonants (Overall 2023, 621).

Candoshi-Shapra	4	p ʃ f m	kamopʃfmaama

Thanks for the catch. This is sadly a typo in the Lexibank data. The original source (Tuggy 2008, 19) cites /kamopshimaama/ for the concept. Should I fix this in the ortho-profile, or is there another workflow for this? Since we pull the data from Lexibank into the raw data, this error will reappear each time the scripts are run.

MuffinLinwist commented 3 months ago

Language_ID	Length	Cluster	Words
Canichana	3	t h m	itxmupse

Seems to be correct. It corresponds to the data offered by other authors, as collected and shown in the source we use for the dataset (Crevels 2012, 423).

MuffinLinwist commented 3 months ago

Language_ID	Length	Cluster	Words
Muniche	3	ʔ s m	ɾɨʔsma

This is also correct and we can find an example in Michael et al. (2013, 310: example 29b).

MuffinLinwist commented 3 months ago

Language_ID	Length	Cluster	Words
Omurano	3	θ n n	θnn̄́
Omurano	3	t n n	natn̄n

Another source (O'Hagan 2011), which works with fieldwork and Tessmann data, suggests that this is possible in the language and correct.

MuffinLinwist commented 3 months ago

Language_ID	Length	Cluster	Words
Shiwilu	3	ɾ ʔ tʃ	pi’per’chapalli

Thanks for the catch. It was caused by a difference in the ’ character in this specific case. Already fixed.
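Mismatches of this kind are hard to see because several apostrophe-like characters look nearly identical. As a minimal sketch (the helper name and the example strings are made up for illustration, and this is not the fix applied here), the code points in a form can be listed like this:

```python
import unicodedata


def describe_marks(form):
    """List every character that is not an ASCII letter, with its code point and name."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in form
        if not (ch.isascii() and ch.isalpha())
    ]


# Three look-alike apostrophes, written as escapes so they stay distinguishable.
for form in ("per\u2019cha", "per\u0027cha", "per\u02bccha"):
    print(describe_marks(form))
```

If the orthography profile lists one of these characters and the data contains another, the affected graphemes will not match.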

MuffinLinwist commented 3 months ago

Language_ID	Length	Cluster	Words
Yameo	3	!n̊/n ʔ w	nan̊'wá

Thanks! I already fixed this.

Taushiro	4	ʔ w l tʃ	aʔwltʃa

This is also a typo in the Lexibank data. The original data gives the form /aʔwitʃa/ for the concept "louse" (Alicea 1975, 132).

FredericBlum commented 3 months ago

Language_ID	Length	Cluster	Words
Andoa	3	j h j	tasayjyá

Also not sure about this one. The source used for this dataset is a short wordlist from Michael et al. (2009). It has a brief but detailed orthographic and phonetic description, which does not say anything about this. I looked it up in EDICTOR and found a correspondence between Andoa /hjV/ and Arabela /hiV/ at the end of the word (COGID 311, or COGIDS 1168 and 1169), where other Zaparoan languages have /hV/ or /hʲV/. I think adopting the following strategy is best: /jyV/ > [h !y/i/ V]

Iquito: awi
Arabela: a:
Andoa: ay

It seems that the original sequence V Glide V was shortened in the other languages, so we will group it here and then see whether further occurrences turn up. Or we take it as a diphthong: are there any comments about diphthongs?

hja has hia as its correspondence, as you suggested, so yes, I think we can proceed as you proposed in that case.

It's nice to see that these consonant clusters point us directly to interesting things about the history of the language family. You should write a note on how the search for consonant clusters leads us to those patterns; we can use that in the paper.

FredericBlum commented 3 months ago

Language_ID	Length	Cluster	Words
Candoshi-Shapra	3	ŋ k tʃ	mpotsanoŋktʃi // waʂiitpoŋktʃi // wajaŋktʃi // mpaʂinoŋktʃi
Candoshi-Shapra	3	m p tʃ	patomptʃi

This has to do with the prenasalized consonants that Tuggy postulated but Overall rejected, though without listing these clusters among the possible ones. So perhaps the best strategy here is to unify nC, as you suggested in the last issue.

Candoshi-Shapra	3	p ʂ k	pʂkomaama

This is a permitted cluster of three consonants (Overall 2023, 621).

Candoshi-Shapra	4	p ʃ f m	kamopʃfmaama

Thanks for the catch. This is sadly a typo in the Lexibank data. The original source (Tuggy 2008, 19) cites /kamopshimaama/ for the concept. Should I fix this in the ortho-profile, or is there another workflow for this? Since we pull the data from Lexibank into the raw data, this error will reappear each time the scripts are run.

I will set up a replacement in the Lexibank-script to fix this.
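Roughly, such a replacement could work as a lookup applied to each raw form before segmentation. This is only a sketch of the idea in plain Python: the table layout and the function name are hypothetical, the keys assume the forms as they appear in the cluster table, and it does not show the actual Lexibank script.

```python
# Hypothetical correction table: (language, form as pulled) -> corrected form.
# Both entries are the typos identified in this thread.
CORRECTIONS = {
    ("Candoshi-Shapra", "kamopʃfmaama"): "kamopshimaama",  # Tuggy 2008, 19
    ("Taushiro", "aʔwltʃa"): "aʔwitʃa",                    # Alicea 1975, 132
}


def correct_form(language, form, corrections=CORRECTIONS):
    """Return the corrected form if one is registered, else the form unchanged."""
    return corrections.get((language, form), form)


print(correct_form("Taushiro", "aʔwltʃa"))
print(correct_form("Muniche", "ɾɨʔsma"))
```

Keying on the language as well as the form keeps a correction from touching an identical string in another language.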

What was Overall's position? That those are two separate consonants? Then we should group them.

FredericBlum commented 3 months ago

Language_ID	Length	Cluster	Words
Omurano	3	θ n n	θnn̄́
Omurano	3	t n n	natn̄n

Another source (O'Hagan 2011), which works with fieldwork and Tessmann data, suggests that this is possible in the language and correct.

Weird, but okay :)

MuffinLinwist commented 3 months ago

What was Overall's position? That those are two separate consonants? Then we should group them.

Yes, that they are two separate but homorganic consonants rather than prenasalized consonants. Do we still group them, then?

FredericBlum commented 3 months ago

Yes, group them with the dot-grouping, instead of having them as a single consonant.
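To illustrate what the grouping does to the segments: the two consonants stay distinct but are joined by a dot into one unit, so `ŋ k tʃ` is no longer counted as a cluster of three. A minimal sketch, assuming space-segmented forms and a pair list limited to the two sequences from this thread (the names are made up, and in practice this would be handled in the orthography profile):

```python
# Assumption: only the homorganic nasal + stop pairs discussed above.
HOMORGANIC_PAIRS = {("m", "p"), ("ŋ", "k")}


def group_homorganic(segments):
    """Join each listed nasal + stop pair into one dot-grouped segment."""
    grouped, i = [], 0
    while i < len(segments):
        pair = tuple(segments[i:i + 2])
        if pair in HOMORGANIC_PAIRS:
            grouped.append(".".join(pair))
            i += 2
        else:
            grouped.append(segments[i])
            i += 1
    return grouped


# Hypothetical segmentation of wajaŋktʃi.
print(" ".join(group_homorganic("w a j a ŋ k tʃ i".split())))  # w a j a ŋ.k tʃ i
```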

lexibank / northperulex

Consonant clusters #20