Open lggruspe opened 1 year ago
That's correct. We have a student working on this right now. But we're not sure yet how to provide them; they can't be part of phoible.csv because it has one row per phoneme (not one per allophone). Can you tell us about your use case / what would be the best format from your perspective?
I was only looking to compare the features of tʂ with ʈʂ. Phoible uses both symbols (possibly to represent different sounds), but Wikipedia says they represent the same sound.
tʂ looks like a mistake to me; we try to enforce place-matching between the stop part and the fricative part of affricates. Such mistakes are more likely in the allophones because they aren't run through the same validation code that the phonemes are. As I said, though, we have a student working on this right now, so hopefully many of these allophone errors will be corrected soon.
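For illustration only, here is a minimal sketch of what a place-matching check for affricates could look like. This is not PHOIBLE's validation code, which is not shown in this thread. The lookup tables, the place labels, and the function names `split_affricate` and `places_match` are all hypothetical, and the tables cover only the handful of symbols mentioned here.

```python
# Hypothetical toy tables: place of articulation for a few stops and fricatives.
# Treating retracted "t̠" as postalveolar is an assumption made for this sketch.
STOP_PLACE = {"t": "alveolar", "t̠": "postalveolar", "ʈ": "retroflex"}
FRICATIVE_PLACE = {"s": "alveolar", "ʃ": "postalveolar", "ʂ": "retroflex"}


def split_affricate(segment):
    """Return (stop, fricative) if the segment starts with a known pair, else None."""
    # Try longer stop symbols first so that "t̠" is not misread as plain "t".
    for stop in sorted(STOP_PLACE, key=len, reverse=True):
        if segment.startswith(stop):
            rest = segment[len(stop):]
            for fricative in FRICATIVE_PLACE:
                # Anything after the fricative (e.g. "ʼ" or "ː") is ignored.
                if rest.startswith(fricative):
                    return stop, fricative
    return None


def places_match(segment):
    """True/False for a recognised affricate, None if it cannot be analysed."""
    parts = split_affricate(segment)
    if parts is None:
        return None
    stop, fricative = parts
    return STOP_PLACE[stop] == FRICATIVE_PLACE[fricative]


for seg in ["tʂ", "ʈʂ", "t̠ʃ", "tʂʼ"]:
    print(seg, places_match(seg))
```

Under these toy tables, tʂ is flagged because an alveolar stop is paired with a retroflex fricative, while ʈʂ passes.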
cc @Alessioryan
@drammock Would you be able to send me the validation code for the phonemes? I'd love to take a look at this issue; I hadn't noticed it before.
Some segments appear in the PHOIBLE data as allophones, but not as phonemes in any language.
Examples:
tʃː is an allophone for t̠ʃ in kuna1268
tʂ is an allophone for t̠ʃ in yuch1247
tʂʼ is an allophone for t̠ʃʼ in yuch1247
phoible.csv doesn't seem to have feature vectors for these allophones.
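A rough sketch of how such allophone-only segments could be listed is below. It assumes that phoible.csv has `Glottocode`, `Phoneme`, and `Allophones` columns, that `Allophones` is a space-separated list, and that missing values are written as `NA`; check these against the actual file. The sample rows are toy data modelled on the examples above, not real PHOIBLE rows, and `allophone_only_segments` is a hypothetical helper name.

```python
import csv
import io

# Toy rows modelled on the examples in this issue (not real PHOIBLE data).
# Column names and the space-separated Allophones field are assumptions.
SAMPLE = """Glottocode,Phoneme,Allophones
kuna1268,t̠ʃ,t̠ʃ tʃː
yuch1247,t̠ʃ,t̠ʃ tʂ
yuch1247,t̠ʃʼ,t̠ʃʼ tʂʼ
"""


def allophone_only_segments(rows):
    """Map each allophone that is never listed as a phoneme to its contexts."""
    rows = list(rows)
    phonemes = {row["Phoneme"] for row in rows}
    found = {}
    for row in rows:
        if row["Allophones"] == "NA":  # assumed marker for missing allophone data
            continue
        for allophone in row["Allophones"].split():
            if allophone not in phonemes:
                context = (row["Glottocode"], row["Phoneme"])
                found.setdefault(allophone, []).append(context)
    return found


result = allophone_only_segments(csv.DictReader(io.StringIO(SAMPLE)))
for segment, contexts in sorted(result.items()):
    print(segment, contexts)
```

To run it on the real file, replace the `io.StringIO(SAMPLE)` argument with a file object opened with `encoding="utf-8"` and `newline=""`.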