phoible / dev

PHOIBLE data and development.
https://phoible.org/
GNU General Public License v3.0

Feature vectors for allophones that aren't phonemes #368

Open lggruspe opened 1 year ago

lggruspe commented 1 year ago

Some segments appear in the PHOIBLE data as allophones, but not as phonemes in any language.

Examples:

phoible.csv doesn't seem to have feature vectors for these allophones.
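A minimal sketch of how such allophone-only segments could be listed. It uses a tiny in-memory stand-in rather than the real file, and it assumes the CSV has a `Phoneme` column and an `Allophones` column holding space-separated segments, with `NA` where no allophones are recorded. The sample rows and the function name `allophone_only_segments` are made up for illustration.

```python
import csv
import io

# Tiny stand-in for phoible.csv. The column names and the space-separated
# allophone lists are assumptions about the file layout; the rows are invented.
SAMPLE = """InventoryID,Phoneme,Allophones
1,t,t tʰ
1,a,a ə
2,tʰ,tʰ
2,a,NA
"""


def allophone_only_segments(rows):
    """Return segments listed as allophones that never occur as a phoneme."""
    phonemes = set()
    allophones = set()
    for row in rows:
        phonemes.add(row["Phoneme"])
        cell = row.get("Allophones") or ""
        # Assumption: "NA" marks a missing allophone list.
        if cell != "NA":
            allophones.update(cell.split())
    return allophones - phonemes


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print(sorted(allophone_only_segments(rows)))
```

To run this against the real data, replace the `io.StringIO(SAMPLE)` source with an open handle on phoible.csv, after checking that the column names match.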

drammock commented 1 year ago

That's correct. We have a student working on this right now. But we're not sure yet how to provide them; they can't be part of phoible.csv because it has one row per phoneme (not one per allophone). Can you tell us about your use case / what would be the best format from your perspective?

lggruspe commented 1 year ago

I was only looking to compare the features of with ʈʂ. Phoible uses both symbols (possibly to represent different sounds), but Wikipedia says they represent the same sound.

drammock commented 1 year ago

Looks like a mistake to me; we try to enforce that affricates have place-matching between the stop part and the fricative part. Such mistakes are more likely in the allophones because they aren't run through the same validation code that the phonemes are. As I said, we have a student working on this right now, so hopefully many of these allophone errors will get corrected soon.

cc @Alessioryan
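For illustration, the place-matching rule described above might be sketched as follows. This is not PHOIBLE's validation code: the `PLACE` table is a toy mapping covering only four symbols, `place_matches` is a hypothetical helper, and treating plain `t` as alveolar is a simplification.

```python
# Toy place-of-articulation table (an assumption for this sketch only).
PLACE = {
    "t": "alveolar",
    "s": "alveolar",
    "ʈ": "retroflex",
    "ʂ": "retroflex",
}

# Combining tie bars (above and below) that may join the two halves.
TIE_BARS = {"\u0361", "\u035c"}


def place_matches(affricate):
    """Return True if the stop and fricative halves share a place label."""
    base = [ch for ch in affricate if ch not in TIE_BARS]
    if len(base) != 2 or any(ch not in PLACE for ch in base):
        raise ValueError(f"cannot analyse {affricate!r} with the toy table")
    stop, fricative = base
    return PLACE[stop] == PLACE[fricative]


print(place_matches("ʈʂ"))  # both halves retroflex
print(place_matches("tʂ"))  # illustrative input: alveolar stop, retroflex fricative
```

A real check would presumably compare feature vectors rather than symbol labels, and would need to handle diacritics and longer segments.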

Alessioryan commented 1 year ago

@drammock Would you be able to send me the validation code for the phonemes? I'd love to take a look at this issue; I hadn't noticed it before.