Open LinguList opened 4 years ago
From the code, data seems to come entirely from here: https://github.com/pettarin/ipapy/blob/master/ipapy/data/ipa.dat
The file has a CSV structure, but the sound names are tab-separated within their field. It could be mapped to CLTS BIPA by using the grapheme (which needs to be normalized), the name descriptor, or both. Some manual refinement and checking will also be necessary.
There are also an ARPABET and a Kirshenbaum resource in the same directory.
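To make the mapping idea more concrete, here is a rough sketch of the parsing and matching steps. It is untested against the real `ipa.dat`: the column layout (hex code points first, tab-separated names last), the `#` comment marker, and the `bipa` dictionary standing in for CLTS BIPA are all assumptions of mine, as are the function names.

```python
import csv
import unicodedata


def parse_ipa_dat(text, include_commented=False):
    """Parse ipa.dat-like text into (grapheme, names) pairs.

    Assumed layout (not checked against the real file): comma-separated
    fields, first field = space-separated hex code points, last field =
    tab-separated sound names, '#' marks a commented-out entry.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            if not include_commented:
                continue
            line = line.lstrip("#").strip()
        fields = next(csv.reader([line]), [])
        if len(fields) < 2:
            continue
        try:
            grapheme = "".join(chr(int(cp, 16)) for cp in fields[0].split())
        except (ValueError, OverflowError):
            continue  # a prose comment, not a commented-out entry
        if not grapheme:
            continue
        names = [n.strip() for n in fields[-1].split("\t") if n.strip()]
        # Normalize so that precomposed and decomposed forms compare equal.
        entries.append((unicodedata.normalize("NFD", grapheme), names))
    return entries


def match_to_bipa(entries, bipa):
    """Match entries against `bipa`, a hypothetical {grapheme: name} dict.

    Tries the grapheme first, then any of the names; unmatched entries
    are kept with None so they can be checked by hand.
    """
    by_name = {name: grapheme for grapheme, name in bipa.items()}
    result = []
    for grapheme, names in entries:
        if grapheme in bipa:
            result.append((grapheme, grapheme, "grapheme"))
            continue
        hit = next((by_name[n] for n in names if n in by_name), None)
        result.append((grapheme, hit, "name" if hit else None))
    return result
```

Anything that comes back with `None` in the last slot would go to the manual check. The `include_commented` flag only postpones the decision on whether commented-out entries should be included.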
So how feasible is it to make a table and add it to our sources?
It does not look too complex. It is more a question of how reproducible it should be (i.e., should there be a nice script to download the data, parse it, etc.?), whether to include entries that are commented out, and similar decisions.
However, I cannot see any mention of a peer-reviewed publication. Didn't we decide to add only peer-reviewed work or resources that are very well established in the community?
Before we discuss this for too long, let's forget it for now and concentrate on the datasets which I added (see milestone 1.3). These should all be mapped, using the new pyclts approach, and manually corrected.
ipapy offers feature representations for some IPA characters / letters. The problem is that they need to be extracted somehow, and it is not entirely clear how. But I think it would be nice to list, for each CLTS BIPA character, whether it would be accepted by ipapy and how it would be encoded in terms of features there.
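A minimal sketch of what such a listing could look like. To keep it independent of how the features end up being extracted, the check and the feature lookup are passed in as functions; `acceptance_table`, `is_accepted`, and `describe` are names I made up, and the `KNOWN` dictionary is dummy data, not real ipapy output.

```python
def acceptance_table(graphemes, is_accepted, describe):
    """Return one (grapheme, accepted, description) row per grapheme.

    `is_accepted` and `describe` are caller-supplied callables; the
    description is only looked up for accepted graphemes.
    """
    rows = []
    for grapheme in graphemes:
        ok = bool(is_accepted(grapheme))
        rows.append((grapheme, ok, describe(grapheme) if ok else None))
    return rows


# Dummy stand-in for the real lookups (made-up data for illustration).
KNOWN = {"b": "voiced bilabial plosive"}

for row in acceptance_table(["b", "ʘ"], KNOWN.__contains__, KNOWN.get):
    print(row)
```

With ipapy installed, `is_accepted` could presumably be its `is_valid_ipa` function, and `describe` a small wrapper around `IPAString(unicode_string=...)` that collects the names of the resulting characters. I have not tested that part.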