cldf-clts / clts-legacy

Cross-Linguistic Transcription Systems
Apache License 2.0

[New TranscriptionData] Database of Eurasian Phonological Inventories #59

Closed LinguList closed 6 years ago

LinguList commented 6 years ago

Interesting, they offer their download as JSON, and they plot all their segments in feature charts. Nice.

LinguList commented 6 years ago

Even better: they show syllable structures, and their JSON dump is quite rich.

LinguList commented 6 years ago

And they have source code that parses IPA!

LinguList commented 6 years ago

So this means we can use their data to check against ours, and we can even use their code to derive their feature sets (also good for comparison).
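
A minimal sketch of what such a check could look like. The layout of their JSON dump is not described in this thread, so the structure below (language name mapped to a list of segment strings) is purely an assumption, and `uncovered_segments`, `dump`, and `known` are hypothetical names, not part of CLTS or of their code.

```python
import json


def uncovered_segments(inventories, known_segments):
    """Return, per language, the segments that are not in known_segments.

    inventories: dict mapping a language name to a list of segment strings
                 (assumed layout, not the real dump format).
    known_segments: set of segment strings we already cover.
    """
    missing = {}
    for language, segments in inventories.items():
        unknown = sorted(set(segments) - known_segments)
        if unknown:
            missing[language] = unknown
    return missing


# Hypothetical stand-in for the downloaded JSON dump; the real file
# would be read with json.load() and probably needs reshaping first.
dump = json.loads('{"LangA": ["p", "t", "k", "ʷp"], "LangB": ["a", "i", "u"]}')

# Hypothetical stand-in for the set of segments we already handle.
known = {"p", "t", "k", "a", "i", "u"}

report = uncovered_segments(dump, known)
print(report)
```

The report then lists only the segments that would need attention, for example pre-labialized ones that are not yet covered.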

tresoldi commented 6 years ago

Very, very nice, indeed! Just a pity they don't cite the actual sources, only that

"[t]he data were gleaned from grammatical descriptions of individual language varieties and reference works on language families. No recycling of existing databases was undertaken."

tresoldi commented 6 years ago

Oh, the sources are actually there in the individual language inventory listings! Really good!

tresoldi commented 6 years ago

Found the paper: https://www.academia.edu/35110101/The_Database_of_Eurasian_Phonological_Inventories_a_research_tool_for_distributional_phonological_typology

LinguList commented 6 years ago

Yes, spotted this right away when I inspected their JSON. The Python code looks very convenient to use. It is also comforting to see that, despite their nice code and feature system, a system like CLTS is still more flexible when it comes to handling the bulk of inconsistencies and communicating between different systems. So we can clearly profit from linking to this dataset and adding features we still don't cover (like pre-labialization), without stepping on each other's toes.

LinguList commented 6 years ago

Yes, that's how I found the database. I should've posted the paper here as well.