lexibank / lsi

CLDF dataset derived from Grierson's "Linguistic Survey of India" from 1928
https://lsi.clld.org
Creative Commons Attribution 4.0 International

WALS Phonology correlations #19

PhyloStar closed this issue 3 years ago

PhyloStar commented 4 years ago

@lingulist suggested extracting WALS phonological features from the LSI vocabulary lists and correlating them with the corresponding WALS features. I think features 1A-4A, 8A, 9A, 10A, 11A, 12A, and 13A are worth investigating for our dataset. I have two questions here, @xrotwang.
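To make the proposed comparison concrete, here is a minimal sketch for feature 1A (consonant inventories). It is not the code used in this repository. The size bins are the ones I recall from WALS chapter 1 and should be checked against the chapter; the language IDs and counts are made up; and `consonant_inventory_class` and `agreement` are hypothetical helper names. Note also that a consonant count taken from a short word list will tend to undercount the full inventory.

```python
def consonant_inventory_class(n_consonants):
    """Map a consonant count to a WALS 1A size class.

    Assumed bins (verify against WALS chapter 1): Small up to 14,
    Moderately small 15-18, Average 19-25, Moderately large 26-33,
    Large 34 or more.
    """
    if n_consonants <= 14:
        return "Small"
    if n_consonants <= 18:
        return "Moderately small"
    if n_consonants <= 25:
        return "Average"
    if n_consonants <= 33:
        return "Moderately large"
    return "Large"


def agreement(inferred, reference):
    """Share of languages present in both mappings whose classes match."""
    shared = set(inferred) & set(reference)
    if not shared:
        return None
    hits = sum(inferred[lang] == reference[lang] for lang in shared)
    return hits / len(shared)


# Made-up consonant counts, as if extracted from word lists.
counts = {"lang_a": 22, "lang_b": 30, "lang_c": 12}
inferred = {lang: consonant_inventory_class(n) for lang, n in counts.items()}

# Made-up reference classes, as if read from WALS.
reference = {"lang_a": "Average", "lang_b": "Large", "lang_d": "Small"}

score = agreement(inferred, reference)
print(score)
```

Only languages present on both sides enter the score, so missing WALS coverage does not count as disagreement.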

LinguList commented 4 years ago

WALS is available in CLDF now, so it should be straightforward to use the API to get the data we need. I can have a look at it later. If we base the full workflow on CLDF, that would be great: we don't have to submit the data ourselves and can just point to specific WALS versions, etc.
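As a rough illustration of pulling the relevant features out of a CLDF StructureDataset, the sketch below reads a values table with the standard library only, rather than through the CLDF API mentioned above. It assumes the table uses the default CLDF column names `Language_ID`, `Parameter_ID`, and `Value`; the sample rows are made up and are not real WALS data, and `read_feature_values` is a hypothetical helper name.

```python
import csv
import io

# Feature IDs named in this thread: 1A-4A and 8A-13A.
PHONOLOGY_FEATURES = {
    "1A", "2A", "3A", "4A", "8A", "9A", "10A", "11A", "12A", "13A",
}


def read_feature_values(lines, features=PHONOLOGY_FEATURES):
    """Return {language_id: {parameter_id: value}} for the selected features.

    `lines` is any iterable of CSV lines, e.g. an open values.csv file.
    """
    values = {}
    for row in csv.DictReader(lines):
        if row["Parameter_ID"] in features:
            by_feature = values.setdefault(row["Language_ID"], {})
            by_feature[row["Parameter_ID"]] = row["Value"]
    return values


# Made-up sample rows in the assumed layout (not real WALS values).
SAMPLE = """ID,Language_ID,Parameter_ID,Value
1A-xxa,xxa,1A,4
81A-xxa,xxa,81A,1
2A-xxa,xxa,2A,2
"""

wals_values = read_feature_values(io.StringIO(SAMPLE))
print(wals_values)
```

For a real run, the in-memory sample would be replaced by an open file handle on the values table of a cloned WALS CLDF dataset.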

LinguList commented 4 years ago

@PhyloStar, please see this commit: https://github.com/lexibank/lsi/commit/21980601f2c605dc2f65a79e149847968f050ff5

@xrotwang added it to show how you can compare the features using CLDF data.

Usage:

$ cldfbench lsi.compare_inventories asjp_dataset/cldf/cldf-metadata.json phoible/cldf/Structuredataset.metadata.json

So you first need to clone phoible (cldf-datasets/phoible) and asjp (lexibank/asjp).

This is the recommended approach, as it means you only clone the datasets and don't have to convert any data inside a repository.

PhyloStar commented 3 years ago

I am able to run the code without any problems. This is great!