laurestine / needandambiguity

A repository for a project relating communicative need to ambiguity in semantic domains.

Use the dedicated APIs rather than hand-curated CSV files to access data in CLICS and Concepticon #1

Open LinguList opened 2 years ago

LinguList commented 2 years ago

I read your paper with great interest and am happy that you shared the code online. However, you may not be aware that we have dedicated Python APIs that you can use to access data in Concepticon and, by now, also in CLICS. Using them avoids the problems you describe in the paper and makes the code more transparent, since you can point to the specific versions of the Concepticon project or of the individual CLICS datasets that you used in your study.

To access CLICS, please check a recent blog post that explains how to load the original wordlist data. It also emphasizes that CLICS is not a database but an aggregation of standardized datasets, so you should access the original datasets that we curated rather than data from CLICS itself (see also our most recent study here).
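
As a rough illustration of working from the standardized wordlists rather than from an aggregated export, the sketch below computes colexifications from two small tables shaped like the `forms.csv` and `parameters.csv` files of a CLDF wordlist. The column names follow the CLDF wordlist convention as I understand it. The rows, language names, and the helper names `read_rows` and `colexifications` are made up for illustration, and the Concepticon IDs have not been checked against a release. This is not the procedure from the blog post.

```python
import csv
import io
from collections import defaultdict

# Made-up sample data in the shape of a CLDF parameters.csv (concepts).
PARAMETERS_CSV = """ID,Name,Concepticon_ID,Concepticon_Gloss
1_hand,hand,1277,HAND
2_arm,arm,1673,ARM
3_tree,tree,906,TREE
"""

# Made-up sample data in the shape of a CLDF forms.csv (word forms).
FORMS_CSV = """ID,Language_ID,Parameter_ID,Form
f1,langA,1_hand,ka
f2,langA,2_arm,ka
f3,langA,3_tree,mu
f4,langB,1_hand,ti
f5,langB,2_arm,po
f6,langB,3_tree,ti
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def colexifications(forms, parameters):
    """Map each pair of concept glosses to the languages that colexify it."""
    gloss = {p["ID"]: p["Concepticon_Gloss"] for p in parameters}

    # Collect the concepts expressed by each identical form in each language.
    by_form = defaultdict(set)
    for row in forms:
        key = (row["Language_ID"], row["Form"])
        by_form[key].add(gloss[row["Parameter_ID"]])

    # Every pair of concepts sharing a form in one language is a colexification.
    result = defaultdict(set)
    for (language, _form), glosses in by_form.items():
        ordered = sorted(glosses)
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                result[(first, second)].add(language)
    return dict(result)


COLEX = colexifications(read_rows(FORMS_CSV), read_rows(PARAMETERS_CSV))
```

With real data, the two strings would be replaced by the files of a specific, versioned dataset, which is what makes the analysis reproducible.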

To access Concepticon, just use pyconcepticon. This will probably make your life much easier, as you can also create your CSV file directly from the Concepticon data. Let me know if you have further questions about this API. We regularly present our tools on our blog at https://calc.hypotheses.org, so I recommend having a look there, or otherwise asking us on GitHub (e.g., https://github.com/clics/clics3 or https://github.com/concepticon/concepticon-data).
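
A minimal sketch of generating such a CSV file could look like the following. It assumes a local clone of concepticon-data and that pyconcepticon exposes `Concepticon(path).conceptsets` with `id`, `gloss`, and `semanticfield` attributes. That is my understanding of the API, so please check it against the installed version. The function names, the output column names, and the path in the usage comment are hypothetical.

```python
import csv


def load_conceptsets(repos_path):
    """Yield (ID, gloss, semantic field) for each concept set.

    Assumes `repos_path` points to a local clone of concepticon-data.
    """
    # Imported lazily so the CSV helper below works without pyconcepticon.
    from pyconcepticon import Concepticon

    api = Concepticon(repos_path)
    for conceptset in api.conceptsets.values():
        yield conceptset.id, conceptset.gloss, conceptset.semanticfield


def write_concept_csv(rows, handle):
    """Write concept rows to an open text handle and return the row count."""
    writer = csv.writer(handle, lineterminator="\n")
    # Hypothetical header names, chosen for illustration only.
    writer.writerow(["CONCEPTICON_ID", "CONCEPTICON_GLOSS", "SEMANTICFIELD"])
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count


# Usage (hypothetical path):
# with open("concepts.csv", "w", newline="", encoding="utf-8") as handle:
#     write_concept_csv(load_conceptsets("path/to/concepticon-data"), handle)
```

Checking out a tagged release of concepticon-data before running this would tie the generated file to one specific Concepticon version.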

laurestine commented 2 years ago

These are very good to know about. I'll make sure to remember them if I return to this project, and pass them along to anyone else who takes it up.