Closed by LinguList 2 years ago
I had a look at the possible lists:
@LinguList Would it make sense if I created a new list of available concepts in the current Concepticon version and added a tag column? For example:

ID | CONCEPT | TAG |
---|---|---|
18 | EARLOBE | bodypart |
837 | BLUE | color |
3749 | HATE (LOATHING) | emotion |
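
To make the proposal concrete, here is a minimal sketch of how such a tagged list could be read and grouped by tag. It assumes the list is stored as a tab-separated file with the three columns from the example; the file layout and the helper name `concepts_by_tag` are hypothetical and not part of Concepticon.

```python
import csv
import io
from collections import defaultdict

# The three example rows as tab-separated text.
# Assumption: the list is stored as TSV with ID, CONCEPT, and TAG columns.
TSV = (
    "ID\tCONCEPT\tTAG\n"
    "18\tEARLOBE\tbodypart\n"
    "837\tBLUE\tcolor\n"
    "3749\tHATE (LOATHING)\temotion\n"
)


def concepts_by_tag(handle):
    """Group (ID, CONCEPT) pairs by their TAG value."""
    groups = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        groups[row["TAG"]].append((row["ID"], row["CONCEPT"]))
    return dict(groups)


by_tag = concepts_by_tag(io.StringIO(TSV))
print(by_tag["color"])
```

Keeping the tag in its own column means the list can be filtered by semantic field without touching the concept IDs.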
In fact, why not? This is useful metadata. What we can also do, if you first present this list as a blog post, is compute COVERAGE against our current lexibank data collection. I'd show you how to do so in due time; it means that for each concept you would have a direct overview of the number of languages and language families, and of the state of the data (transcription, only orthographic form, etc.).
This would be an important step for the analysis anyway. You'd also learn how to run CL Toolkit on all of the lexibank data (which is a bit time-consuming, but useful), and we'd have a closer integration with pylexibank that we could later use to filter the data we want to use!
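
As a rough illustration of the per-concept overview described here, the following sketch counts languages, families, and transcribed entries for each concept. It uses plain Python on invented toy records rather than the cltoolkit API; the record fields and the function `concept_coverage` are assumptions made for this example only.

```python
from collections import defaultdict

# Invented toy records; real data would come from the lexibank collection.
# Assumption: an empty "segments" value means only an orthographic form exists.
FORMS = [
    {"language": "German", "family": "Indo-European",
     "concept": "BLUE", "form": "blau", "segments": "b l au"},
    {"language": "English", "family": "Indo-European",
     "concept": "BLUE", "form": "blue", "segments": ""},
    {"language": "Mandarin", "family": "Sino-Tibetan",
     "concept": "BLUE", "form": "lan", "segments": "l a n"},
    {"language": "German", "family": "Indo-European",
     "concept": "EARLOBE", "form": "Ohrläppchen", "segments": ""},
]


def concept_coverage(forms):
    """Count languages, families, and transcribed languages per concept."""
    languages = defaultdict(set)
    families = defaultdict(set)
    transcribed = defaultdict(set)
    for entry in forms:
        concept = entry["concept"]
        languages[concept].add(entry["language"])
        families[concept].add(entry["family"])
        if entry["segments"]:
            transcribed[concept].add(entry["language"])
    return {
        concept: {
            "languages": len(languages[concept]),
            "families": len(families[concept]),
            "transcribed": len(transcribed[concept]),
        }
        for concept in languages
    }


coverage = concept_coverage(FORMS)
print(coverage["BLUE"])
```

The same counts could presumably be collected while iterating over the real data; only the source of the records would change.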
Cool! That's exactly what I was thinking of. I'll start working on the list and get back to you when it is ready so that we can test the CL Toolkit part.
I suggest the following concept lists for testing:
The coverage statistics should count, for all languages and language families, how many gaps we find in the data. This should be straightforward to do in cltoolkit.
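
A minimal sketch of such a gap count, again in plain Python on invented toy data rather than with cltoolkit itself. It assumes that a gap for a language is a concept from the list with no form in that language, and that a gap for a family is a concept with no form in any of its languages; both definitions and the function `count_gaps` are assumptions for illustration.

```python
# Concept list to check; the glosses are taken from the example table.
CONCEPTS = ["EARLOBE", "BLUE", "HATE (LOATHING)"]

# Invented toy attestations as (language, family, concept) triples.
ATTESTED = [
    ("German", "Indo-European", "EARLOBE"),
    ("German", "Indo-European", "BLUE"),
    ("English", "Indo-European", "BLUE"),
    ("English", "Indo-European", "HATE (LOATHING)"),
    ("Mandarin", "Sino-Tibetan", "BLUE"),
]


def count_gaps(concepts, attested):
    """Return the number of missing concepts per language and per family."""
    wanted = set(concepts)
    by_language = {}
    by_family = {}
    for language, family, concept in attested:
        by_language.setdefault(language, set()).add(concept)
        by_family.setdefault(family, set()).add(concept)
    # A gap is a wanted concept that was never seen for the language or family.
    language_gaps = {lang: len(wanted - seen) for lang, seen in by_language.items()}
    family_gaps = {fam: len(wanted - seen) for fam, seen in by_family.items()}
    return language_gaps, family_gaps


language_gaps, family_gaps = count_gaps(CONCEPTS, ATTESTED)
print(language_gaps)
print(family_gaps)
```

Note that languages with no attested concept at all do not appear in the toy input, so they would need to be added from a separate language list to be counted.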