concepticon / concepticon-data

The curation repository for the data behind Concepticon.
https://concepticon.clld.org

Urban-2018-225 #750

Open Kristina-Pianykh opened 4 years ago

Kristina-Pianykh commented 4 years ago

This is another basic wordlist, based on the Leipzig-Jakarta list and Swadesh's 100- and 200-item lists. What makes it interesting is its semantic organization for cognate searches. The problem is that it uses curly brackets, indentation, and empty rows to describe associations between lexemes. If this list is to be added to Concepticon, we need to decide how these representations should be handled.

The PDF can be found here.

LinguList commented 4 years ago

Nice spot. I think this can easily be represented with numerical indices: all concepts that share the first bracket get the index "1", all concepts under the next bracket get "2", etc. We have a similar case with Wilkinson's 1996 list, if I remember correctly.
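A minimal sketch of that indexing idea, in Python. The bracket groups and glosses below are made up for illustration and are not taken from Urban's list; `index_brackets` is a hypothetical helper, not part of any Concepticon tooling.

```python
from collections import defaultdict

# Hypothetical input: each inner list holds the glosses joined by one curly
# bracket in the source, in the order the brackets appear.
bracket_groups = [
    ["SUN", "DAY"],
    ["MOON", "MONTH"],
    ["DAY", "LIGHT"],
]


def index_brackets(groups):
    """Map each concept to the 1-based numbers of the brackets it belongs to."""
    indices = defaultdict(list)
    for number, group in enumerate(groups, start=1):
        for concept in group:
            indices[concept].append(number)
    return dict(indices)


concept_indices = index_brackets(bracket_groups)
for concept, numbers in concept_indices.items():
    print(concept, numbers)
```

Keeping a list of numbers per concept means a gloss that sits under more than one bracket needs no special handling.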

It's a pity that Urban did not offer links to Concepticon, given that he should know of it as one of the founding people of CLICS.

LinguList commented 4 years ago

Ah, if a concept has two brackets, just separate the indices by a space: "1 2". This can also be annotated in metadata.json, stating that this column, which you could call "colexifications", contains a "list" and that the separator is a space. @chrzyki can help with the syntax here.
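As a rough sketch of what this could look like, assuming the list's metadata follows CSVW conventions, where a column description may carry a `separator` property. The column name `COLEXIFICATIONS` and the `integer` datatype are assumptions for illustration, and `serialize` and `parse` are hypothetical helpers; the actual syntax used in the repository may differ.

```python
import json

# Hypothetical CSVW-style column description: the cell holds a list of
# integers, and the items are separated by a single space.
column = {
    "name": "COLEXIFICATIONS",
    "datatype": "integer",
    "separator": " ",
}


def serialize(indices):
    """Join bracket indices into one cell value, e.g. [1, 2] -> "1 2"."""
    return column["separator"].join(str(i) for i in indices)


def parse(cell):
    """Split a cell value back into a list of integers; empty cell -> []."""
    return [int(part) for part in cell.split(column["separator"]) if part]


print(json.dumps(column, indent=2))
print(serialize([1, 2]))
print(parse("1 2"))
```

A concept under a single bracket serializes to a plain "1", so the column reads the same whether or not the separator is ever needed.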

chrzyki commented 4 years ago

Is this something we want to have in 2.3? I'll update the milestone accordingly.

LinguList commented 4 years ago

Depends. The number of lists in 2.3 should be divisible by 5.

LinguList commented 3 years ago

The text is here: http://kmsi.uni.wroc.pl/upload/files/11_Urban%281%29.pdf