The data underlying the Concepticon is maintained in this repository. Released versions of this data are distributed as CLDF datasets, uploaded to Zenodo from the concepticon-cldf repository.
Here, you can find:
The conceptlists/ folder contains concept lists with links to IDs in concepticon.tsv. Lists are named after the first person who proposed them, the year of the reference publication from which we extracted them, and the number of concepts. These three parts are separated by a dash. Where two lists would otherwise have an identical name, we append alphabetical letters to distinguish them. Files must have a "GLOSS" column (some still have "ENGLISH" instead, but this needs to be changed). In addition, most (if not all) files have a "NUMBER" field indicating the number in the reference, which is also important for ordering the entries as given in the original source. Additional columns are more or less free to the user, but we tried to be consistent.
Some concept lists are based on sources that may change and thus require a mechanism for re-creation. In this case, there will be a directory named after the list, containing the relevant curation scripts.
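The naming convention and column requirements described above can be checked mechanically. The following is a minimal sketch, not part of the repository's tooling: the regular expression, the function names, and the assumption that lists are tab-separated files with a header row are illustrative only, and "Doe-2000-50a" is a made-up list name.

```python
import csv
import io
import re

# Assumed pattern: <proposer>-<year>-<number of concepts>, optionally followed
# by a single lowercase letter that disambiguates otherwise identical names.
LIST_NAME = re.compile(
    r"^(?P<proposer>.+)-(?P<year>\d{4})-(?P<count>\d+)(?P<suffix>[a-z]?)$"
)


def parse_list_name(name):
    """Split a concept list name into its three parts plus optional suffix."""
    match = LIST_NAME.match(name)
    if match is None:
        raise ValueError(f"not a valid concept list name: {name!r}")
    return {
        "proposer": match["proposer"],
        "year": int(match["year"]),
        "count": int(match["count"]),
        "suffix": match["suffix"],
    }


def check_columns(tsv_text):
    """Return a list of problems found in the header of a concept list."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    fields = reader.fieldnames or []
    problems = []
    if "GLOSS" not in fields:
        if "ENGLISH" in fields:
            problems.append("uses ENGLISH instead of GLOSS")
        else:
            problems.append("missing GLOSS column")
    if "NUMBER" not in fields:
        # NUMBER is present in most lists and preserves the source order.
        problems.append("no NUMBER column")
    return problems


if __name__ == "__main__":
    print(parse_list_name("Doe-2000-50a"))
    print(check_columns("ID\tNUMBER\tGLOSS\n1\t1\thand\n"))
```

Keeping the name check separate from the column check lets either one be run alone, for example when only file names are available.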
Concept lists may contain information about relations between concepts. If so, such relations must be stored as content of columns named LINKED|SOURCE|TARGET_CONCEPTS. The values for these columns must be lists of objects with an ID member, the value of which must be a concept identifier in the list. Edges in the graph described in LINKED_CONCEPTS are considered undirected, whereas edges in SOURCE|TARGET_CONCEPTS are considered directed, with the concepts specified in the edge objects identifying the SOURCE or TARGET, respectively, of the edge.
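To illustrate how the relation columns can be read as a graph, here is a minimal sketch. It assumes that each cell holds a JSON-serialized list of objects and that each row carries its own concept identifier in an ID column. Both are assumptions made for this example, and the function names and identifiers such as "L-1" are hypothetical.

```python
import json


def edges_from_row(row):
    """Return (undirected, directed) edge sets for one concept list row.

    Assumes each relation cell is a JSON list of objects with an "ID" member.
    """
    this = row["ID"]
    undirected = set()
    directed = set()
    # LINKED_CONCEPTS: edges without direction, stored as unordered pairs.
    for obj in json.loads(row.get("LINKED_CONCEPTS") or "[]"):
        undirected.add(frozenset((this, obj["ID"])))
    # TARGET_CONCEPTS: the listed concept is the target of the edge.
    for obj in json.loads(row.get("TARGET_CONCEPTS") or "[]"):
        directed.add((this, obj["ID"]))
    # SOURCE_CONCEPTS: the listed concept is the source of the edge.
    for obj in json.loads(row.get("SOURCE_CONCEPTS") or "[]"):
        directed.add((obj["ID"], this))
    return undirected, directed


def dangling_ids(rows):
    """Return referenced identifiers that are not concepts in the list."""
    known = {row["ID"] for row in rows}
    referenced = set()
    for row in rows:
        undirected, directed = edges_from_row(row)
        for pair in undirected:
            referenced.update(pair)
        for source, target in directed:
            referenced.update((source, target))
    return referenced - known


if __name__ == "__main__":
    rows = [
        {"ID": "L-1", "LINKED_CONCEPTS": '[{"ID": "L-2"}]',
         "TARGET_CONCEPTS": '[{"ID": "L-3"}]'},
        {"ID": "L-2", "SOURCE_CONCEPTS": '[{"ID": "L-1"}]'},
    ]
    print(edges_from_row(rows[0]))
    print(dangling_ids(rows))  # "L-3" is referenced but not in the list
```

Storing undirected edges as frozensets means that a link recorded on both of its endpoints counts only once.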
Before release 3.0, this repository contained metadata linked to Concepticon concept sets. With release 3.0, this data moved to a separate (though related) project, NoRaRe. For the curation and publication workflow of NoRaRe data, see https://github.com/concepticon
We try to release concepticon-data (as well as the CLDF dataset and the concepticon web app) regularly, at least once a year. Generally, new releases should only become more comprehensive, i.e. all data ever released should also be part of the newest release. Occasionally, though, we may have to correct an erratum, which may result in some data being removed or in changes to the identifiers of objects. So whenever a link to the web app breaks or a script using the concepticon-data API throws an error, you should consult the list of errata to see whether an error correction may be the reason for this behaviour.
pyconcepticon is a Python package for programmatic access to Concepticon data.