If you use these data, please cite:
Hantgan, Abbie and Babiker, Hiba and List, Johann-Mattis (2022): First steps towards the detection of contact layers in Bangime: a multi-disciplinary, computer-assisted approach [version 2; peer review: 2 approved]. Open Research Europe 2022, 2:10.
This dataset is licensed under a CC-BY-4.0 license
Available online at http://digling.org/links/bangime.html
The data in EDICTOR can be accessed from https://digling.org/links/bangime.html.
To run the analysis, make sure to install all requirements:
pip install -e ".[full]"
Also make sure to clone all repositories of Concepticon, Glottolog, and CLTS:
mkdir repos
cd repos
git clone https://github.com/glottolog/glottolog.git
git clone https://github.com/concepticon/concepticon-data.git
git clone https://github.com/cldf-clts/clts
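Before continuing, it can help to confirm that all three clones are in place. The following is a minimal sketch, not part of this repository: it assumes the clones live in `repos/` under the directory names that `git clone` creates by default for the URLs above, and the helper name `missing_repos` is hypothetical.

```python
from pathlib import Path

# Assumption: default directory names created by `git clone` for the three URLs above.
EXPECTED_REPOS = ("glottolog", "concepticon-data", "clts")


def missing_repos(base="repos"):
    """Return the expected clone directories that are absent under `base`."""
    base_path = Path(base)
    return [name for name in EXPECTED_REPOS if not (base_path / name).is_dir()]


if __name__ == "__main__":
    missing = missing_repos()
    if missing:
        print("Missing clones in repos/:", ", ".join(missing))
    else:
        print("All three reference catalogs are present.")
```

The check only looks for directories; it does not verify that a clone is complete or that the release matching the versions passed to `cldfbench` is checked out.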
The data is annotated with the help of the EDICTOR tool, where you can also inspect it at http://digling.org/edictor/?remote_dbase=bangime&file=bangime.
To download the most recent version of the data programmatically, type:
cldfbench download lexibank_baf2.py
In order to convert the updated data to CLDF, run:
cldfbench lexibank.makecldf lexibank_baf2.py --concepticon-version=v3.2.0 --glottolog-version=v5.0 --clts-version=v2.3.0
In order to run the cognate and borrowing detection analysis, run:
cldfbench baf2.borrowing
This analysis will create a file wordlist.tsv in the folder analysis. Note that the analysis itself was run only once, at the beginning of our investigation, and its output was later updated manually. As a result, the results of a fresh run necessarily differ from the manually updated version.
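To take a first look at the generated wordlist without extra dependencies, a sketch along the following lines may be useful. It assumes only that the file is tab-separated with a header row, and that lines starting with `#` are comments to be skipped. The column names are not assumed, and the function name `read_wordlist` is hypothetical.

```python
import csv
from pathlib import Path


def read_wordlist(path):
    """Read a tab-separated wordlist and return (header, rows as dicts)."""
    with Path(path).open(encoding="utf-8", newline="") as handle:
        # Assumption: "#" lines are comments and blank lines carry no data.
        lines = [line for line in handle if line.strip() and not line.startswith("#")]
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader, [])
    rows = [dict(zip(header, row)) for row in reader]
    return header, rows


if __name__ == "__main__":
    wordlist = Path("analysis") / "wordlist.tsv"
    if wordlist.exists():
        header, rows = read_wordlist(wordlist)
        print("columns:", header)
        print("rows:", len(rows))
```

Printing the header first shows which columns the analysis actually wrote, so nothing about their names has to be guessed.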
To analyze the data, you can first compute average statistics of borrowed items:
cldfbench baf2.average
This will create a file relations.md in the folder analysis.
To count shared borrowing candidates, type:
cldfbench baf2.count
This will create a file analysis/patterns.tsv.
To yield the same for all language subgroups in the sample, type:
cldfbench baf2.count-subgroup
This will write the patterns to the file analysis/patterns-subgroups.tsv.
To yield the same for all languages in the sample, type:
cldfbench baf2.count-language
This will write the patterns to the file analysis/patterns-subgroups.tsv.
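The analysis commands above can also be chained from a small script. The following is a sketch only: the subcommand names are copied from this README, while the wrapper functions `build_commands` and `run_all` are hypothetical and not part of the dataset code. It defaults to a dry run that only prints the commands.

```python
import subprocess

# Subcommands as listed in this README, in the order they are described.
ANALYSIS_STEPS = [
    "baf2.borrowing",
    "baf2.average",
    "baf2.count",
    "baf2.count-subgroup",
    "baf2.count-language",
]


def build_commands(steps=ANALYSIS_STEPS):
    """Turn each subcommand name into an argument list for subprocess."""
    return [["cldfbench", step] for step in steps]


def run_all(dry_run=True):
    """Print the commands (dry run) or execute them, stopping at the first failure."""
    commands = build_commands()
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    run_all(dry_run=True)
```

The dry run is the default because the automatic borrowing analysis produces results that differ from the manually updated version, so it seems safer to execute the steps only deliberately, with `run_all(dry_run=False)`.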
Name | GitHub user | Description | Role |
---|---|---|---|
Abbie Hantgan | IndianaTones | Data collection, orthography | Author |
Hiba Babiker | | Data collection, orthography | Author |
Johann-Mattis List | @LinguList | maintainer | Author, Editor |
The following CLDF datasets are available in cldf: