Glottolog_look_up_table
This is a small repository with an R script that cobbles together a large TSV sheet of every languoid in Glottolog (i.e. languages, dialects and families) along with various meta-data. Besides the "regular" Glottolog meta-data (Family, long/lat, Macroarea, Countries, etc.) there is also some extra information and some modifications; see the lists below.
Basic meta-data from Glottolog
- Longitude/Latitude
- Level (language/dialect/family)
- Macroarea (Australia, South America, North America, Eurasia, Africa, Papunesia)
- Name
- ISO 639-3
- Glottocode
- Parent-ID
- Top-genetic unit ID
- Path (from languoid to root of tree)
- Countries
Added meta-data
- AUTOTYP-area (based on this csv: https://raw.githubusercontent.com/autotyp/autotyp-data/master/data/Register.csv )
- Name_stripped (ASCII only and stripped of certain punctuation that some programs, for example SplitsTree, struggle with)
- Family_name
- Isolates have a family name and family ID (the name and glottocode of the language itself), and there is a separate column that distinguishes isolates from non-isolates ("Isolate")
- language_level_ID: dialects carry the ID of their language-level parent, and languages carry their own ID in the same column
- All dialects inherit meta-data from their language-level parents
- Family_color is a distinctive colour in HEX code for every family, including a separate one for each isolate
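The logic behind some of the added columns can be illustrated with a minimal sketch. This is not the repository's R script; it is a Python illustration on toy data. The glottocodes, languoid names, input column names (`parent_id`, `family_id`), the output column `Family_ID` and the exact set of characters removed by `strip_name` are all assumptions made for the example.

```python
import re
import unicodedata

# Toy languoids with made-up glottocodes (hypothetical data, not from Glottolog).
languoids = [
    {"glottocode": "fami0001", "name": "Famish", "level": "family", "parent_id": None, "family_id": None},
    {"glottocode": "lang0001", "name": "Exämple (North)", "level": "language", "parent_id": "fami0001", "family_id": "fami0001"},
    {"glottocode": "dial0001", "name": "Upper Example", "level": "dialect", "parent_id": "lang0001", "family_id": "fami0001"},
    {"glottocode": "dial0002", "name": "Upper Example Hills", "level": "dialect", "parent_id": "dial0001", "family_id": "fami0001"},
    {"glottocode": "isol0001", "name": "Isolang", "level": "language", "parent_id": None, "family_id": None},
    {"glottocode": "dial0003", "name": "Isolang Coast", "level": "dialect", "parent_id": "isol0001", "family_id": "isol0001"},
]


def strip_name(name):
    """Reduce a name to ASCII and drop punctuation (the character set kept is an assumption)."""
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^A-Za-z0-9 _-]", "", ascii_name)


def language_level_id(row, by_id):
    """Walk up from a dialect to its closest language-level ancestor; languages return themselves."""
    current = row
    while current is not None and current["level"] == "dialect":
        current = by_id.get(current["parent_id"])
    if current is not None and current["level"] == "language":
        return current["glottocode"]
    return None  # families have no language-level ID


def build_table(rows):
    by_id = {r["glottocode"]: r for r in rows}
    out = []
    for row in rows:
        lang_id = language_level_id(row, by_id)
        # Dialects take their family information from the language-level parent.
        source = by_id[lang_id] if lang_id else row
        # A language without a family is treated as an isolate.
        isolate = source["level"] == "language" and source["family_id"] is None
        # Isolates and top-level families act as their own family.
        family_id = source["family_id"] or source["glottocode"]
        out.append({
            "glottocode": row["glottocode"],
            "Name_stripped": strip_name(row["name"]),
            "language_level_ID": lang_id,
            "Family_ID": family_id,
            "Family_name": by_id[family_id]["name"],
            "Isolate": isolate,
        })
    return out


table = build_table(languoids)
for entry in table:
    print(entry)
```

In this sketch a nested dialect resolves to the language above it, and a dialect of an isolate ends up with the isolate's own name and glottocode as its family. Whether the R script handles nested dialects in exactly this way is not stated here, so treat that part as a guess.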
todo
How is x determined?
I'm just reshuffling Glottolog and AUTOTYP; I'm not making decisions about languages. If you want to know how some of these things are determined, see the list below: