Closed LinguList closed 1 week ago
Name | Words | Proportion |
---|---|---|
AmamiAsama | 252 | 0.992126 |
AmamiYamatohama | 252 | 0.992126 |
AmamiYoron | 250 | 0.984252 |
Azeri | 252 | 0.992126 |
Baoan | 250 | 0.984252 |
BarabaTatar | 210 | 0.826772 |
Bashkir | 254 | 1 |
Buriat | 254 | 1 |
Chuvash | 248 | 0.976378 |
CodexCumanicus | 214 | 0.84252 |
CrimeanTatar | 248 | 0.976378 |
Dagur | 251 | 0.988189 |
Dolgan | 235 | 0.925197 |
Dongxian | 246 | 0.968504 |
EasternEvenki | 100 | 0.393701 |
Even | 253 | 0.996063 |
EvenkiKamnigan | 209 | 0.822835 |
Fukuoka | 230 | 0.905512 |
Gagauz | 250 | 0.984252 |
Gangwon | 253 | 0.996063 |
Gyeonggi | 254 | 1 |
Hachijo | 230 | 0.905512 |
Hezhe | 244 | 0.96063 |
Huzhu | 250 | 0.984252 |
Hwanghae | 254 | 1 |
Japanese | 254 | 1 |
Jeju | 252 | 0.992126 |
Jurchen | 180 | 0.708661 |
Kagoshima | 238 | 0.937008 |
Kalmyck | 253 | 0.996063 |
Kamnigan | 249 | 0.980315 |
Kangjia | 239 | 0.940945 |
KarachayBalkar | 250 | 0.984252 |
Karaim | 252 | 0.992126 |
KaraKalpak | 250 | 0.984252 |
Kazakh | 250 | 0.984252 |
KazanTatar | 252 | 0.992126 |
Khakas | 252 | 0.992126 |
Khalaj | 228 | 0.897638 |
Khalkha | 253 | 0.996063 |
Kirghiz | 252 | 0.992126 |
Korean | 245 | 0.964567 |
Koshikiislands | 236 | 0.929134 |
Kumamoto | 227 | 0.893701 |
Kumyk | 251 | 0.988189 |
KurUrmi | 251 | 0.988189 |
LateMiddleKorean | 251 | 0.988189 |
Manchu | 251 | 0.988189 |
MiddleChulym | 237 | 0.933071 |
MiddleMongolianMuqaddimataladab | 216 | 0.850394 |
MiddleMongolianSecretHistory | 196 | 0.771654 |
Minhe | 241 | 0.948819 |
MiyakoIrabu | 249 | 0.980315 |
Moghol | 152 | 0.598425 |
NanaiBikin | 254 | 1 |
NanaiMiddleAmur | 253 | 0.996063 |
Negidal | 250 | 0.984252 |
Nogai | 252 | 0.992126 |
NorthAltai | 242 | 0.952756 |
NorthernChungcheong | 253 | 0.996063 |
NorthernEvenkiTura | 247 | 0.972441 |
NorthernEvenkiTutonchany | 217 | 0.854331 |
NorthernGyeongsang | 253 | 0.996063 |
NorthernHamgyong | 254 | 1 |
NorthernJeolla | 253 | 0.996063 |
NorthernPyongan | 254 | 1 |
Oirat | 248 | 0.976378 |
OkinawaShuri | 250 | 0.984252 |
OkinawaYonamine | 248 | 0.976378 |
OldJapanese | 250 | 0.984252 |
OldTurkic | 232 | 0.913386 |
Oroch | 253 | 0.996063 |
Orok | 254 | 1 |
Oroqen | 249 | 0.980315 |
Salar | 226 | 0.889764 |
ShiraYughur | 252 | 0.992126 |
Shor | 249 | 0.980315 |
Solon | 252 | 0.992126 |
SouthAltai | 249 | 0.980315 |
SouthernChungcheong | 253 | 0.996063 |
SouthernEvenkiChiringda | 204 | 0.80315 |
SouthernEvenkiVershinaTuturyBaikal | 54 | 0.212598 |
SouthernGyeongsang | 253 | 0.996063 |
SouthernHamgyong | 254 | 1 |
SouthernJeolla | 253 | 0.996063 |
SouthernPyongan | 254 | 1 |
StonyEvenkiPTPodkamennayaTunguska | 252 | 0.992126 |
Tofa | 246 | 0.968504 |
Turkish | 253 | 0.996063 |
Turkmen | 251 | 0.988189 |
Tuvan | 247 | 0.972441 |
Udihe | 253 | 0.996063 |
Ulcha | 254 | 1 |
Uyghur | 251 | 0.988189 |
Uzbek | 248 | 0.976378 |
WestYugur | 238 | 0.937008 |
Xibe | 250 | 0.984252 |
YaeyamaHatoma | 251 | 0.988189 |
YaeyamaIshigaki | 250 | 0.984252 |
Yakut | 250 | 0.984252 |
Yonaguni | 251 | 0.988189 |
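The language coverage looks good, as far as I can tell. But there are outliers, most notably SouthernEvenkiVershinaTuturyBaikal (54 words, 0.21), EasternEvenki (100, 0.39), and Moghol (152, 0.60).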
@tpellard, here is the code I used for this, with lingpy and tabulate (both available via pip):
from lingpy import Wordlist
from tabulate import tabulate

# Load the CLDF dataset as a LingPy wordlist.
wl = Wordlist.from_cldf(
    'cldf/cldf-metadata.json',
    columns=["language_id", "parameter_name", "value", "form", "segments", "cognacy"])
# One row per language: name, number of concepts covered, share of all concepts.
table = []
for language, forms in wl.coverage().items():
    table += [[language, forms, forms / wl.height]]
print(tabulate(table, headers=["Name", "Words", "Proportion"], tablefmt="pipe"))
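
To list the outliers directly instead of reading them off the table, something like the following might work. This is only a sketch: `low_coverage` and the 0.8 cutoff are hypothetical choices of mine, not part of lingpy, and the sample counts are copied from the table above so the snippet runs without the dataset. With the real data you would pass `wl.coverage()` and `wl.height` instead.

```python
def low_coverage(coverage, total, threshold=0.8):
    """Return (language, count, proportion) tuples for languages whose
    proportion of covered concepts is below `threshold`, lowest first."""
    rows = [
        (language, count, count / total)
        for language, count in coverage.items()
        if count / total < threshold
    ]
    return sorted(rows, key=lambda row: row[2])


# Sample counts copied from the table above (254 concepts in total).
# Assumption: with the real wordlist, use low_coverage(wl.coverage(), wl.height).
sample = {
    "Bashkir": 254,
    "Jurchen": 180,
    "MiddleMongolianSecretHistory": 196,
    "Moghol": 152,
    "EasternEvenki": 100,
    "SouthernEvenkiVershinaTuturyBaikal": 54,
}

outliers = low_coverage(sample, 254)
for language, count, proportion in outliers:
    print(f"{language}: {count} ({proportion:.2f})")
```

The cutoff is arbitrary; raising or lowering `threshold` changes which varieties are flagged.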