digling / intelligibility

MIT License

New Data #4

Closed LinguList closed 10 months ago

LinguList commented 1 year ago

@justalingwist I added the etymological data in data/etyma.tsv and the LSR data we received in data/lsr.tsv, along with the scripts that create them. We have more than 500 items in the etymological data and 296 in the LSR data, where we have also marked the cases that are truly cognate (last column). Interestingly, some 53 cognates occur in my dataset (which need not be exhaustive, but is pretty big), and some 250 items (a bit fewer) do not occur there and may well be borrowings (I can easily check this later).
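
A minimal sketch of how counts like these could be checked, assuming the file has a header row and that the last column holds a flag such as `1` for true cognates. The column layout, the flag values, and the function name `count_cognates` are assumptions for illustration, not the actual format of data/lsr.tsv; the sample rows are made up.

```python
import csv
import io


def count_cognates(handle, true_values=("1", "true", "yes")):
    """Count rows whose last column marks them as cognate, and all other rows.

    Assumes a tab-separated file with one header row (hypothetical layout).
    """
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # skip the assumed header row
    cognate = other = 0
    for row in reader:
        if not row:
            continue  # ignore empty lines
        if row[-1].strip().lower() in true_values:
            cognate += 1
        else:
            other += 1
    return cognate, other


# Made-up sample standing in for the real file.
sample = "ID\tFORM\tCOGNATE\n1\tfoo\t1\n2\tbar\t0\n3\tbaz\t1\n"
cognate, other = count_cognates(io.StringIO(sample))
print(cognate, other)
```

To run it on the real data, one could pass an open file instead, for example `open("data/lsr.tsv", encoding="utf-8", newline="")`, after adjusting the flag values to whatever the last column actually contains.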