Closed laiyunfan closed 9 months ago
Nice, so we make an update with these new files plus the ones you sent before. I think I can do this in November. I prefer to update once, so if you tell me when you're ready, I'll schedule when to combine them, okay?
Yes, sure!
Hi Mattis, I have uploaded five more wordlists (the descriptions are in the update information). I think that I am now ready for cognacy review once we have included the new wordlists. I will tell you more about my paper plan later on.
Hi Yunfan. The way in which I would proceed now is:
Hi Mattis. I agree with the three points you listed. I believe that it is the most efficient way. Let's do it this way. Thank you very much!
Can you please list for me all the wordlists that are new, along with the names of the languages and their glottocodes?
Sure.
updates/2023-12-13/update.py:46: UserWarning: mBrongrzongKhroskyabs.tsv does not exist
Here's a first overview:
Doculect | Items | Coverage |
---|---|---|
Bantawa | 253 | 0.790625 |
Bawang Horpa | 282 | 0.88125 |
Bragbar Situ | 290 | 0.90625 |
Cogtse Situ | 268 | 0.8375 |
Geshiza | 267 | 0.834375 |
Guanyinqiao Khroskyabs | 290 | 0.90625 |
Japhug | 293 | 0.915625 |
Kangding Minyag | 236 | 0.7375 |
Kyomkyo Situ | 236 | 0.7375 |
Mazur Stau | 297 | 0.928125 |
Mbarkhams Situ | 241 | 0.753125 |
Ngyaltsu Zbu | 291 | 0.909375 |
Njorogs Khroskyabs | 294 | 0.91875 |
Old Burmese | 228 | 0.7125 |
Pengbuxi Minyag | 292 | 0.9125 |
Pubarong Queyu | 276 | 0.8625 |
Shimuliu Japhug | 261 | 0.815625 |
Situ in Jarongyiyu | 136 | 0.425 |
Siyuewu Khroskyabs | 289 | 0.903125 |
Stau | 237 | 0.740625 |
Tangut | 273 | 0.853125 |
Tshobdun | 216 | 0.675 |
Wobzi Khroskyabs | 290 | 0.90625 |
Xinlong Queyu | 236 | 0.7375 |
Yaoji Situ | 251 | 0.784375 |
Zhaba | 239 | 0.746875 |
Zlarong | 239 | 0.746875 |
Thanks! Sorry, it should be mBrongrdzongKhroskyabs.tsv (the name in the warning lacks a d before the z).
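A typo like this can be caught before the update runs. The sketch below is not the code in update.py. It is a minimal illustration, with a hypothetical helper named `suggest_filename`, of how a requested file name could be compared against the files that are actually present, using Python's standard `difflib` module. The second entry in `available` is a made-up placeholder.

```python
import difflib
import warnings


def suggest_filename(requested, available, cutoff=0.8):
    """Return `requested` if present, else the closest available name (or None).

    `suggest_filename` is a hypothetical helper, not part of the update script.
    """
    if requested in available:
        return requested
    matches = difflib.get_close_matches(requested, available, n=1, cutoff=cutoff)
    suggestion = matches[0] if matches else None
    message = f"{requested} does not exist"
    if suggestion:
        message += f" (did you mean {suggestion}?)"
    # Same warning category as the one reported in the thread.
    warnings.warn(message, UserWarning)
    return suggestion


# The first name comes from the thread; the second is only a placeholder.
available = ["mBrongrdzongKhroskyabs.tsv", "example_other_wordlist.tsv"]
print(suggest_filename("mBrongrzongKhroskyabs.tsv", available))
```

The cutoff of 0.8 is an arbitrary choice: high enough to ignore unrelated files, low enough to catch a single missing letter.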
@laiyunfan, I have updated this now. You will find a folder called edictor/update/2023-12-13, and in this folder there is the new wordlist, new_data.tsv. Please check it in edictor. If you think it is fine, I will update the database online.
Doculect | Items | Coverage |
---|---|---|
Bantawa | 253 | 0.790625 |
Bawang Horpa | 282 | 0.88125 |
Bragbar Situ | 290 | 0.90625 |
Cogtse Situ | 268 | 0.8375 |
Geshiza | 267 | 0.834375 |
Guanyinqiao Khroskyabs | 290 | 0.90625 |
Japhug | 293 | 0.915625 |
Kangding Minyag | 236 | 0.7375 |
Kyomkyo Situ | 236 | 0.7375 |
Mazur Stau | 297 | 0.928125 |
Mbarkhams Situ | 241 | 0.753125 |
mBrongrdzon Khroskyabs | 285 | 0.890625 |
Ngyaltsu Zbu | 291 | 0.909375 |
Njorogs Khroskyabs | 294 | 0.91875 |
Old Burmese | 228 | 0.7125 |
Pengbuxi Minyag | 292 | 0.9125 |
Pubarong Queyu | 276 | 0.8625 |
Shimuliu Japhug | 261 | 0.815625 |
Situ in Jarongyiyu | 136 | 0.425 |
Siyuewu Khroskyabs | 289 | 0.903125 |
Stau | 237 | 0.740625 |
Tangut | 273 | 0.853125 |
Tshobdun | 216 | 0.675 |
Wobzi Khroskyabs | 290 | 0.90625 |
Xinlong Queyu | 236 | 0.7375 |
Yaoji Situ | 251 | 0.784375 |
Zhaba | 239 | 0.746875 |
Zlarong | 239 | 0.746875 |
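The Coverage values in the table are consistent with Items divided by 320 (for example, 253 / 320 = 0.790625 and 136 / 320 = 0.425). The size of 320 is inferred from these numbers and is not stated in the thread. The following sketch shows how such a table could be computed, assuming that Items counts the distinct concepts attested for a doculect. The function name and the toy data are illustrative only.

```python
from collections import defaultdict


def coverage_table(rows, n_concepts):
    """Count distinct concepts per doculect and divide by the concept-list size.

    `rows` is an iterable of (doculect, concept) pairs.
    """
    concepts = defaultdict(set)
    for doculect, concept in rows:
        concepts[doculect].add(concept)
    return {
        doculect: (len(found), len(found) / n_concepts)
        for doculect, found in sorted(concepts.items())
    }


# Toy data, not taken from the database: a 4-concept list and two doculects.
rows = [
    ("A", "hand"), ("A", "foot"), ("A", "eye"),
    ("B", "hand"), ("B", "hand"),  # a second form for the same concept counts once
]
for doculect, (items, cov) in coverage_table(rows, n_concepts=4).items():
    print(f"{doculect} | {items} | {cov}")

# Consistency check against the table, assuming a 320-concept list.
print(253 / 320, 136 / 320)
```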
Thanks! I will check this as soon as possible.
I will make a PR, so you can also see the code (which you should be able to run):
pip install lexibase
pip install pyedictor
pip install lingpy
I don't think any other packages are needed.
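To confirm that the installation worked before running the update code, a quick check like the following may help. It is only a sketch: `missing_packages` is a hypothetical helper, and it assumes that each package is imported under the same name it is installed with.

```python
import importlib.util


def missing_packages(names):
    """Return the names for which no importable top-level module can be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumption: import names match the pip package names listed above.
required = ["lexibase", "pyedictor", "lingpy"]
missing = missing_packages(required)
if missing:
    print("Please install:", " ".join(missing))
else:
    print("All required packages are available.")
```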
sure, thank you!
@laiyunfan, I updated the code, can you check the link? You can find it here: https://github.com/lexibank/lairgyalrong/blob/master/edictor/link.md
Please copy-paste the link. Do not click on the field, that does not work.
Hi, Mattis, the link works for me. Can I review the cognate judgements with this now? (I've been busy marking students' term papers this week, so I am a bit behind schedule, but I'll start quickly)
Yes, you can!
Thanks!
Hi Mattis,
I just found some new wordlists and would like to include them in the database. These include:
Jackson Sun (1996) is a source that had never been disclosed (although the document says it can be freely used since 2000). A friend of mine from Taiwan found it among the dusty documents at Academia Sinica.
I will keep you updated when these are ready. It shouldn't take long.
Best,