Closed: qixils closed 3 years ago
Thanks for the detailed report, code and table that made it easy to analyze!
This happens because the translation table (e.g., the one for "dag") has "Translations" in the header where the sense usually goes. I think the simplest solution is to check whether the sense taken from the table header is "Translations" and, if so, make it empty, since no sense description is actually available in those cases.
I already implemented this fix and it seems to work, at least for "dag". The pre-extracted data should have this fixed tomorrow (unless something goes wrong).
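For readers following along, the check described above could look roughly like the sketch below. This is not the actual wiktextract code; the function name `normalize_translation_sense` and the `PLACEHOLDER_HEADERS` set are hypothetical names made up for illustration, and the only behavior taken from this thread is "if the header is just "Translations", use an empty sense".

```python
from typing import Optional

# Hypothetical: header texts that carry no real sense description.
PLACEHOLDER_HEADERS = {"Translations"}


def normalize_translation_sense(header: Optional[str]) -> str:
    """Return the sense taken from a translation table header.

    If the header is missing or is only the generic placeholder
    ("Translations"), return an empty string instead.
    """
    text = (header or "").strip()
    if text in PLACEHOLDER_HEADERS:
        return ""
    return text
```

An exact match is used here rather than a substring test, so that a real sense that merely contains the word "Translations" is left alone.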
BTW, there will also be a new download link for the actual raw output of wiktextract (wiktwords) at the bottom of https://kaikki.org/dictionary/ starting tomorrow.
it's no problem at all, thanks for the quick fix :)
In the kaikki.org all-language dump dated 2021-05-08, over 2,700 translation senses contain only the text "Translations" instead of an actual sense/definition or a null value.
Code used to create a table of all instances:
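A minimal sketch of such a scan, not the original script. It assumes the dump is JSON Lines (one entry per line) and that each translation is a dict with a `sense` key, found either in a top-level `translations` list or inside each item of `senses`; that layout and the function name `count_placeholder_senses` are assumptions, so adjust them to the file you actually have.

```python
import json
from collections import Counter
from typing import Iterable


def count_placeholder_senses(
    lines: Iterable[str], placeholder: str = "Translations"
) -> Counter:
    """Count, per (language, word), translations whose sense is the placeholder."""
    hits: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)
        # Assumed layout: translations may sit on the entry itself
        # or on each of its senses.
        groups = [entry.get("translations", [])]
        groups += [s.get("translations", []) for s in entry.get("senses", [])]
        for translations in groups:
            for tr in translations:
                if tr.get("sense") == placeholder:
                    hits[(entry.get("lang"), entry.get("word"))] += 1
    return hits
```

Passing an open file object as `lines` streams the dump line by line, so the whole file never has to fit in memory; `sum(hits.values())` then gives the total number of affected translations.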
Truncated table of various examples