brown-uk / dict_uk

Project to generate POS tag dictionary for Ukrainian language
GNU General Public License v3.0

Wrong entry in th_uk_UA.dat #267

Closed by mikekaganski 3 years ago

mikekaganski commented 3 years ago

https://gerrit.libreoffice.org/c/dictionaries/+/114308 merged this project's OOo/LO data into LibreOffice, but the build then started to fail with:

Unable to read count from "відданий державі|державницький|здатний державно мислити|з державним мисленням" input.
make[1]: *** [/tinderbox/buildslave/source/libo-master/solenv/gbuild/Dictionary.mk:31: /tinderbox/buildslave/build/workdir/ThesaurusIndexTarget/dictionaries/uk_UA/th_uk_UA.idx] Error 99

The corresponding line is https://github.com/brown-uk/dict_uk/blob/master/distr/openoffice.org/thesaurus/th_uk_UA.dat#L5181, and it is obviously wrong: the previous entry is дезамінуючий|1 (see the count 1), followed by |дезамінаторний|здатний дезамінувати|дезамінатор, and thus the next line should be a new entry with a number. Likely this should be two lines:

державнодумаючий|1
|державно орієнтований|відданий державі|державницький|здатний державно мислити|з державним мисленням
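A check along these lines could catch this kind of breakage before the data reaches a downstream build. This is only a rough sketch, not the indexer LibreOffice actually runs: `find_bad_entries` is a hypothetical helper, and it assumes the layout described in this issue (the first line of the file names the encoding, then each entry is a `word|count` header followed by `count` meaning lines).

```python
def find_bad_entries(lines):
    """Return (line_number, text) pairs for entry headers without a valid count.

    Assumed layout (not taken from the real indexer): lines[0] is the
    encoding line; every entry is a `word|count` header followed by
    exactly `count` meaning lines.
    """
    bad = []
    i = 1  # skip the encoding line
    while i < len(lines):
        header = lines[i]
        # The count is whatever follows the last '|' on the header line.
        word, sep, count_text = header.rpartition("|")
        count_text = count_text.strip()
        if not sep or not word or not count_text.isdecimal():
            bad.append((i + 1, header))  # report 1-based line numbers
            i += 1
            continue
        # Jump over the header and its meaning lines to the next header.
        i += 1 + int(count_text)
    return bad
```

Reading the file with something like `open(path, encoding="utf-8").read().splitlines()` and passing the result in should flag the line quoted in the error above, since its last field is a phrase rather than a number.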

mikekaganski commented 3 years ago

See https://git.libreoffice.org/dictionaries/+/cbda6f487f9b760acb906b8e1280fe009b0a3461%5E%21/#F0 (the band-aid fix).

arysin commented 3 years ago

Thank you @mikekaganski, I've pushed a fix for those lines.