-
I now have a somewhat better understanding of the paradigm you are using. Tokenization performance seems much better and the code is much cleaner. Excellent work!
Regarding the prep…
-
Currently the application is only available in Portuguese and English (en-us). It would be great to find people willing to help with translations into other languages.
**Any translation is …
-
How to reproduce:
- go to the French or Spanish version
- in the filter facets, the names of files, languages, and extraction levels remain in English
-
A Sanskrit-Latin dictionary.
Short name: BOP.
Located in the stardict-sanskrit/sa-head/bopp folder.
-
Consider కొలము|कोलमु|kolamu. The root (prAtipadika) of this prathamA vibhakti ekavachana form is kola కొల. Speakers of other languages often look up the root, not the prathamA vibhakti form - so it …
-
The purpose of this issue is to document the conventions used in various dictionaries to represent Sanskrit words.
Exactly how to structure such documentation is unclear at the moment. So the…
-
I want to contribute by expanding the corpus of Indian languages. Does the corpus have to be old, that is, written long ago? I ask because I have some good corpus material.
-
https://drafts.csswg.org/css-text-3/#word-break-property
แและ·ตัวอย่าง·การเขียน·ภาษาไทย has too many แ characters at the start.
Also, are we sure about the word segmentation for these examples? …