-
It seems that while spaCy supports tokenization of text with diacritics, it doesn't lemmatize, morphologically analyze, or POS-tag such text correctly.
## How to reproduce the behaviour
```
imp…
```
-
Hi,
I'm glad that TNTSearch can use Russian morphology and match word variations.
But when I try to highlight the matched part, I get an empty string.
![image](https://user-images.githubusercontent…
-
Czech, Ukrainian, and Russian have this construction for _both A and B_: _jak A, tak i B_.
However, the annotation is inconsistent across the languages and within Czech.
This is also about general proble…
-
We could benefit from a new function that generates all possible word forms for a word. Right now word forms are not lemmatized within YoastSEO.js. This means that if the user performs, for instance, …
-
Александр Клюквин (alexander.klukvin at gmail) has created a new version of this dictionary. He has done a huge amount of work over several years. It might be reasonable to switch to his version. It's mor…
-
**Is your feature request related to a problem? Please describe.**
The base language of my project is Russian, so my glossary is Russian > other language.
But some terms are not shown if they are in differe…
-
I am attempting to get the plugin running on FreeBSD, and the command:
bundle exec rake redmine_elasticsearch:reindex_all RAILS_ENV=production
yields
```
Loading Rails environment for Resque
Rec…
```
-
One of the challenges in search is recall of an item with a common typing variant. These cases can be as simple as lower/upper case in most languages, accented characters, or more complex morphologic…
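
The simpler variants named here (letter case and accented characters) can be handled by normalizing both the indexed text and the query before comparison. The following is a minimal sketch of that idea using only the Python standard library; the function name `normalize_for_search` is hypothetical and not taken from the project under discussion, and morphological variants are out of its scope.

```python
import unicodedata


def normalize_for_search(text: str) -> str:
    """Fold case and strip combining marks so simple typing variants compare equal.

    Hypothetical helper for illustration only.
    """
    # NFKD splits an accented character into its base character
    # followed by one or more combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks (Unicode general category "Mn").
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # casefold() is a more aggressive lower() intended for caseless matching.
    return stripped.casefold()


if __name__ == "__main__":
    for word in ("Café", "ÉCOLE", "Straße"):
        print(word, "->", normalize_for_search(word))
```

One caveat with this approach: stripping every combining mark is too aggressive for some languages. In Russian, for example, it also maps `й` to `и` and `ё` to `е`, so a real implementation would likely need per-language exceptions.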
-
I suggest implementing an external morphology system in GD alongside Hunspell.
The main goal is to replace word stemming. Spelling suggestions are out of scope.
An external morphology system is just …
-
Hello!
I ran into this error while training a tagger on part of the Taiga corpus of Russian (~1 GB of text): ```"An error occurred during model training: Should encode value 65536 in o…