Open kostyfisik opened 8 years ago
It is more important for rule triggering. "будет найдёт минимум" triggers the rule, "будет найдет минимум" does not.
The current dictionary is focused on the consistent use of the letter "Ё". This allows the program to process texts more correctly. But there are many books in which the letter "Ё" is not used. There was even a version of the dictionary built to treat the letters "Ё" and "Е" as equals. But for the Russian language there is only one locale, "ru-ru". At the moment, there is no way to select a different dictionary.
Does it mean that there is a need to provide a ru_RU@yo locale? (This would correspond to the ISO standard [language[_territory][.codeset][@modifier]] notation.) This would make it possible to completely ignore "ё" in ru_RU and to leave optional "ё" support in ru_RU@yo.
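To make the proposed notation concrete, here is a minimal Python sketch that splits a name of the form language[_territory][.codeset][@modifier] into its parts. The `LocaleName` type and the `parse_locale` and `uses_yo` helpers are hypothetical names made up for illustration; they are not part of LanguageTool, LibreOffice, or OpenOffice, and the "yo" modifier is only the convention suggested in this comment.

```python
import re
from typing import NamedTuple, Optional

# Pattern for language[_territory][.codeset][@modifier].
# The character classes are a simplifying assumption for this sketch.
_LOCALE_RE = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:_(?P<territory>[A-Za-z]{2}))?"
    r"(?:\.(?P<codeset>[A-Za-z0-9-]+))?"
    r"(?:@(?P<modifier>[A-Za-z0-9]+))?$"
)


class LocaleName(NamedTuple):
    language: str
    territory: Optional[str]
    codeset: Optional[str]
    modifier: Optional[str]


def parse_locale(name: str) -> LocaleName:
    """Split a locale name into its four components (missing ones are None)."""
    match = _LOCALE_RE.match(name)
    if match is None:
        raise ValueError(f"not a valid locale name: {name!r}")
    return LocaleName(**match.groupdict())


def uses_yo(name: str) -> bool:
    """Hypothetical convention from this thread: only the @yo modifier opts in to ё."""
    return parse_locale(name).modifier == "yo"
```

Under this sketch, `parse_locale("ru_RU@yo")` yields the modifier `"yo"`, while a hyphenated name such as "ru-ru" is rejected because it does not follow the underscore-based notation.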
"Does it mean that there is a need to provide a ru_RU@yo locale?" Yes. But this locale is not present in LibreOffice or OpenOffice.
But the locale should be ru_RU@ie for "е" only, and ru_RU with "Ё".
@yakovru Probably the *Office devs do not have enough butthurt over sentences like "Ежик Алеша ел под елкой". Anyway, there is a "no yo" tradition. Is it hard to introduce a ru_RU@noyo locale in LT? (ru_RU@ie is a bad name because of the IE web browser, which was the default on Windows for many years.)
ru_RU@noyo is a bad idea because some words must be written with "Ё" anyway (family names, given names, etc.). ru_RU@e is best.
Is it hard to introduce ru_RU@e in LT?
I think it is possible.
I'll try to do it.
This does make sense, as words with misspelled "е\ё" are not corrected by the spellchecker. The current behaviour of the tagger is inconsistent: the spellchecker says that the word "мед" is correct, but the tagger knows nothing about this word.
The word "мед" is an abbreviation of the word "медицинский" ("medical"), as in "мед. изделия" ("medical devices").
The word "мёд" means "honey".
You are right, it can be "медицинский", as in "мед. изделия", but in that case "мед." will usually end with a dot. You can still find books where "мед" means "honey". Native speakers can read the words "ежик", "перепелка", "веселый", "лед" without any difficulty. I know this is arguable, but many authoritative sources contain the following guidelines:
Азбучная истина № 7 (Basic Truth No. 7). Use of the letter ё is mandatory in texts with consistently marked stress, in books for young children (including textbooks for primary-school pupils), and in textbooks for foreigners. In ordinary printed texts, ё is recommended where a word could otherwise be misread, where the correct pronunciation of a rare word needs to be indicated, or where a speech error needs to be prevented. The letter ё should also be written in proper names. In all other cases the use of ё is optional.
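The "possible misreading" case in the guideline above can be illustrated with a small Python sketch: given a word list, it finds the words whose "е" spelling collides with a different word in the same list. The function name `ambiguous_without_yo` and the sample list are assumptions made for illustration only, and the sketch handles lowercase "ё" only.

```python
def ambiguous_without_yo(words):
    """Return the ё-words whose е-spelling is also a separate word in the list.

    Illustrative only: a real dictionary would also need to handle
    capitalised forms and stress-dependent readings.
    """
    vocabulary = set(words)
    return sorted(
        word
        for word in vocabulary
        if "ё" in word and word.replace("ё", "е") in vocabulary
    )


sample = ["все", "всё", "ёжик", "небо", "нёбо"]
print(ambiguous_without_yo(sample))
```

For this sample, "всё" and "нёбо" are reported because "все" and "небо" are also in the list, while "ёжик" is not, since "ежик" does not collide with another entry.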
The first comment shows an obvious inconsistency in the current behavior:
Words with misspelled "е\ё" are not corrected by the spellchecker (the USSR typographical simplification allows this); however, such words are not recognized by part-of-speech analysis.
Screenshot update for the first comment (rules not working without "ё").
Now words with misspelled "е\ё" are added to the POS tag dictionary for correct tagging.
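As a rough illustration of the idea (not the actual LanguageTool dictionary build, which is not shown in this thread), the following Python sketch adds an "е"-spelled entry for every "ё" word form while keeping the original lemma and tag. The function names, the tuple layout (word form, lemma, tag), and the tag on the second sample entry are assumptions for this example.

```python
def yo_to_ye(word: str) -> str:
    """Replace ё/Ё with е/Е, as in texts that do not use the letter ё."""
    return word.replace("ё", "е").replace("Ё", "Е")


def add_ye_variants(entries):
    """Return entries plus an е-spelled copy of every form that contains ё.

    Each entry is (word_form, lemma, pos_tag); the lemma keeps its ё so that
    both spellings resolve to the same dictionary word.
    """
    seen = set(entries)
    result = list(entries)
    for form, lemma, tag in entries:
        variant = yo_to_ye(form)
        extra = (variant, lemma, tag)
        if variant != form and extra not in seen:
            seen.add(extra)
            result.append(extra)
    return result


entries = [
    ("ёжик", "ёжик", "NN:Masc:Sin:Nom"),
    ("дом", "дом", "NN:Masc:Sin:Nom"),  # illustrative tag, assumed for this sketch
]
for entry in add_ye_variants(entries):
    print(*entry)
```

With this approach "ежик" is tagged with the lemma "ёжик", so rules that depend on part-of-speech tags can fire on either spelling.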
@kostyfisik the user replied back. Here's the example sentence where the issue is apparently still happening:
"Уступить дорогу (не создавать помех)" - требование, означающее, что участник дорожного движения не должен начинать, возобновлять или продолжать движение, осуществлять какой-либо манёвр, если это может вынудить других участников движения, имеющих по отношению к нему преимущество, изменить направление движения или скорость.
@yakovru could you please take a look at this?
Both forms (манёвр + маневр) are valid according to the printed version of the dictionary, but (манёвр) is preferred. I'll check the other word forms included in the dictionary.
Tuesday, 3 September 2019, 12:43 +03:00, from Christopher Blum:
@kostyfisik the user replied back. Here's the example sentence where the issue is apparently still happening:
"Уступить дорогу (не создавать помех)" - требование, означающее, что участник дорожного движения не должен начинать, возобновлять или продолжать движение, осуществлять какой-либо манёвр, если это может вынудить других участников движения, имеющих по отношению к нему преимущество, изменить направление движения или скорость.
Words with misspelled "е\ё" are not corrected by the spellchecker (the USSR typographical simplification allows this); however, such words are not recognized by part-of-speech analysis.
ежик - - ёжик ёжик NN:Masc:Sin:Nom