errata-ai / vale

:pencil: A markup-aware linter for prose built with speed and extensibility in mind.
https://vale.sh
MIT License

Dictionaries: `FLAG stanza not supported` and `SFX stanza had 11 fields, expected 4 or 5` #372

Closed cpkio closed 3 years ago

cpkio commented 3 years ago

It seems Vale cannot load some Hunspell dictionaries yet?

dir
│   .vale.ini
│   testdoc.rst
│
├───dicts
│       ru_RU.aff
│       ru_RU.dic
│
└───styles
    └───base
            ru-dictionary.yml

ru-dictionary.yml

extends: spelling
message: "'%s' is a typo"
dicpath: dicts
dictionaries:
  - ru_RU

Command

$ vale testdoc.rst

Result

E201 Invalid value provided [N:/repo/vale-linter/styles/base/ru-dictionary.yml:1:1]:

   1* extends: spelling
   2  message: "'%s' is a typo"
   3  dicpath: dicts

FLAG stanza not yet supported

Execution stopped with code 1.

If the FLAG … stanza is deleted, the error changes:

E201 Invalid value provided [N:/repo/vale-linter/styles/base/ru-dictionary.yml:1:1]:

   1* extends: spelling
   2  message: "'%s' is a typo"
   3  dicpath: dicts

SFX stanza had 11 fields, expected 4 or 5

ru_RU.aff [1:20]

SET UTF-8
TRY оаитенрсвйлпкьыяудмзшбчгщюжцёхфэъАВСМКГПТЕИЛФНДОЭРЗЮЯБХЖШЦУЧЬЫЪЩЙЁ

FLAG long

SFX II Y 61
SFX II а и [гкхжчшщ]а #сорока --> сороки (е.ч.р.п. и мн.ч.и.п.+в.п.)
SFX II а ы [^гкхжчшщ]а #шуба --> шубы (е.ч.р.п. и мн.ч.и.п.+в.п.)
SFX II а е а #сорока --> сороке (е.ч.д.п.+п.п.)
SFX II а у а #сорока --> сороку (е.ч.в.п.)
SFX II а ам а #сорока --> сорокам (мн.ч.д.п.)
SFX II а ами а #сорока --> сороками (мн.ч.т.п.)
SFX II а ах а #сорока --> сороках (мн.ч.п.п.)
SFX II я и я #баня --> бани (е.ч.р.п.+д.п.(слова на -ия) и мн.ч.и.п.+в.п.)
SFX II я е [^и]я #баня --> бане (е.ч.д.п.+п.п.)
SFX II я ю я #баня --> баню (е.ч.в.п.)
SFX II я ей я #баня --> баней (е.ч.т1.п.)
SFX II я ею я #баня --> банею (е.ч.т2.п.)
SFX II я ям я #баня --> баням (мн.ч.д.п.)
SFX II я ями я #баня --> банями (мн.ч.т.п.)

jdkato commented 3 years ago

This is indeed a bug: the inline comments (the content after the #) are throwing the parser off.
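
To illustrate where the "11 fields" in the error message comes from, here is a minimal sketch in Python. It is not Vale's actual parser: `split_fields` and `strip_inline_comment` are hypothetical helpers, and the sketch assumes a rule line is simply split on whitespace. Under that assumption, the first rule line from the `.aff` excerpt above yields 11 fields, and dropping everything from the first `#`-prefixed field onward leaves the expected 5.

```python
# Hypothetical helpers for illustration only; not Vale's implementation.

def split_fields(line):
    """Split an .aff line on whitespace, as a naive parser might."""
    return line.split()


def strip_inline_comment(fields):
    """Drop every field from the first one that starts with '#'."""
    for i, field in enumerate(fields):
        if field.startswith("#"):
            return fields[:i]
    return fields


# First rule line from the ru_RU.aff excerpt in the report.
rule = "SFX II а и [гкхжчшщ]а #сорока --> сороки (е.ч.р.п. и мн.ч.и.п.+в.п.)"

raw = split_fields(rule)
clean = strip_inline_comment(raw)

print(len(raw))    # 11
print(len(clean))  # 5
print(clean)       # ['SFX', 'II', 'а', 'и', '[гкхжчшщ]а']
```

The trailing comment contributes six extra whitespace-separated tokens, which is how a five-field rule ends up being counted as eleven.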

cpkio commented 3 years ago

In 2.11.2, it still says `FLAG stanza not yet supported`.

jdkato commented 3 years ago

Yes, I haven't fixed this yet -- I'll update this issue when I do.

jdkato commented 3 years ago

This should be fixed now.

paddyroddy commented 3 years ago

I think this problem might still exist? I'm on v2.12.0.

jdkato commented 3 years ago

Yes, I actually think that I introduced the bug above while attempting to fix the OP's bug.

The actual issue here appears less straightforward than I initially thought: there appears to be no official specification for comment syntax, so it seems to be impossible to handle this perfectly.

The best I can do is attempt to copy the actual behavior of Hunspell, but they also get this wrong in some cases:

$ echo "verbosirregulars" | hunspell -d an_ES
verbos irregulars, verbos-irregulars, # VERBOS IRREGULARS
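
Because there is no specification to follow, any comment-handling rule is a guess. The sketch below, in Python, contrasts two possible heuristics; neither is taken from Vale or Hunspell source. `naive_strip` cuts at the first `#`-prefixed field, while `positional_strip` only looks for a comment after an assumed number of required fields (`REQUIRED_FIELDS = 5` for a rule line is an assumption made for this example). The input line, with `#` used as the flag itself, is hypothetical and only serves to show how the two heuristics diverge.

```python
# Assumed field count for an SFX/PFX rule line:
# directive, flag, strip, add, condition. Illustration only.
REQUIRED_FIELDS = 5


def naive_strip(fields):
    """Cut at the first field that starts with '#', wherever it appears."""
    for i, field in enumerate(fields):
        if field.startswith("#"):
            return fields[:i]
    return fields


def positional_strip(fields, required=REQUIRED_FIELDS):
    """Keep the first `required` fields untouched, then strip a comment
    from whatever follows them."""
    head, tail = fields[:required], fields[required:]
    return head + naive_strip(tail)


# Hypothetical rule line whose flag is the '#' character.
line = "SFX # 0 s . # plural"
fields = line.split()

print(naive_strip(fields))       # ['SFX']
print(positional_strip(fields))  # ['SFX', '#', '0', 's', '.']
```

The positional variant has its own blind spot: a four-field header line such as `SFX II Y 61` followed by a comment would need `required=4`, otherwise the `#` token is kept as a fifth field.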

jdkato commented 3 years ago

I think both cases reported here should be "working" now, although the implementation could use a general refactor.

paddyroddy commented 3 years ago

looks to be working for me now 👍