Open anikitin opened 9 months ago
This answer is valid here.
@jaumeortola, thanks for your suggestion. I added the following rule to disambiguation.xml:
<rule id="IGNORE_SNAKE_CASE" name="ignore words in snake_case">
<pattern>
<token regexp="yes">\p{Ll}+_\p{Ll}.*</token>
</pattern>
<disambig action="ignore_spelling"/>
</rule>
It mostly works, but I found a strange case where it fails to ignore a snake_case word:
"See the description of "ivr_pin" parameter" --> "Possible spelling mistake found."
at the same time:
"See the description of "new_pin" parameter" --> no matches.
(BTW, "ivrPin" in camelCase is correctly ignored by another rule you suggested)
Any ideas why it is happening?
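If you are using English, the problem is the tokenization: the English tokenizer splits the word at the underscore, so "ivr_pin" arrives as three tokens ("ivr", "_", "pin") and a single-token pattern never matches it. "new_pin" presumably passes only because "new" and "pin" are both dictionary words.
You will need a rule like this one: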
<rule id="IGNORE_SNAKE_CASE" name="ignore words in snake_case">
<pattern>
<token regexp="yes">\p{Ll}+</token>
<token spacebefore="no">_</token>
<token regexp="yes" spacebefore="no">\p{Ll}.*</token>
</pattern>
<disambig action="ignore_spelling"/>
</rule>
This rule should also work for snake_case_with_multiple_underscores.
Rocket science! Thanks a lot @jaumeortola !
Another case I am currently struggling with is ignoring an entire word when it is in backquotes. That helps to avoid false positives on attributes like "ivr". Would you be so kind as to recommend how to define the pattern for this case? Should it look like this:
<token spacebefore="no">`</token>
<token regexp="yes">\p{Ll}+</token>
<token spacebefore="no">`</token>
Maybe this:
<token>`</token>
<token spacebefore="no" regexp="yes">\p{Ll}+</token>
<token spacebefore="no">`</token>
Thanks again, @jaumeortola. Works like a charm!
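For anyone following along, the three-token pattern above would presumably sit inside a complete rule in disambiguation.xml, along the lines of the sketch below. The id and name are made up for illustration, and the ignore_spelling action is an assumption carried over from the snake_case rule earlier in this thread.

```xml
<!-- Sketch only: id and name are hypothetical; the action is assumed
     to be the same one used by the snake_case rule above. -->
<rule id="IGNORE_SINGLE_WORD_IN_BACKQUOTES" name="ignore a single lowercase word in backquotes">
  <pattern>
    <!-- opening backquote -->
    <token>`</token>
    <!-- one lowercase word, directly after the backquote -->
    <token spacebefore="no" regexp="yes">\p{Ll}+</token>
    <!-- closing backquote, directly after the word -->
    <token spacebefore="no">`</token>
  </pattern>
  <disambig action="ignore_spelling"/>
</rule>
```

Because the middle token must be a single run of lowercase letters, this form only covers one-word fragments; anything containing "=", "-" or spaces needs more tokens.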
@jaumeortola, sorry to trouble you with questions again. Is there a simple way to exclude from the spell check everything within backticks, including characters that are treated as delimiters, like "=", "-", and even spaces? I suspect it should be a construction similar to the one you suggested for the underscore. Should I just list all such symbols as individual tokens rather than as a regexp? Thanks!
It will be easier for me to understand the question with an example. Seeing a sentence example, I can tell you how to write the pattern.
Sure, it is all about some API usage fragments.
For example, I can have an embedded API call fragment within some markdown method description, e.g.
Backticks in Markdown mark a monospace fragment, which in our case is used only to highlight code. These fragments need to be excluded from spell/grammar checks to reduce the number of false positives.
Hope this explanation helps.
Thanks in advance!
@jaumeortola, any hints here ^^^? Thanks!
Something like this should cover most cases:
<rule>
<pattern>
<token>`</token>
<token spacebefore="no" skip="-1"><exception>`</exception><exception scope="next">`</exception> </token>
<token spacebefore="no">`</token>
</pattern>
<disambig action="ignore_spelling"/>
</rule>
Hmmm, @jaumeortola, it doesn't work for me. For the line "Use `pid=1` parameter", the result is "Possible spelling mistake found."
<rule id="IGNORE_WORDS_IN_BACKQUOTES" name="ignore words within backquote characters">
<pattern>
<token>`</token>
<token spacebefore="no" skip="-1"><exception>`</exception><exception scope="next">`</exception> </token>
<token spacebefore="no">`</token>
</pattern>
<disambig action="ignore_spelling"/>
</rule>
Some elements of the syntax don't work in the disambiguation file (because we have never needed them there). The only quick solution is to write one rule for each pattern with a fixed number of tokens:
<rulegroup id="IGNORE_WORDS_IN_BACKQUOTES" name="ignore words within backquote characters">
<rule>
<pattern>
<token>`</token>
<token spacebefore="no"><exception>`</exception></token>
<token spacebefore="no">`</token>
</pattern>
<disambig action="immunize"/>
</rule>
<rule>
<pattern>
<token>`</token>
<token spacebefore="no"><exception>`</exception></token>
<token><exception>`</exception></token>
<token spacebefore="no">`</token>
</pattern>
<disambig action="immunize"/>
</rule>
<rule>
<pattern>
<token>`</token>
<token spacebefore="no"><exception>`</exception></token>
<token><exception>`</exception></token>
<token><exception>`</exception></token>
<token spacebefore="no">`</token>
</pattern>
<disambig action="immunize"/>
</rule>
<rule>
<pattern>
<token>`</token>
<token spacebefore="no"><exception>`</exception></token>
<token><exception>`</exception></token>
<token><exception>`</exception></token>
<token><exception>`</exception></token>
<token spacebefore="no">`</token>
</pattern>
<disambig action="immunize"/>
</rule>
</rulegroup>
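The group above covers one to four tokens between the backquotes. Longer fragments would need further rules of the same shape. As a sketch, a hypothetical fifth rule (not part of the original answer) with five inner tokens could be added inside the same rulegroup:

```xml
<!-- Hypothetical extension: place inside the rulegroup above.
     Matches five tokens between the backquotes. -->
<rule>
  <pattern>
    <token>`</token>
    <!-- first inner token: directly after the opening backquote, not a backquote itself -->
    <token spacebefore="no"><exception>`</exception></token>
    <!-- four more inner tokens: anything except a backquote, spaces allowed before them -->
    <token><exception>`</exception></token>
    <token><exception>`</exception></token>
    <token><exception>`</exception></token>
    <token><exception>`</exception></token>
    <!-- closing backquote, directly after the last inner token -->
    <token spacebefore="no">`</token>
  </pattern>
  <disambig action="immunize"/>
</rule>
```

How many rules are needed depends on how the fragments are tokenized. Assuming "=" is split off as its own token, a fragment like `pid=1` would be three inner tokens and is already covered by the third rule in the group.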
I am using LanguageTool to verify API documentation. Some documentation fragments include variable names in snake_case.
It would be very helpful to have a speller option to ignore words in snake case, similar to the one for camel case, e.g.
fsa.dict.speller.ignore-snake-case