aotaduy / eslint-plugin-spellcheck

MIT License
183 stars 17 forks source link

Only show words that are likely to be misspellings #57

Open Timmmm opened 4 years ago

Timmmm commented 4 years ago

Currently this plugin takes the approach of a normal spell checker - check every word, if it's not in the acceptable word dictionary, reported it as an error.

That approach doesn't really work for programming. There are way too many identifiers and strings that contain words that aren't real English words. Here are some examples from running it over my codebase:

And so on. Hopefully you get the point. Adding all these words to skipWords is not really feasible.

I don't want a spell checker to tell me that auth should be authentication. I want it to tell me that sofware should be software (this is the actual typo in our code that finally pushed me to find this plugin!).

I think a much better approach would be to try to spell check each missing word, and then have some threshold for how different it is to the corrected word (e.g. the edit distance). There aren't any words similar to devtools so don't show that as an error, but sofware is only one character away from software, which is a real word, so that should be shown as an error.

aotaduy commented 4 years ago

Hi Tim, I{ll be adding some default skipWords and skipRegexps to the checker so we can address the isssues you mentioned I think this (https://gist.github.com/jrencz/becd1befac3f14b53039e0361087794c) is a good starting point and i will add those you mention to a test case se we can he it solved.

If you want/can you can create a pull request with a failing test so I can address it later.

I think the thresold is a biit more dificult to implement and it will miss some simple cases as you mentioned.

Tahnks for the issue, i wil hace it closed once its solved.

Timmmm commented 4 years ago

It might be worth checking out the Code Spell Checker extension for VSCode to see how they do it. I installed that after trying this plugin and it had far fewer false positives. It also integrates a bit better with VSCode - the autofix menu gives you spelling suggestions and lets you add words to the dictionary. I suppose it is easier for a VSCode extension to do that than an ESLint plugin.

But the advantage of an ESLint plugin is that I can force my teammates who can't spell to use it!

Tahnks for the issue

Heh :-D

aotaduy commented 4 years ago

Hey I have added an array of default skip words and regepxps, maybe you can create a PR with the words you need to be skipped by default. This is the file. https://github.com/aotaduy/eslint-plugin-spellcheck/blob/master/rules/defaultSettings.js

I also added all of the globals package variable names to the skipWords list In webstorn you can use some langague / framework specific dictionaries im planning to implement some similar funcionality in the future.

This could be useful so we can define this words on a a dictionary rather than in a js array. But for now this will work.

Thanks!

ghost commented 4 years ago

The defaults are good, but a few I've noticed should be added:

  1. camelcase (maybe all of the case names should be added)
  2. csv/xml/htm/txt (maybe a larger file extension database would be nice)
  3. commonjs (maybe we want JS module types too)
  4. webpack (maybe some of the top NPM module names?)
  5. whitespace
  6. utf
aotaduy commented 4 years ago

Hi clar-cmp Would you mind to create a pull request with some of those defaults? I'm not getting what you mean with the first pont and the fifth, do you want to add that particular word "camelcase" and "whitespace".

Thanks for your comment! If you can't create the pull request I'll add some of those in the next release, thanks!

DavidRieman commented 4 years ago

Just wanted to chime in that I don't think "auth" should be in the default exception list. It is often important to encourage developers to differentiate between "authenticated" versus "authorized" so it is generally best practice to spell it out, and a good habit to build even when a particular project never authenticates without also giving authorization, etc. I think that one is best left to local exceptions list when a project really wants to use just "auth"?

Timmmm commented 4 years ago

I think "auth" should be in the default exceptions list because it is spelt correctly. This is spell checker, not a variable-naming-best-practices checker. Also it's better that there are a few false negatives than many false positives otherwise it just becomes annoying and people won't use it (why I created this issue in the first place!).

ghost commented 4 years ago

@aotaduy I don't have the time to make a PR right now, but mostly wanted to add a few suggestions in case anyone else wants to do them. I can file separate issues if you want. :)

I figure that most of these would involve pulling in lists from other sources and using those to populate the lists, although I can see a case for including the lists directly. Considering how this is not intended to be included directly in any production code, I would push more towards using external libraries which contain lists of file types, etc.

aotaduy commented 4 years ago

Ok, no problem I will add those in next release thanks!

DavidRieman commented 4 years ago

"Auth" is literally not a word and thus is not a correctly spelled word. This is exactly what a spell-checker should catch.

(We use spell checkers in code not because it's required for functional code but for best practices reasons -- maintaining good, unambiguous spelling is a goal of spell checkers and following through on that is a best practice.)

I understand there is a way, without code changes, to list words you want ignored. Is there ability to list the reverse (to mark "auth" as a word from the default allow list that you no longer want to allow)? If not, seems simpler to just allow opt-in to allowing such a non-word, than to require others to make and maintain a code change to the tool in order to use it.

(All that said, I'm just lurking here and not using this tool ATM, so I don't really have a horse in this race today. I evaluated it ages ago and still hope to integrate it properly some day. Of course I'd like the tool to keep a shape that we'd find useful down that road.)

Timmmm commented 4 years ago

@DavidRieman Aside from the fact that abbreviations of words can also be words, spell checkers are not "is it a word" checkers. They are intended to find accidental misspellings. Should it say these are spelt incorrectly too?

I don't think that would be very useful.

ghost commented 4 years ago

I would personally argue that the term regexp should be used in the context of JavaScript because that's what JS uses for its class, but agree that what's a word should be broadened to include common abbreviations.

aotaduy commented 4 years ago

Well, I wouldn't agree on adding many abbreviations on the default skipWords / skipIfMatch set, but you are free to add it on your own configuration. I think a spell checker is quite a bit personal (for example i don't like any kind of abbreviations in my code). About the specific discussion im not really sure about auth, google says is a noun. imagen