streetsidesoftware / vscode-spell-checker

A simple source code spell checker for code
https://streetsidesoftware.github.io/vscode-spell-checker/

Ignore words with non-latin characters #458

Open alystair opened 4 years ago

alystair commented 4 years ago


Obviously, Spanish and other languages written purely in Latin characters would have to be ignored manually by the user, but shouldn't Russian and other languages that use non-Latin characters be ignored automatically unless that language's dictionary is in use?

johnml1135 commented 9 months ago

I was able to get around it with this:

  "cSpell.includeRegExpList": [
    "\\b[a-zA-Z0-9.]+\\b"
  ],
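
As a rough illustration of what an include pattern like this selects, here is a minimal JavaScript sketch. The sample line and variable names are made up for the example, and the `g` flag is an assumption added so that every match is collected; how cSpell applies flags to a bare pattern string is not covered in this thread.

```javascript
// The word-boundary pattern from the setting, written as a JavaScript regex literal.
// The `g` flag is an assumption so that all matches are returned.
const includePattern = /\b[a-zA-Z0-9.]+\b/g;

// Hypothetical sample line mixing Latin and Cyrillic text.
const sample = 'const greeting = "привет world";';

// Only ASCII runs match, so the Cyrillic word is never selected.
const included = sample.match(includePattern) ?? [];
console.log(included); // [ 'const', 'greeting', 'world' ]
```

Note that this approach also excludes accented Latin letters such as "é", since the character class only covers ASCII.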

Jason3S commented 9 months ago

It is necessary to explicitly ignore character sets. By default, the spell checker checks all text.

It is possible to tell the spell checker to ignore a character set using ignoreRegExpList, or to check only text that matches the expressions in includeRegExpList.

The spell checker uses JavaScript's built-in regexp engine. To use Unicode matching, the u flag needs to be added.

It is also necessary to specify Script_Extensions= when using script names. See the MDN page "Unicode character class escape: \p{...}, \P{...}". It is always best to try out expressions at regex101.
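
To make the role of the u flag concrete, here is a small JavaScript sketch. The sample text and variable names are hypothetical; it only shows how the regexp engine treats the expression, not how the spell checker applies it internally.

```javascript
// Hypothetical sample text mixing Latin and Cyrillic words.
const text = 'Hello мир, this is a тест.';

// With the `u` flag, \p{...} is a Unicode property escape.
const cyrillic = /[\p{Script_Extensions=Cyrillic}]+/gu;
const cyrillicRuns = text.match(cyrillic) ?? [];
console.log(cyrillicRuns); // [ 'мир', 'тест' ]

// Dropping the Cyrillic runs leaves the text that would still be checked.
const remaining = text.replace(cyrillic, '');
console.log(remaining); // 'Hello , this is a .'

// Without the `u` flag, \p is just an escaped "p", so the class is a set of
// literal ASCII characters and Cyrillic text does not match.
const withoutFlag = new RegExp('[\\p{Script_Extensions=Cyrillic}]+', 'g');
const matchesWithoutFlag = withoutFlag.test('мир');
console.log(matchesWithoutFlag); // false
```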

Using directive within a document

// cspell:ignoreRegExp /[\p{Script_Extensions=Cyrillic}]+/gu

VS Code Settings

.vscode/settings.json

  "cSpell.ignoreRegExpList": ["/[\\p{Script_Extensions=Cyrillic}]+/gu"]

Using CSpell config

cspell.json

{
  "ignoreRegExpList": ["/[\\p{Script_Extensions=Cyrillic}]+/gu"]
}

cspell.config.yaml

ignoreRegExpList:
  - '/[\p{Script_Extensions=Cyrillic}]+/gu'
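
The JSON examples double the backslash while the single-quoted YAML value does not, because JSON strings treat the backslash as an escape character. The following JavaScript sketch shows that both spellings end up as the same expression text. The `toRegExp` helper is hypothetical and written only for this illustration; it is not cSpell's actual parsing code.

```javascript
// The JSON text as it appears in cspell.json (String.raw keeps backslashes as typed).
const jsonText = String.raw`{ "ignoreRegExpList": ["/[\\p{Script_Extensions=Cyrillic}]+/gu"] }`;
const [fromJson] = JSON.parse(jsonText).ignoreRegExpList;

// The value a YAML parser yields for the single-quoted scalar, written out by hand.
const fromYaml = String.raw`/[\p{Script_Extensions=Cyrillic}]+/gu`;

console.log(fromJson === fromYaml); // true

// Hypothetical helper (an assumption, not cSpell's code):
// turn a "/pattern/flags" string into a RegExp.
function toRegExp(text) {
  const match = /^\/(.*)\/([a-z]*)$/s.exec(text);
  if (!match) throw new Error(`Not in /pattern/flags form: ${text}`);
  return new RegExp(match[1], match[2]);
}

const ignore = toRegExp(fromJson);
console.log(ignore.flags); // 'gu'
```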

List of Character sets

Useful reference: Unicode Scripts
