rojopolis / spellcheck-github-actions

Spell check action
MIT License

Problem with UTF-8 character in wordlist file #121

Open jonasbn opened 1 year ago

jonasbn commented 1 year ago

I am observing an issue with the action in the repository jonasbn/perl-task-date-holidays.

The word Rezić is reported as a spelling mistake even though it is listed in the word list file (.wordlist.txt).

REF: jonasbn/perl-task-date-holidays/.wordlist.txt as of commit jonasbn/perl-task-date-holidays@150683d26f8dfdc07d1f97a07991b506416d0cfc (HEAD has since been altered).

This is the configuration:

matrix:
- name: Markdown
  aspell:
    lang: en
    ignore-case: true
  dictionary:
    wordlists:
    - .wordlist.txt
    encoding: utf-8
  pipeline:
  - pyspelling.filters.markdown:
  - pyspelling.filters.html:
      comments: false
      ignores:
      - code
      - pre
  sources:
  - '**/*.md'
  default_encoding: utf-8

REF: perl-task-date-holidays/.spellcheck.yaml

facelessuser commented 1 year ago

Try specifying Rezic in your English dictionary. It may simply be due to how Aspell normalizes characters in an English dictionary.
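A minimal sketch of that workaround, assuming the goal is to list an accent-free variant next to each accented entry in the word list. The helper names `ascii_fold` and `folded_variants` are hypothetical and not part of pyspelling or Aspell; whether Aspell then accepts the accented spelling in the checked text is not confirmed here.

```python
import unicodedata


def ascii_fold(word: str) -> str:
    """Return the word with combining marks (accents) removed."""
    # NFKD splits "ć" into "c" followed by U+0301 (combining acute accent).
    decomposed = unicodedata.normalize("NFKD", word)
    # Drop the combining marks and keep the base characters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def folded_variants(words):
    """Yield each word, followed by its accent-free variant when it differs."""
    for word in words:
        yield word
        folded = ascii_fold(word)
        if folded != word:
            yield folded


if __name__ == "__main__":
    # Hypothetical word list entries for illustration only.
    print(list(folded_variants(["Rezić", "holidays"])))
```

Folding only removes combining marks, so letters without a Unicode decomposition (for example "ø") pass through unchanged and would need separate handling.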

jonasbn commented 1 year ago

Thanks @facelessuser, I will try that.

facelessuser commented 1 year ago

I'm digging into the settings. I mainly work with English words and have little experience with foreign words, so I haven't explored all the Unicode normalization options. There may be a better approach, but I'll have to experiment to find out what it is.

facelessuser commented 1 year ago

From the Aspell documentation:

If a word contains a character that the language can’t handle it will still be ignored (for example a Cyrillic letter in a Latin based language).

I imagine this may simply be an issue of using certain characters within an English dictionary.
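To see which word list entries might be affected, one rough check is to flag entries that contain non-ASCII characters. This is only a sketch: treating "non-ASCII" as a stand-in for "characters an English Aspell dictionary can't handle" is an assumption, and `non_ascii_entries` is a hypothetical helper, not something pyspelling or Aspell provides.

```python
def non_ascii_entries(lines):
    """Return (line_number, word, offending_chars) for entries with non-ASCII characters."""
    findings = []
    for number, raw in enumerate(lines, start=1):
        word = raw.strip()
        if not word:
            continue  # skip blank lines
        # Collect the distinct characters outside the ASCII range.
        offending = sorted({ch for ch in word if not ch.isascii()})
        if offending:
            findings.append((number, word, offending))
    return findings


if __name__ == "__main__":
    # Hypothetical word list contents for illustration only.
    sample = ["holidays", "Rezić", "", "Perl"]
    for number, word, chars in non_ascii_entries(sample):
        print(f"line {number}: {word!r} contains {chars}")
```

To run it against a real file, pass the lines read from the word list, for example `open(".wordlist.txt", encoding="utf-8")`, matching the `encoding: utf-8` setting in the configuration above.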