Open stesachse opened 7 years ago
I have a similar problem. The German umlauts are encoded correctly, but they are not matched against my text when I add words containing them to the list of known words. Those words still show up as misspelled.
I have the same problem with the Polish language on Linux Mint 18.3.
I had the same problem and solved it by saving the .aff and .dic files with UTF-8 encoding.
I can confirm what @Qrizzz found. Converting both files to UTF-8 with enca fixed this problem.
I had the same problem and solved it by saving the .aff and .dic files with utf-8 encoding.
That works indeed!
But it might not be immediately clear what one has to do exactly to achieve this. Therefore the following step-by-step workaround for (Swiss) German Linux/Ubuntu users (type all the commands into a standard terminal):
List all the system-wide installed hunspell dictionaries:
hunspell -D
You might have to install the package hunspell beforehand.
Create a new directory for the custom dictionary files with UTF-8 encoding; I'd recommend:
mkdir ~/.atom/custom_spellchecker_dictionaries/
Of course you could also convert the original dictionary files in place. But since I don't know what side effects that could have on other programs that use them, I wouldn't recommend it.
Create UTF-8 versions of all the relevant dictionary files in the new directory. Example for (Swiss) German and English dictionaries:
iconv -f ISO-8859-1 -t UTF-8 /usr/share/hunspell/de_CH.aff | sed 's/^SET ISO8859-1$/SET UTF-8/g' > ~/.atom/custom_spellchecker_dictionaries/de_CH.aff
iconv -f ISO-8859-1 -t UTF-8 /usr/share/hunspell/de_CH.dic > ~/.atom/custom_spellchecker_dictionaries/de_CH.dic
iconv -f ISO-8859-1 -t UTF-8 /usr/share/hunspell/de_DE.aff | sed 's/^SET ISO8859-1$/SET UTF-8/g' > ~/.atom/custom_spellchecker_dictionaries/de_DE.aff
iconv -f ISO-8859-1 -t UTF-8 /usr/share/hunspell/de_DE.dic > ~/.atom/custom_spellchecker_dictionaries/de_DE.dic
iconv -f ISO-8859-1 -t UTF-8 /usr/share/hunspell/en_US.aff | sed 's/^SET ISO8859-1$/SET UTF-8/g' > ~/.atom/custom_spellchecker_dictionaries/en_US.aff
iconv -f ISO-8859-1 -t UTF-8 /usr/share/hunspell/en_US.dic > ~/.atom/custom_spellchecker_dictionaries/en_US.dic
You might have to adjust the paths to match the output of hunspell -D; the ones above are valid on Ubuntu 16.04 LTS.
Set the path to the new folder in the option Locale Paths of the Atom spell-check package (note that you can only use absolute paths, so no ~ shortcut). If you followed the directory recommendation above, the path would be: /home/USERNAME/.atom/custom_spellchecker_dictionaries/
Restart Atom.
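For more than a handful of dictionaries, the conversion step could be wrapped in a small helper. The following is only a sketch and not part of the original workaround: the function name to_utf8_dict is made up, and instead of hard-coding ISO-8859-1 it assumes the source encoding is whatever the SET line of the .aff file declares (falling back to ISO8859-1 if there is none). Dictionaries that already declare UTF-8 are copied unchanged, since running them through the ISO-8859-1 conversion would garble any non-ASCII characters.

```shell
#!/bin/sh
# Sketch: create UTF-8 copies of one Hunspell dictionary (NAME.aff + NAME.dic).
# Usage: to_utf8_dict SRC_DIR DEST_DIR NAME   (hypothetical helper, not a hunspell tool)
to_utf8_dict() {
    src=$1
    dest=$2
    name=$3
    mkdir -p "$dest" || return 1

    # Read the encoding from the first SET line; strip a trailing CR if present.
    enc=$(LC_ALL=C awk '$1 == "SET" { sub(/\r$/, "", $2); print $2; exit }' "$src/$name.aff")
    # Assumption: treat a missing SET line as ISO8859-1.
    enc=${enc:-ISO8859-1}

    case $enc in
        UTF-8|utf-8)
            # Already UTF-8: copy as-is instead of re-encoding.
            cp "$src/$name.aff" "$src/$name.dic" "$dest/"
            ;;
        *)
            # Hunspell writes ISO8859-N, iconv expects ISO-8859-N; add the hyphen.
            # Other Hunspell encoding names are passed to iconv unchanged.
            from=$(printf '%s' "$enc" | sed 's/^ISO8859-/ISO-8859-/')
            # Convert the affix file and rewrite its SET line to match.
            iconv -f "$from" -t UTF-8 "$src/$name.aff" |
                LC_ALL=C sed 's/^SET .*/SET UTF-8/' > "$dest/$name.aff"
            # The word list has no SET line, so it only needs converting.
            iconv -f "$from" -t UTF-8 "$src/$name.dic" > "$dest/$name.dic"
            ;;
    esac
}
```

With the paths from the steps above, a call would look like: to_utf8_dict /usr/share/hunspell "$HOME/.atom/custom_spellchecker_dictionaries" de_CH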
BTW: This issue was opened over a year ago. Why hasn't it been fixed yet? I guess spell-check should just read the .dic and .aff files in their correct encoding and everything would be fine, right? As the Chromium documentation suggests, Atom could search the .aff file for the line that begins with "SET" to see which character set it uses.
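To illustrate the SET lookup from the command line (this says nothing about how spell-check or node-spellchecker would implement it), here is a minimal shell sketch. The function name aff_encoding is made up, and the fallback to ISO8859-1 for files without a SET line is an assumption. A file that starts with a byte order mark is not handled.

```shell
#!/bin/sh
# Sketch: print the character set declared in a Hunspell .aff file.
# aff_encoding is a hypothetical helper name, not part of hunspell.
aff_encoding() {
    # LC_ALL=C lets awk read the file bytewise regardless of its encoding.
    # Print the second field of the first SET line, minus a trailing CR.
    enc=$(LC_ALL=C awk '$1 == "SET" { sub(/\r$/, "", $2); print $2; exit }' "$1")
    # Assumption: report ISO8859-1 when no SET line exists.
    printf '%s\n' "${enc:-ISO8859-1}"
}
```

For example, aff_encoding /usr/share/hunspell/de_CH.aff prints the declared encoding of the Swiss German dictionary, which is a quick way to check whether a dictionary needs converting at all.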
I have the same issue as @laniley: words with umlauts that have been added to the list of known words are still not recognized.
Same thing under Arch Linux (64 bit), Atom 1.23.2 x64, Spell Check Package 0.73.3
I also had this problem under Ubuntu 17.10 (Atom 1.23.3) and what @salim-b recommended above worked perfectly. Thanks!
Same for Ubuntu 17.10 (Atom 1.24.1); the solution proposed by @salim-b works.
@salim-b's workaround doesn't work for me. Some suggestions are now completely ignored.
The original aff file was also encoded as ISO8859-1, but when converted to UTF-8 the spell checker interprets c and ç as the same letter.
Atom: 1.26.0, Electron: 1.7.11, Chrome: 58.0.3029.110, Node: 7.9.0
I'm on Linux Mint 18.3 (based on Ubuntu 16.04) and salim-b's workaround worked fine for me.
Although I have to say it would make for a much better experience if it "just worked" without having to apply a workaround yourself.
I can also confirm this behavior on openSUSE Leap 42.3. I am reluctant to just convert my dictionaries. Plenty of other software is relying on them. Besides, it will break after the next update ...
Tracing the issue back to its roots, it likely originates in a dependency of spell-check: node-spellchecker. See issue 77 in that project for details.
What is the status on this? I am running into the problem where umlauts are rendered properly, but spell-check always thinks words containing them are misspelled, even though they are in the dictionary. For me it is the word Grüneisen, which unfortunately appears everywhere in my work.
Is there any chance of fixing this? I am running Atom on macOS 10.12.6. Thanks!
@aswolf The root cause of this issue has not been solved yet, see latest comment there. Looks like someone there could use some help from an experienced C++ coder with access to a Mac.
Same problem. Solved with salim-b's workaround: https://github.com/atom/spell-check/issues/161#issuecomment-336653098
There is something wrong with the encoding. The misspelled word is marked correctly, which for me is more than the original spell-check package has ever done, so thanks for the work :) But the correction view shows only wrongly encoded umlauts, each displayed as a question mark. I can select any entry to replace the misspelled word, but the replacement also contains the wrongly encoded umlauts.
I have tried to decode this as UTF-8 or ISO-8859-15, but the result is always garbage. Here is what Perl says, though maybe copy and paste doesn't work at all for this broken string.
But perhaps this is correct, because the string stays the same through the following round trip: copying the string out of the editor into the console, running the Perl one-liner, and copying it back into the editor.
The dev tools show no error messages.
before correction
after correction
here is the spell-check config