Closed aspell-helper closed 7 years ago
Kevin Atkinson kevina\@sf commented on 2005-04-06 07:10:57 UTC
Logged In: YES user_id=6591
Make sure that the encoding is set to utf-8. Try "aspell -a --encoding=utf-8".
kgc kgc\@sf commented on 2005-04-06 08:34:04 UTC
Logged In: YES user_id=1252374
I have encoding utf-8 set in my configuration file, and on kevina's suggestion I have also tried setting it on the command line. Unfortunately it makes no difference.
If I set --encoding=iso8859-1 I get spelling errors on correctly spelled words containing Danish national characters, errors I don't get when using utf-8, so I'm pretty sure I'm selecting the right encoding.
BTW I'm using aspell 0.50.4.1 since it is the newest one I can find an RPM for on RedHat 9. Also, I can't find a Danish dictionary for anything newer than the 0.50 series.
Best regards
Kasper
Kevin Atkinson kevina\@sf updated the issue on 2005-04-06 08:42:07 UTC
Kevin Atkinson kevina\@sf commented on 2005-04-06 08:42:07 UTC
Logged In: YES user_id=6591
Aspell 0.50.4.1 does not have proper utf-8 support. The 0.50 series dictionaries will work with Aspell 0.60.
kgc kgc\@sf commented on 2005-04-06 09:19:36 UTC
Logged In: YES user_id=1252374
Sorry for the trouble I've caused then. I had problems installing 0.60.2 and getting it to work with my dictionaries, which is why I didn't do it before posting here. It turned out that removing utf-8 from my .conf file while installing the dictionary helped, so now I have succeeded, and indeed: it works as expected.
Thank you for your time
Kasper
kgc kgc\@sf created a bug report on 2005-04-06 06:25:06 UTC (Orig. from https://sourceforge.net/p/aspell/bugs/134)
When using aspell through a pipe with Unicode input, the offset is given in bytes rather than in characters. This makes it difficult to use the offset for highlighting the misspelled word. With Unicode input, try this:
run aspell -a
input: a asdf
output:
*
& asdf 61 2: as, ...
input: å asdf
output:
*
& asdf 61 3: as, ...
So in the first case (no non-US/two-byte characters) the offset is 2 (correct when counting characters 0-based), but in the second case, where there is a two-byte character (a Danish å), the offset is suddenly 3 (wrong in terms of counting characters).
This makes it rather difficult to highlight the correct word in an editor: you would have to reduce the offset by 1 for each two-byte character that comes before the misspelled word on the line!
Best regards
Kasper
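For anyone stuck on an aspell version that reports byte offsets, the workaround described in the report can be sketched on the editor side roughly as follows. This is only a minimal sketch: the function name `byte_offset_to_char_offset` is hypothetical (it is not part of aspell), and it assumes the offset is a 0-based byte index into the UTF-8 encoding of the line that was sent, as observed above with 0.50.4.1.

```python
def byte_offset_to_char_offset(line: str, byte_offset: int,
                               encoding: str = "utf-8") -> int:
    """Convert a byte offset into the encoded line to a character offset.

    Hypothetical helper: assumes `byte_offset` is a 0-based index into
    `line` as encoded with `encoding` (the behaviour seen in the report).
    """
    raw = line.encode(encoding)
    # Decode only the bytes that precede the offset and count the
    # resulting characters. errors="ignore" drops a trailing partial
    # character if the offset happens to fall inside a multi-byte sequence.
    prefix = raw[:byte_offset].decode(encoding, errors="ignore")
    return len(prefix)


# The two cases from the report:
ascii_case = byte_offset_to_char_offset("a asdf", 2)   # no multi-byte characters
danish_case = byte_offset_to_char_offset("å asdf", 3)  # "å" is two bytes in UTF-8
```

Decoding the prefix, rather than subtracting 1 per non-ASCII character, also covers characters that take three or four bytes in UTF-8.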