Wrong offset when using unicode in pipe -a mode

GNUAspell / aspell

http://aspell.net

GNU Lesser General Public License v2.1

Wrong offset when using unicode in pipe -a mode #275

Closed aspell-helper closed 7 years ago

aspell-helper commented 19 years ago

kgc kgc\@sf created a bug report on 2005-04-06 06:25:06 UTC (Orig. from https://sourceforge.net/p/aspell/bugs/134)

When using aspell through a pipe with Unicode input, the reported offset is in bytes rather than in characters. This makes it difficult to use the offset for highlighting the misspelled word. To reproduce with Unicode input, try this:

run aspell -a

input: a asdf
output:
*
& asdf 61 2: as, ...

input: å asdf
output:
*
& asdf 61 3: as, ...

So in the first case (no non-ASCII, two-byte characters) the offset is 2 (correct when counting characters 0-based), but in the second case, where there is a two-byte character (a Danish å), the offset is suddenly 3 (wrong in terms of counting characters).
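For readers following along, the `&` lines in the transcript above have the shape `& word count offset: suggestion, suggestion, ...`. The following is only a minimal Python sketch of pulling those fields apart; the names `Miss` and `parse_miss_line` are hypothetical helpers made up for illustration and are not part of aspell.

```python
import re
from typing import List, NamedTuple, Optional


class Miss(NamedTuple):
    """One '&' line from aspell's pipe output (hypothetical container)."""
    word: str
    count: int          # number of suggestions aspell reports
    offset: int         # position of the word as reported by aspell
    suggestions: List[str]


# Shape assumed from the transcript: "& asdf 61 2: as, ..."
_MISS_RE = re.compile(r"^& (\S+) (\d+) (\d+): (.*)$")


def parse_miss_line(line: str) -> Optional[Miss]:
    """Return the parsed fields, or None if the line is not an '&' line."""
    m = _MISS_RE.match(line.strip())
    if m is None:
        return None
    word, count, offset, rest = m.groups()
    suggestions = [s.strip() for s in rest.split(",")]
    return Miss(word, int(count), int(offset), suggestions)
```

For example, `parse_miss_line("& asdf 61 3: as, ads")` yields an offset of 3, which is the value an editor would then have to interpret as bytes or characters.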

This makes it rather difficult to highlight the correct word in an editor: you would have to reduce the offset by 1 for each two-byte character that comes before the misspelled word on the line!

Best regards

Kasper

aspell-helper commented 19 years ago

Kevin Atkinson kevina\@sf commented on 2005-04-06 07:10:57 UTC

Make sure that the encoding is set to utf-8. Try "aspell -a --encoding=utf-8".

aspell-helper commented 19 years ago

kgc kgc\@sf commented on 2005-04-06 08:34:04 UTC

I have encoding utf-8 set in my configuration file, and on kevina's suggestion I have also tried setting it on the command line. Unfortunately, it makes no difference.

If I set --encoding=iso8859-1, I get spelling errors on correctly spelled words containing Danish national characters, errors I don't get when using utf-8, so I'm pretty sure I'm selecting the right encoding.

BTW, I'm using aspell 0.50.4.1, since it is the newest version I can find an RPM of for RedHat 9. Also, I can't find a Danish dictionary for anything newer than the 0.50 series.

Best regards

Kasper

aspell-helper commented 19 years ago

Kevin Atkinson kevina\@sf updated the issue on 2005-04-06 08:42:07 UTC

status: open --> closed-invalid

aspell-helper commented 19 years ago

Kevin Atkinson kevina\@sf commented on 2005-04-06 08:42:07 UTC

Aspell 0.50.4.1 does not have proper utf-8 support. The 0.50 series dictionaries will work with Aspell 0.60.

aspell-helper commented 19 years ago

kgc kgc\@sf commented on 2005-04-06 09:19:36 UTC

Sorry for the trouble I've caused, then. I had problems installing 0.60.2 and getting it to work with my dictionaries, which is why I didn't do so before posting here. It turned out that it helped to remove utf-8 from my .conf file while installing the dictionaries, so now I have succeeded, and indeed: it works as expected.

Thank you for your time

Kasper