Encoding problem - Githubissues

GoogleCodeExporter commented 9 years ago

What steps will reproduce the problem?
1. Codepage ISO-8859-9: 1999(Latin5, Turkish)

What is the expected output? What do you see instead?
I can't see the Turkish characters çüğöş.

What version of the product are you using? On what operating system?
4.0 and 4.1

Please provide any additional information below.
3.10 works, and I can see çüğöş there.

Original issue reported on code.google.com by leventd...@gmail.com on 23 Jun 2009 at 12:46

GoogleCodeExporter commented 9 years ago

[deleted comment]

GoogleCodeExporter commented 9 years ago

[deleted comment]

GoogleCodeExporter commented 9 years ago

Tracked down the bug to r352: a 0x7F in place of a 0xFF meant that output
characters in that range were treated as Unicode, and hence ISO-8859-1. Fixed in
r395 on 0.4 branch.

Original comment by andy.koppe on 23 Jun 2009 at 4:56
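
The diagnosis above can be illustrated with a small Python sketch. This is not mintty's code (mintty is written in C, and the r352/r395 changes are not shown in this thread); the function name, the `codepage_max` parameter, and the shape of the threshold check are assumptions made purely for illustration. It shows why treating values 0x80 to 0xFF as Unicode code points gives the same result as decoding them as ISO-8859-1.

```python
# Hypothetical illustration only: mintty's real code is C and is not shown here.

def decode_output(values, codepage, codepage_max):
    """Decode output character values one at a time.

    Values up to codepage_max are converted through the configured codepage;
    anything above it is assumed to be a Unicode code point already.
    """
    chars = []
    for value in values:
        if value > codepage_max:
            chars.append(chr(value))  # taken as Unicode as-is
        else:
            chars.append(bytes([value]).decode(codepage))
    return "".join(chars)


TURKISH = "çüğöş"
RAW = TURKISH.encode("iso8859_9")  # the bytes an ISO-8859-9 program would send

# Threshold too low (0x7F): bytes 0x80..0xFF bypass the codepage.
BUGGY = decode_output(RAW, "iso8859_9", codepage_max=0x7F)

# Threshold at 0xFF: every byte goes through the codepage.
FIXED = decode_output(RAW, "iso8859_9", codepage_max=0xFF)
```

Under these assumptions `BUGGY` is identical to `RAW.decode("latin-1")`, because the first 256 Unicode code points coincide with ISO-8859-1.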

Added labels: Difficulty-Trivial

GoogleCodeExporter commented 9 years ago

[deleted comment]

GoogleCodeExporter commented 9 years ago

I think that's not fixed.

Original comment by leventd...@gmail.com on 24 Jun 2009 at 3:07

Attachments:

ss.png

GoogleCodeExporter commented 9 years ago

Well, I certainly did fix one problem. Whereas previously pasting çüğöş resulted in
çüðöþ (i.e. Icelandic characters instead of Turkish ones), that now does the right
thing.

I guess the problem in the screenshot is that you expect ı (the dotless i) instead of
ý? That works here, at least in the command line (with readline set to 8-bit clean
mode) and also in nano and vi.

Questions:
1) What do you get when you enter ı in nano or vi?
2) What's the application in the screenshot?
3) Do you get correct behaviour in mintty-0.3.10 or other terminals?
4) Are you sure the font you've selected supports Turkish characters?

Original comment by andy.koppe on 24 Jun 2009 at 6:45
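
For reference, the Icelandic-for-Turkish substitutions and the ı/ý confusion discussed above all come from the handful of byte positions where ISO-8859-9 departs from ISO-8859-1. The following sketch lists them using Python's built-in codecs; the function name is made up for this example.

```python
def differing_positions(codec_a="latin-1", codec_b="iso8859_9"):
    """Return (byte, char_in_a, char_in_b) for each byte the codecs decode differently."""
    diffs = []
    for value in range(256):
        raw = bytes([value])
        in_a = raw.decode(codec_a)
        in_b = raw.decode(codec_b)
        if in_a != in_b:
            diffs.append((value, in_a, in_b))
    return diffs


DIFFS = differing_positions()
# Each entry pairs the ISO-8859-1 character with its ISO-8859-9 replacement,
# e.g. byte 0xFD is ý in ISO-8859-1 and ı in ISO-8859-9.
```

Every other byte decodes identically in both character sets, which is why ç, ü and ö looked right even while ğ, ş and ı did not.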

GoogleCodeExporter commented 9 years ago

1) I see çüðöþ in vim.
2) It's mutt.
3) Yes, I get correct behaviour in mintty-0.3.10.
4) Yes, I have, and I also tried Lucida Console.
Also, I can see the correct characters with 0.4.0-r2 but not with 0.4.0-r3.

Original comment by leventd...@gmail.com on 24 Jun 2009 at 8:49

Attachments:

sswindow.png

GoogleCodeExporter commented 9 years ago

With r395 I can't reproduce this on either Cygwin 1.5 or 1.7, including on Windows 7,
so I'm getting a bit desperate here. Could you try the attached MinTTY, and check
your codepage setting again?

If it still fails, please attach your .minttyrc. Also, have you got any of the LANG,
LC_CTYPE or LC_ALL environment variables set? Finally, are there any terminal control
sequences for switching character sets in your PS1 or elsewhere in your shell startup
scripts?

Original comment by andy.koppe on 25 Jun 2009 at 6:33

Removed labels: Difficulty-Trivial

Attachments:

GoogleCodeExporter commented 9 years ago

Hang on, the MinTTY in your screenshot was from SVN trunk rather than the 0.4 branch,
which is where the problem is fixed. The version should say "svn-0.4-r395". Let me
know how you get on though, and sorry for not spotting this sooner.

Original comment by andy.koppe on 25 Jun 2009 at 7:22

GoogleCodeExporter commented 9 years ago

OK, my fault :( It's fixed now.
Thanks for the help.

Original comment by leventd...@gmail.com on 25 Jun 2009 at 9:00

GoogleCodeExporter commented 9 years ago

Original comment by andy.koppe on 28 Jun 2009 at 10:52

Changed state: Verified
Added labels: Difficulty-Easy

GoogleCodeExporter commented 9 years ago

I have the same problem on svn/trunk.

Original comment by leventd...@gmail.com on 6 Sep 2009 at 4:15

GoogleCodeExporter commented 9 years ago

Works here. Can you verify that the "Character set" on the "Text" page is indeed set
to "ISO-8859-9"?

(The 0.4 "Codepage" setting is ignored. Probably not a good idea?)

Original comment by andy.koppe on 6 Sep 2009 at 4:33

GoogleCodeExporter commented 9 years ago

Yes, I double-checked.

Original comment by leventd...@gmail.com on 6 Sep 2009 at 4:42

Attachments:

ssclient.png

GoogleCodeExporter commented 9 years ago

Hmm, right.

More questions:
- What do you get in the relevant positions such as 0xFD if you run the 'ascii' tool?
- Does it fail in editors as well?
- Does invoking a new MinTTY make a difference?
- What does this command show:
  echo LC_ALL=$LC_ALL LC_CTYPE=$LC_CTYPE LANG=$LANG
- Could you attach your .minttyrc?

Original comment by andy.koppe on 6 Sep 2009 at 5:06

GoogleCodeExporter commented 9 years ago

- ı is displayed correctly
- No, it works in vim
- No
- limon@kurbaga ~ $ echo LC_ALL=$LC_ALL LC_CTYPE=$LC_CTYPE LANG=$LANG
LC_ALL= LC_CTYPE= LANG=tr_TR.ISO-8859-9
I think it's a mutt problem :(

Original comment by leventd...@gmail.com on 6 Sep 2009 at 5:27
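
The `echo` output above matters because POSIX resolves the character-type locale in a fixed order: `LC_ALL` first, then `LC_CTYPE`, then `LANG`, with empty values treated as unset. Here is a minimal sketch of that lookup; the helper names are invented for this example, and the `"C"` fallback is the usual default rather than something stated in the thread.

```python
def effective_ctype(env):
    """Pick the locale name that governs character handling, POSIX-style."""
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name, "")
        if value:  # an empty string counts as unset
            return value
    return "C"  # typical fallback when nothing is set


def charset_of(locale_name):
    """Extract the charset part of e.g. 'tr_TR.ISO-8859-9@modifier', if any."""
    if "." not in locale_name:
        return None
    return locale_name.split(".", 1)[1].split("@", 1)[0]


# The values reported in the comment above.
REPORTED = {"LC_ALL": "", "LC_CTYPE": "", "LANG": "tr_TR.ISO-8859-9"}
LOCALE = effective_ctype(REPORTED)
CHARSET = charset_of(LOCALE)
```

With the reported values the lookup falls through to `LANG`, so a program that follows this order would select ISO-8859-9.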

Attachments:

minttyrc

GoogleCodeExporter commented 9 years ago

Thanks. Yes, looks like a mutt problem then. Is that on Cygwin 1.5 or 1.7?

It might be activating UTF-8 mode in the terminal (using the "\e%G" control
sequence), but not picking up the locale setting from LANG. Perhaps try it with
'LC_CTYPE=tr_TR.ISO-8859-9'?

Original comment by andy.koppe on 6 Sep 2009 at 5:35
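
For context, the "\e%G" sequence mentioned above is the ISO 2022 escape for selecting UTF-8, and its counterpart ESC % @ returns to the default character set. The sketch below only builds the byte strings an application might emit; whether mutt actually sends them is the speculation made in the comment, not something this example confirms, and the helper name is hypothetical.

```python
ESC = "\x1b"
SELECT_UTF8 = ESC + "%G"     # the "\e%G" sequence from the comment above
SELECT_DEFAULT = ESC + "%@"  # ISO 2022 counterpart: back to the default charset


def wrap_utf8(text):
    """Return bytes that switch to UTF-8, send text as UTF-8, then switch back."""
    return (
        SELECT_UTF8.encode("ascii")
        + text.encode("utf-8")
        + SELECT_DEFAULT.encode("ascii")
    )


SAMPLE = wrap_utf8("ü")
```

If a program sent `SELECT_UTF8` but then wrote ISO-8859-9 bytes, a terminal honouring the switch would see invalid UTF-8, which would be consistent with the symptoms described.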

GoogleCodeExporter commented 9 years ago

1.7. I'm now trying UTF-8, though: I can't see çşğü with the ascii tool, but they
work in vim.

Original comment by leventd...@gmail.com on 6 Sep 2009 at 5:39

Attachments:

ssclient.png

GoogleCodeExporter commented 9 years ago

Yep, the 'ascii' tool doesn't speak UTF-8. It simply prints each character as one
byte, whereas with UTF-8, characters above 0x7F need to be encoded as multi-byte
sequences (two bytes for the Turkish characters here). Those single bytes above 0x7F
are printed as UTF-8 encoding errors.

Original comment by andy.koppe on 6 Sep 2009 at 5:54
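
The point about single bytes can be checked with a short Python sketch. The function names are made up for this example; the behaviour shown is that of Python's standard UTF-8 codec, not of the 'ascii' tool or of mintty.

```python
def utf8_bytes(char):
    """Return the UTF-8 encoding of one character as a list of byte values."""
    return list(char.encode("utf-8"))


def decode_single_byte(value):
    """Try to decode one lone byte as UTF-8; return None if it is invalid."""
    try:
        return bytes([value]).decode("utf-8")
    except UnicodeDecodeError:
        return None


U_UMLAUT = utf8_bytes("ü")         # two bytes
S_CEDILLA = utf8_bytes("ş")        # two bytes
LONE_FC = decode_single_byte(0xFC)  # ü in ISO-8859-9, invalid on its own in UTF-8
LONE_A = decode_single_byte(0x41)   # plain ASCII is unchanged

# With errors="replace", the invalid byte becomes U+FFFD, the replacement character.
REPLACED = bytes([0xFC]).decode("utf-8", errors="replace")
```

A lone byte above 0x7F never forms a complete UTF-8 sequence, so a tool that prints one byte per table cell cannot produce valid output in a UTF-8 terminal.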

GoogleCodeExporter commented 9 years ago

When I select UTF-8, I get:

M-CM-< -> ü
M-CM-' -> ç

Do you have any idea?

Original comment by leventd...@gmail.com on 6 Sep 2009 at 2:25
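
One possible reading of the strings above, offered as an interpretation rather than something the thread confirms: in the `cat -v` style of notation, `M-x` stands for the byte of `x` with the high bit set. Read that way, `M-CM-<` is the byte pair 0xC3 0xBC, which is exactly the UTF-8 encoding of ü. The sketch below handles only the `M-` prefix (not `^X` control notation), assumes ASCII input, and uses an invented function name.

```python
def decode_meta_notation(text):
    """Turn 'M-x' tokens into bytes with the high bit set; copy other ASCII as-is."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text.startswith("M-", i) and i + 2 < len(text):
            out.append(ord(text[i + 2]) | 0x80)  # set bit 7 of the following char
            i += 3
        else:
            out.append(ord(text[i]))
            i += 1
    return bytes(out)


U_BYTES = decode_meta_notation("M-CM-<")
C_BYTES = decode_meta_notation("M-CM-'")
```

Decoding `U_BYTES` and `C_BYTES` as UTF-8 gives ü and ç, matching the two mappings reported. That would suggest the program was receiving correct UTF-8 input but displaying it byte by byte.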

GoogleCodeExporter commented 9 years ago

Is that in any particular program? Or bash?

I presume M is Meta, i.e. Alt, but what's CM? Also, which keyboard layout are you
using? Turkish Q?

Original comment by andy.koppe on 6 Sep 2009 at 5:06

GoogleCodeExporter commented 9 years ago

In mutt and snownews. I'm using the US keyboard layout.
Do you use any ncurses or iconv programs in Cygwin?

Original comment by leventd...@gmail.com on 6 Sep 2009 at 5:24

GoogleCodeExporter commented 9 years ago

Just tried this as well (in nano) and got the same effect. Turned out Turkish Q was
enabled after all. Looks like the keyboard selection in Windows is program-specific,
i.e. when trying to change it to Turkish/US I didn't have the mintty window active.

Also, Ctrl+Shift or Alt+Shift might be configured as shortcuts for switching layout,
under "Advanced Key Settings" in the keyboard layout control panel.

Besides, it looks like UTF-8 isn't supported in nano (and hence ncurses). :(

Original comment by andy.koppe on 6 Sep 2009 at 6:07

GoogleCodeExporter commented 9 years ago

Too bad :( What can I do now? Any suggestions?

Original comment by leventd...@gmail.com on 6 Sep 2009 at 7:22

GoogleCodeExporter commented 9 years ago

Report the issue to the Cygwin mailing list? Especially if it did work correctly on
1.5. Btw, how did you get ISO-8859-9 working in mutt there?

Meanwhile, I found this in an ncurses announcement from March, at
http://sourceware.org/ml/cygwin/2009-03/msg00243.html:

"Future development may add re-entrancy/multithread support, and wide
character (UTF) support. However, those changes will require (another)
ABI change and DLL name modification, and will not occur prior to
cygwin-1.7's official launch."

Original comment by andy.koppe on 6 Sep 2009 at 10:11

GoogleCodeExporter commented 9 years ago

I had set charset=ISO-8859-9//TRANSLIT in my .muttrc. Now I've set
charset=us-ascii//TRANSLIT, which transliterates Turkish characters into
near equivalents:
ö -> "o
ü -> "u

Original comment by leventd...@gmail.com on 7 Sep 2009 at 2:51
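
Python has no equivalent of iconv's //TRANSLIT, so the sketch below swaps in a different technique, Unicode decomposition with the combining marks removed, to show the general idea of falling back to ASCII look-alikes. Its output differs from iconv's (it yields `o` rather than `"o`), and the function names are invented for this example.

```python
import unicodedata


def strip_accents(text):
    """Decompose characters and drop combining marks (not iconv's algorithm)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def to_ascii(text):
    """Approximate text in ASCII; characters with no decomposition become '?'."""
    return strip_accents(text).encode("ascii", errors="replace").decode("ascii")


VOWELS = to_ascii("öü")
CONSONANTS = to_ascii("çşğ")
DOTLESS_I = to_ascii("ı")  # ı has no Unicode decomposition, so it is replaced
```

The dotless i shows the limit of this approach: it has no base letter plus accent form, so a transliteration table such as iconv's is needed to map it to a plain i.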

GoogleCodeExporter commented 9 years ago

One more idea, after looking at muttrc(5). With your mintty charset back to
ISO-8859-9, perhaps you need a "hook" to map lowercase MIME charset 'iso-8859-9' to
uppercase locale/iconv charset 'ISO-8859-9'.

       charset-hook alias charset
              This  command defines an alias for a character set.  This is useful
              to properly display messages which are tagged with a character  set
              name not known to mutt.

       iconv-hook charset local-charset
              This  command  defines  a system-specific name for a character set.
              This is useful when your system’s iconv(3) implementation does  not
              understand  MIME  character  set  names  (such  as iso-8859-1), but
              instead insists on being fed with implementation-specific character
              set  names (such as 8859-1).  In this specific case, you’d put this
              into your configuration file:

              iconv-hook iso-8859-1 8859-1

Otherwise, I think it's clear that this isn't a mintty issue, and I don't really know
anything about mutt. Hence you'd be more likely to get a useful answer from the mutt
maintainer on the Cygwin mailing list or the Mutt mailing list itself.

Original comment by andy.koppe on 7 Sep 2009 at 11:35

GoogleCodeExporter commented 9 years ago

Thanks for the help. I recompiled ncurses with UTF-8 support and recompiled mutt
with ncursesw :)
Now I've switched to UTF-8, and everything is working smoothly.

Original comment by leventd...@gmail.com on 7 Sep 2009 at 12:12

GoogleCodeExporter commented 9 years ago

Cool! Glad to hear it can be done.

Original comment by andy.koppe on 7 Sep 2009 at 12:16

suzdraws / mintty

Encoding problem #124