ajinux / winghci

Automatically exported from code.google.com/p/winghci

Lousy Unicode support in WinGHCi and fix for it #4

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. Launch WinGHCi
2. Enter "x <- getLine"
3. Enter or paste "résumé 履歴書 резюме" into editor
4. Enter "putStrLn x"

The expected output is "résumé 履歴書 резюме". Instead, 
"résumé 履歴書 резюме" is shown.

I am using Haskell Platform 2012.4.0.0, with GHC 7.4.2, WinGHCi 1.0.6 on 
Windows 7 x64, with CP_ACP set to CP1251 and CP_OEMCP set to CP866 (Windows 
Cyrillic).

The explanation is as follows: WinGHCi merely runs a copy of ghci, passes 
user input to ghci's stdin, and reads the response from ghci's stdout. To do 
this, WinGHCi first sets the code pages of ghci's console to the ACP. A command 
is sent to ghci by converting the user input (which is in the native Unicode 
encoding) to UTF-8 and passing the resulting byte string to ghci's stdin. 
ghci, however, interprets this input as being in the ACP, not in UTF-8, which 
can corrupt the data. Receiving ghci's output, by contrast, is done assuming 
that ghci's stdout yields data in the ACP, which causes no additional 
corruption.
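
The mismatch can be seen at the byte level. The following is a minimal, portable sketch (not WinGHCi code): it counts how many characters the same byte string contains when every byte is taken as one character, as a single-byte code page such as CP1251 does, and when it is read as UTF-8. The function names `count_as_single_byte` and `count_as_utf8` are made up for this illustration, and the input is assumed to be well-formed UTF-8.

```c
#include <stddef.h>
#include <string.h>

/* Characters seen by a single-byte code page (e.g. CP1251):
   every byte is treated as one character. */
size_t count_as_single_byte(const char *bytes)
{
    return strlen(bytes);
}

/* Code points seen when the same bytes are read as UTF-8.
   Continuation bytes have the bit pattern 10xxxxxx and do not start
   a new character, so only the other bytes are counted.
   Assumes well-formed UTF-8 input. */
size_t count_as_utf8(const char *bytes)
{
    size_t n = 0;
    const unsigned char *p;
    for (p = (const unsigned char *)bytes; *p != '\0'; ++p) {
        if ((*p & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}
```

For "résumé" encoded as UTF-8 (`"r\xC3\xA9sum\xC3\xA9"`), the first function reports 8 characters and the second reports 6: each "é" occupies two bytes, and a reader that assumes a single-byte code page turns each of them into two unrelated characters.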

Proposed fix: make the interaction between WinGHCi and ghci happen completely 
in UTF-8. This can be done as follows:

1. In file StartGHCI\StartGHCI.c, lines 65 and 66 must be replaced with this:
    SetConsoleOutputCP(CP_UTF8);
    SetConsoleCP(CP_UTF8);

2. In file Utf8.c, line 138 must be replaced with this:
        INT res = MultiByteToWideChar(CP_UTF8, 0, strIn, lenIn, strOut, maxWChars);
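
To illustrate what decoding as UTF-8 means for the bytes coming back from ghci, here is a portable sketch of a UTF-8 to UTF-16 conversion. It is not the code in Utf8.c and not a replacement for `MultiByteToWideChar`; the name `utf8_to_utf16` and its signature are hypothetical, and the checks are deliberately simple (overlong encodings and encoded surrogates are not rejected).

```c
#include <stddef.h>
#include <stdint.h>

/* Decode UTF-8 bytes into UTF-16 code units.
   Returns the number of code units written, or 0 on malformed or
   truncated input or when the output buffer is too small.
   (An empty input also yields 0.) */
size_t utf8_to_utf16(const unsigned char *in, size_t len_in,
                     uint16_t *out, size_t max_out)
{
    size_t i = 0, n = 0;
    while (i < len_in) {
        unsigned char b = in[i++];
        uint32_t cp;
        size_t extra;  /* number of continuation bytes expected */

        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else return 0;               /* stray continuation or invalid lead byte */

        if (len_in - i < extra) return 0;          /* sequence is truncated */
        for (; extra > 0; --extra) {
            unsigned char c = in[i++];
            if ((c & 0xC0) != 0x80) return 0;      /* not a continuation byte */
            cp = (cp << 6) | (uint32_t)(c & 0x3F);
        }

        if (cp < 0x10000) {
            if (n + 1 > max_out) return 0;
            out[n++] = (uint16_t)cp;
        } else {
            /* Code points above U+FFFF become a surrogate pair. */
            if (n + 2 > max_out) return 0;
            cp -= 0x10000;
            out[n++] = (uint16_t)(0xD800 | (cp >> 10));
            out[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        }
    }
    return n;
}
```

With this sketch, the bytes C3 A9, E5 B1 A5 and D1 80 decode to the three code units U+00E9 ("é"), U+5C65 ("履") and U+0440 ("р"), which is the result the UTF-8 code page is meant to give for text that ghci emits as UTF-8.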

After recompiling, steps 1-4 produce the intended result: 
"résumé 履歴書 резюме".

The proposed patch is attached as a "Unified DIFF" file, generated with Git's 
help.

Original issue reported on code.google.com by Joker...@gmail.com on 21 Feb 2013 at 12:59

Attachments: