Open GoogleCodeExporter opened 9 years ago
What steps will reproduce the problem?

1. Launch WinGHCi
2. Enter "x <- getLine"
3. Enter or paste "résumé 履歴書 резюме" into the editor
4. Enter "putStrLn x"

The expected output is "résumé 履歴書 резюме". Instead, "rГ©sumГ© 履жґж›ё резюме" is shown.

I am using Haskell Platform 2012.4.0.0, with GHC 7.4.2 and WinGHCi 1.0.6 on Windows 7 x64, with CP_ACP set to CP1251 and CP_OEMCP set to CP866 (Windows Cyrillic).

The explanation is as follows: WinGHCi merely runs a copy of ghci, passes user input to ghci's stdin, and reads the output from ghci's stdout. To do this, WinGHCi first sets the code pages of ghci's console to the ACP. To send a command to ghci, WinGHCi converts the user input (which is in the native Unicode encoding) to UTF-8 and passes the resulting byte string to ghci's stdin. ghci, however, interprets this input as being in the ACP, not in UTF-8, which may lead to data corruption. Receiving ghci's output, on the other hand, is done on the assumption that ghci's stdout yields data in the ACP, which doesn't cause any additional data corruption.

Proposed fix: make the interaction between WinGHCi and ghci happen completely in UTF-8. It can be done as follows:

1. In file StartGHCI\StartGHCI.c, lines 65 and 66 must be replaced with this:
   SetConsoleOutputCP(CP_UTF8); SetConsoleCP(CP_UTF8);
2. In file Utf8.c, line 138 must be replaced with this:
   INT res = MultiByteToWideChar(CP_UTF8, 0, strIn, lenIn, strOut, maxWChars);

After recompiling, steps 1-4 produce the intended result: "résumé 履歴書 резюме".

The proposed patch is attached as a "Unified DIFF" file, generated with Git's help.
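The mismatch described above can be illustrated outside WinGHCi. The following is only a minimal sketch in Python, not WinGHCi or ghci code: the variable names are hypothetical, and it assumes the ACP is CP1251 as in this report. It encodes the input as UTF-8, decodes the same bytes as CP1251 the way ghci would, and then repeats the ACP-to-ACP round trip on the output side.

```python
# Hypothetical illustration of the encoding mismatch; none of these
# names come from the WinGHCi sources.
typed = "résumé"

# WinGHCi converts the user's input to UTF-8 before writing to ghci's stdin.
sent_bytes = typed.encode("utf-8")

# ghci's console code page is the ACP (CP1251 here), so ghci decodes
# those UTF-8 bytes as CP1251 instead.
seen_by_ghci = sent_bytes.decode("cp1251")

# On output, ghci encodes in the ACP and WinGHCi decodes in the ACP,
# so this leg adds no further damage but does not undo the first one.
shown = seen_by_ghci.encode("cp1251").decode("cp1251")
print(shown)  # rГ©sumГ©

# If both sides agree on UTF-8 (the proposed fix), the text survives.
fixed = typed.encode("utf-8").decode("utf-8")
print(fixed == typed)  # True
```

Only the "résumé" part of the test string is used, since its CP1251 misreading is unambiguous: each "é" is the UTF-8 byte pair C3 A9, which CP1251 reads as "Г" followed by "©".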
Original issue reported on code.google.com by Joker...@gmail.com on 21 Feb 2013 at 12:59
Attachments: