Open burner opened 17 years ago
Created attachment 172 Small test case for the same problem in DMC
The problem doesn't show if I use the Windows API (either WriteConsole or WriteFile) to output. So the bug must be somewhere in DM's stdio implementation.
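For readers who want to try the Windows API route mentioned above, here is a minimal sketch of what bypassing stdio could look like. The helper name `writeToConsole` and the test strings are illustrative assumptions, not code from the attachment. It uses `WriteConsoleW`, which takes UTF-16, so it only works when the handle is an actual console (not redirected to a file or pipe).

```d
import std.utf : toUTF16;

version (Windows)
{
    import core.sys.windows.windows;

    // Hypothetical helper (not from the report): writes text to a standard
    // console handle via WriteConsoleW, bypassing the C runtime's stdio.
    bool writeToConsole(DWORD stdHandleId, string text)
    {
        HANDLE handle = GetStdHandle(stdHandleId);
        if (handle is null || handle == INVALID_HANDLE_VALUE)
            return false;

        // WriteConsoleW expects UTF-16, so the text is converted first;
        // the output code page should not matter for this call.
        wstring wide = toUTF16(text);
        DWORD written;

        // This call fails if the handle is redirected to a file or pipe.
        return WriteConsoleW(handle, wide.ptr, cast(DWORD) wide.length,
                             &written, null) != 0;
    }
}

void main()
{
    version (Windows)
    {
        writeToConsole(STD_ERROR_HANDLE, "Output utf-8 accented char \u00e9\n");
        writeToConsole(STD_ERROR_HANDLE, "... and the rest should follow\n");
    }
}
```

If output may be redirected, `WriteFile` with the raw UTF-8 bytes would be the alternative named in the comment above; that variant is not shown here.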
Fixed in DMD 1.021 and 2.004.
The problem was NOT fixed for stderr (DMD 1.022)
Bug 1608 has been marked as a duplicate of this bug.
I hope this gets fixed one day. Here is an updated example where it still doesn't work (for stderr; stdout is OK) as of DMD 1.035:
import std.c.stdio;
import std.c.windows.windows;
extern(Windows) export BOOL SetConsoleOutputCP( UINT );
void main()
{
    SetConsoleOutputCP( 65001 ); // or use "chcp 65001" instead
    // Codepoint 00e9 is "Latin small letter e with acute"
    fputs("Output utf-8 accented char \u00e9\n... and the rest is OK\n", stdout);
    fputs("Output utf-8 accented char \u00e9\n... and the rest is cut off!\n", stderr);
    fputs("STDOUT.\n", stdout);
    fputs("STDERR.\n", stderr);
}
Sort of works for me.
The text doesn't get cut off, but the Unicode characters don't get displayed either.
C:\Users\Kevin\Documents\D Projects\ConsoleApp1\ConsoleApp1\bin>ConsoleApp1.exe
Output utf-8 accented char é
... and the rest is OK
Output utf-8 accented char ��
... and the rest is cut off!
STDOUT.
STDERR.
C:\Users\Kevin\Documents\D Projects\ConsoleApp1\ConsoleApp1\bin>
Status update as of DMD 2.062 (Win XP 32 bit)
Still the same error for the above-mentioned example. However, when modified to use write instead of fputs:
import std.stdio;
import std.c.windows.windows;
extern(Windows) BOOL SetConsoleOutputCP( UINT );
void main()
{
    SetConsoleOutputCP( 65001 ); // or use "chcp 65001" instead
    stderr.write("STDERR:Output utf-8 accented char \u00e9\n... and the rest is cut off!\n");
    stderr.write("end_STDERR.\n");
}
I get this error:
So if anybody has a clue what's going on there...
I can confirm this issue. When enumerating a directory (via dirEntries()) containing a file with a character in the CP850/CP1252 space (e.g. "säb"), depending on the codepage settings, the output is as follows:
chcp 1252 => output is "säb" (Unicode encoding for "ä")
chcp 65001 => output is "säbstd.exception.ErrnoException@D:\tools\d\bin..\src\phobos\std\stdio.d(1352): (No error)"
In both cases, cmd's dir, for example, shows the correct results. The correct results are also shown when using C with printf(), although that is not really comparable.
Tried the case in cmd, console2, and conemu. All show the same results.
It'd really be nice if this bug would get fixed...
Addendum: Windows 7 64-bit, dmd v2.063.2.
Sorry.
Hallelujah, this (comment 8) seems fixed, finally. Can anybody confirm? Works for me on Windows XP 32 bit, dmd 2.065.0.
Beware, fputs still doesn't work. I think it's a C library problem.
The issue still exists in DMD32 D Compiler v2.065, Windows 7
import std.stdio; import std.c.windows.windows;
extern(Windows) BOOL SetConsoleOutputCP( UINT );
STDERR:Output utf-8 accented char é ... and the rest is cut off!
==============
end_STDERR.\n is not written
Final note, as this is unlikely to be fixed: use -m32mscoff and the Microsoft VS linker.
Partial fix or workaround in druntime for unhandled exceptions: https://github.com/dlang/druntime/pull/1687
Still an issue, but apparently restricted to stderr (and independent from DigitalMars/MS runtime):
import core.stdc.stdio;
import core.sys.windows.wincon, core.sys.windows.winnls;
void main()
{
    const oldCP = SetConsoleOutputCP(CP_UTF8);
    scope(exit) SetConsoleOutputCP(oldCP);
    fprintf(stdout, "HellöѬ LDC\n");
    fflush(stdout);
    fprintf(stderr, "HellöѬ LDC\n");
    fflush(stderr);
}
=>
HellöѬ LDC
Hell
Tested with DMD 2.086.0 (-m32, -m32mscoff, -m64) and LDC on Win10.
Update: it's working with Win10 v1903 (with the exact same binary that didn't work with v1803). According to Rainer Schütze, it's working since v1809. See https://devblogs.microsoft.com/commandline/windows-command-line-unicode-and-utf-8-output-text-buffer/.
(In reply to kinke from comment #16)
> Update: it's working with Win10 v1903 (with the exact same binary that didn't work with v1803). According to Rainer Schütze, it's working since v1809. See https://devblogs.microsoft.com/commandline/windows-command-line-unicode-and-utf-8-output-text-buffer/.
So is this issue fixed? I don't have a windows machine to test it. Should we close this?
This isn't solved, but would now be solvable with recent Windows versions.
There are 2 things about this:
(In reply to kinke from comment #18)
> This isn't solved, but would now be solvable with recent Windows versions.
> There are 2 things about this:
> - DMD outputs a mix of UTF-8 and strings in the current codepage, AFAIK without setting any console codepage, so DMD output on Windows can be garbage. LDC v1.17 fixes this for LDC.
How does LDC solve the problem?
> - User programs writing UTF-8 strings to the console suffer from the same issue. This could be worked around by setting the console codepage in druntime's _d_run_main and resetting it to the original one before termination.
a.solovey reported this on 2007-08-28T22:51:06Z
Transferred from https://issues.dlang.org/show_bug.cgi?id=1448
Description
If the Windows console code page is set to 65001 (UTF-8) and a program outputs non-ASCII characters in UTF-8 encoding, there will be no more output after the first newline following the accented character. I believe the problem is in the underlying DMC stdio, but it is more disturbing with D, as D has good Unicode support and is very convenient for working with international texts. This problem has been reported in the newsgroup several times before; see for example http://www.digitalmars.com/d/archives/digitalmars/D/announce/openquran_v0.21_8492.html
Here is the code to illustrate the problem:
////////
import std.c.stdio;
import std.c.windows.windows;
extern(Windows) export BOOL SetConsoleOutputCP( UINT );
void main()
{
    SetConsoleOutputCP( 65001 ); // or use "chcp 65001" instead
    // Codepoint 00e9 is "Latin small letter e with acute"
    puts( "Output utf-8 accented char \u00e9 ... and the rest is cut off! " );
}
/////////
If you run it, "... and the rest is cut off!" won't be displayed. Do not forget to set the console font to Lucida Console before trying this.
!!!There are attachments in the bugzilla issue that have not been copied over!!!