microsoft / terminal

The new Windows Terminal and the original Windows console host, all in the same place!
MIT License

Ambiguous width character in CJK environment #370

Open ghost opened 5 years ago

ghost commented 5 years ago

It works perfectly in an English environment, but the behavior in a CJK environment is unstable. If you type ☆, (\b), ☆, (\b), ☆, ..., the character shifts one cell to the right each time, because the sequence is insufficient.

[Screenshot: c__windows_system32_cmd]

I thought it was my mistake, so I tried drawing after querying the cursor position, but that did not solve it. Is there a fix?

k-takata commented 5 years ago

I hope this ConPTY issue will be fixed by the next release of Windows (1903?).

ghost commented 5 years ago

[Image] As shown here, the square cursor is displayed shifted to the right by half a cell, and that is not under my control.

ghost commented 5 years ago

Will endeavor. The voyage to UTF-8 has just begun. Engage!

k-takata commented 5 years ago

@nak Why did you close this issue?

I consider this a bug in ConPTY. '☆' (U+2606) is an ambiguous-width character, and it is shown as full width in a Japanese environment (cp932). Of course, it is also shown as full width in the normal Command Prompt. On ConPTY, however, '☆' is handled as a half-width character (even though it is displayed at full width), so the cursor position and cursor width become weird. Deleting '☆' with Backspace doesn't work well either: it deletes only half of the character. These should work the same as in the normal Command Prompt.
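To make the mismatch concrete, here is a minimal sketch of how a column count changes depending on whether ambiguous-width characters are treated as one cell or two. The names `classify`, `cellWidth`, and `columnsUsed` are hypothetical, and the table is only a small illustrative subset of the Unicode East Asian Width data; it is not how ConPTY or conhost actually decides widths.

```cpp
#include <string>

// Simplified width classes. Only a handful of ranges are listed here for
// illustration; a real implementation would use the full Unicode
// East Asian Width data.
enum class EastAsianWidth { Narrow, Ambiguous, Wide };

EastAsianWidth classify(char32_t cp) {
    if (cp == 0x2605 || cp == 0x2606) return EastAsianWidth::Ambiguous; // ★ ☆
    if (cp >= 0x2500 && cp <= 0x254B) return EastAsianWidth::Ambiguous; // box drawing (part)
    if (cp >= 0x3041 && cp <= 0x3096) return EastAsianWidth::Wide;      // Hiragana
    if (cp >= 0x4E00 && cp <= 0x9FFF) return EastAsianWidth::Wide;      // CJK ideographs
    return EastAsianWidth::Narrow; // everything else is treated as one cell here
}

// Number of cells one code point occupies under a given policy.
// ambiguousIsWide == true models a CJK locale such as cp932.
int cellWidth(char32_t cp, bool ambiguousIsWide) {
    switch (classify(cp)) {
        case EastAsianWidth::Wide:      return 2;
        case EastAsianWidth::Ambiguous: return ambiguousIsWide ? 2 : 1;
        default:                        return 1;
    }
}

// Total columns consumed by a string under a given policy.
int columnsUsed(const std::u32string& text, bool ambiguousIsWide) {
    int columns = 0;
    for (char32_t cp : text) columns += cellWidth(cp, ambiguousIsWide);
    return columns;
}
```

If the side that draws the glyph uses `ambiguousIsWide = true` while the side that tracks the cursor uses `false`, the two disagree by one column for every ambiguous character, which is consistent with the half-deleted '☆' and the misplaced cursor described above.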

ghost commented 5 years ago

Reopen it. And I will talk.

That handling in the Command Prompt is an East Asian customization. With this implementation, I think we have entered an era in which the user performs the text rendering. I follow the implementation of the Unicode experts. If this behavior is a bug, it should be fixed as problems arise.

I wanted to verify whether this implementation is inconsistent, but I thought it would be discourteous to keep the issue open all day while I did. If you have problems, please create an issue for them.

ghost commented 5 years ago

The shifting to the right by one cell at a time comes from Vim's renderer. Please ignore it for now.

The system locale has no effect on the WSL Console. Even in the Windows domain, if you use ConPTY, the system locale behaves like Linux. ConPTY works like cmd.exe under Windows control. In Linux (WSL) control it behaves like bash. Can not you?

be5invis commented 5 years ago

So @miniksa @zadjii-msft, do we need a callback like HRESULT ConsoleTextShape(_In_ const WCHAR* text, _Inout_ IConsoleCellSink* sink) that lets the application consuming ConPTY output do the cell allocation?

miniksa commented 5 years ago

@be5invis, theoretically, perhaps. But the performance overhead of having that call go through interprocess communication for every single individual character would likely be prohibitive.

We don't know what the right answer is right now. We haven't been able to invest time in coming up with a more complete solution yet. We hope to one day.

sedwards2009 commented 5 years ago

@be5invis

The applications which output these characters can be on a totally different machine and operating system with respect to the application which consumes and displays the characters. A Windows specific function like ConsoleTextShape() could only solve the problem in the Windows + Windows case.

Ideally the Unicode standard would state exactly how wide each character is to be when displayed in a monospace grid. Then we would all just have to agree to follow the standard.

ghost commented 5 years ago

With the current escape sequences, the absolute screen position is incorrect. Use the relative distance from the current cursor position instead. I want an API that redraws only the line containing the cursor.
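As a rough illustration of the difference, the sketch below builds standard ANSI/VT sequences for absolute positioning (CUP), a relative move left (CUB), and a redraw of only the current line (carriage return plus EL 2). The helper names are hypothetical, and whether relative movement actually avoids the drift under ConPTY is an assumption, not something established in this thread.

```cpp
#include <string>

// Absolute positioning (CUP, "ESC [ row ; col H"); both values are 1-based.
// This only lands correctly if both sides agree on how wide each glyph is.
std::string moveAbsolute(int row, int col) {
    return "\x1b[" + std::to_string(row) + ";" + std::to_string(col) + "H";
}

// Relative move left by n cells (CUB, "ESC [ n D"); empty when n <= 0.
std::string moveLeft(int n) {
    return n > 0 ? "\x1b[" + std::to_string(n) + "D" : std::string();
}

// Redraw only the line the cursor is on: carriage return, erase the whole
// line (EL 2, "ESC [ 2 K"), then write the new contents.
std::string redrawCurrentLine(const std::string& utf8Line) {
    return "\r\x1b[2K" + utf8Line;
}
```

For example, `redrawCurrentLine("abc")` yields a carriage return, the erase-line sequence, and then `abc`, without referring to any absolute row or column.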

be5invis commented 5 years ago

@sedwards2009 One obviously complex case is the box-drawing characters: they are full-width under most far east locales, but half-width under others. Specifying a standard about how to properly shape text under a character grid is the ultimate goal, but providing shaping callbacks could become a valuable solution, since Windows console apps can directly access the character grid.

be5invis commented 5 years ago

@miniksa Not every character, but every string flush. The ConsoleTextShape will be called for each text flush. The sink will provide callbacks for associating a text slice with a cell run.

struct ConsoleShapingState {
    // Bidi level, etc.
};

// Receives the cell layout decided by the shaping callback.
class IConsoleCellSink {
public:
    virtual ~IConsoleCellSink() = default;
    virtual HRESULT acquireScroll(UINT rows) = 0;
    virtual HRESULT putCursor(UINT row, UINT column) = 0;
    // Associates a text slice (cch code units) with a run of cells.
    virtual HRESULT putTextRun(UINT row, UINT column, UINT cells,
        UINT cch, const WCHAR* pwchText) = 0;
};

// Called once per text flush, not once per character.
HRESULT CALLBACK ConsoleTextShape(
    _In_ UINT cch,
    _In_ const WCHAR* pwchText,
    _In_ UINT cellMatrixWidth,
    _In_ UINT cellMatrixHeight,
    _In_ UINT startCursorRow,
    _In_ UINT startCursorColumn,
    _Inout_ ConsoleShapingState* state,
    _Inout_ IConsoleCellSink* sink);
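
To show how such a callback might be consumed, here is a minimal, portable sketch. Everything in it is hypothetical: the type aliases stand in for the Windows types so it compiles without `<windows.h>`, `RecordingSink`, `cellsFor`, and `shapeFlush` are invented for illustration, and the width policy is hard-coded. It follows the shape of the proposal above but is not an existing ConPTY API.

```cpp
#include <string>
#include <vector>

// Stand-ins for the Windows types used in the proposal (assumptions).
using HRESULT = long;
using UINT = unsigned int;
using WCHAR = char16_t;
constexpr HRESULT kOk = 0;

// The sink interface as proposed in this thread (not a shipped API).
class IConsoleCellSink {
public:
    virtual ~IConsoleCellSink() = default;
    virtual HRESULT acquireScroll(UINT rows) = 0;
    virtual HRESULT putCursor(UINT row, UINT column) = 0;
    virtual HRESULT putTextRun(UINT row, UINT column, UINT cells,
                               UINT cch, const WCHAR* pwchText) = 0;
};

struct RecordedRun { UINT row, column, cells; std::u16string text; };

// Hypothetical sink that simply records what the shaper reports.
class RecordingSink : public IConsoleCellSink {
public:
    std::vector<RecordedRun> runs;
    UINT cursorRow = 0, cursorColumn = 0;
    HRESULT acquireScroll(UINT) override { return kOk; }
    HRESULT putCursor(UINT row, UINT column) override {
        cursorRow = row; cursorColumn = column; return kOk;
    }
    HRESULT putTextRun(UINT row, UINT column, UINT cells,
                       UINT cch, const WCHAR* pwchText) override {
        runs.push_back({row, column, cells, std::u16string(pwchText, cch)});
        return kOk;
    }
};

// Hypothetical width policy: U+2606 takes two cells, everything else one.
UINT cellsFor(WCHAR ch) { return ch == u'\u2606' ? 2u : 1u; }

// Simplified shaper, called once per flush: emits one run per UTF-16 code
// unit and wraps at the matrix width. Surrogate pairs, scrolling, and bidi
// are deliberately ignored in this sketch.
HRESULT shapeFlush(UINT cch, const WCHAR* text, UINT matrixWidth,
                   UINT row, UINT column, IConsoleCellSink* sink) {
    for (UINT i = 0; i < cch; ++i) {
        const UINT cells = cellsFor(text[i]);
        if (column + cells > matrixWidth) { ++row; column = 0; }
        sink->putTextRun(row, column, cells, 1, &text[i]);
        column += cells;
    }
    return sink->putCursor(row, column);
}
```

Because the whole flush is shaped in one call, the cross-process cost is paid per string rather than per character, which is the distinction drawn in the comment above.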

ghost commented 5 years ago

The implementation is almost complete. It can be used normally. https://github.com/ntak/vim-1/tree/control_ambiwidth

I would be glad if ConPTY could be used casually.