Open MahdiGhiasi opened 5 years ago
30 years and we still can't render Arabic characters inside Windows terminal, while Ubuntu and Mac can do !
@jalchr Thanks for reaching out. Comments like this are not substantive and do not add to the discussion.
Thanks for taking notice. I have been reading the issues, and people are crying out for Arabic support, while the responses have not been encouraging. We are building a new terminal, right? It should be at least as good as, if not better than, what is already on the table.
We're all for adding RTL support, but know that doing that would require a lot of work.
Much of the new Terminal's codebase is directly shared with the old console codebase, and there are lots of places in that code that assume the text is all LTR. Any fixes for the Terminal would need to make sure to not break existing console behavior in this regard.
I'm not saying it's impossible to support RTL, or that we won't. I'm just saying that based on the volume of work it'd require, it's probably not a 1.0 feature, unless we get a lot of help from the community.
It might suffice for v1.0 to at least render Arabic characters correctly
Old codebases aren't that bad 😁:
while Ubuntu and Mac can do
Ubuntu doesn't refer to a particular terminal emulator. Some of its terminal emulators (e.g. konsole, pterm, mlterm) do some right-to-left rendering, and so does macOS's Terminal.app.
And they all do it wrong.
Just to give you an idea, let me mention here one of the many problems they all suffer from. They unconditionally "reverse" every Farsi/Arabic/Hebrew/etc. text. This fixes them on the command line and in the output of simple utilities, and at the same time, hopelessly, unfixably breaks them for more complex applications (e.g. Emacs).
See my work at https://terminal-wg.pages.freedesktop.org/bidi/ (along with a WIP implementation in VTE / GNOME Terminal) about a more detailed description of the problems and a proposed specification for the desired behavior.
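The unconditional-reversal problem described above can be pictured with a toy model. This is only a sketch: `naive_terminal_display` is a hypothetical stand-in for an emulator that reverses every RTL line, not code from any real terminal, and a whole-string reversal is a deliberate simplification of real BiDi reordering.

```python
def naive_terminal_display(cells: str) -> str:
    """Hypothetical emulator that unconditionally reverses every RTL line."""
    return cells[::-1]


logical = "שלום"  # Hebrew word in logical (typing/storage) order

# A simple utility emits logical order; the emulator's reversal produces the
# left-to-right cell order that reads correctly from right to left.
simple_output = naive_terminal_display(logical)

# A BiDi-aware application (assumed here) has already arranged the letters
# in visual order before sending them to the terminal.
app_visual = logical[::-1]

# The emulator reverses again, undoing the application's own layout.
complex_output = naive_terminal_display(app_visual)

print(simple_output == logical[::-1])  # emulator fixed the simple case
print(complex_output == logical)       # emulator undid the app's layout
```

In this model the emulator cannot tell the two cases apart, which is why the comment above argues that the emulator and the application need an agreed interface.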
Added this issue in my gist to track: https://gist.github.com/XVilka/a0e49e1c65370ba11c17
Partial fix, in #1873
Note that with the release of GNOME 3.34, the BiDi support by @egmontkob is available in GNOME Terminal out of the box, which makes comparing your implementation and testing it against a valid one much easier.
For the record, I don't think the VTE proposal is the right approach, at least not as a default behaviour. In my opinion we should be doing our best to match the way the DEC terminals functioned, implementing the DEC RTL modes and escape sequences, and making sure we work with existing RTL/bidirectional applications.
It's not our job to try and fix applications that aren't RTL aware. That approach is doomed to failure, and just gets in the way of applications that are actually trying to do the right thing. That said, there are things we can do to improve the user-experience when working with applications that don't support RTL, but any functionality like that should be up to the user to activate.
Using terminal preview Version: 1.2.2234.0
The RTL now works! 🥳 Although there are some unsightly joining gaps... it pretty much renders correctly!
This is RTL -> عربي فارسی
But in Vim version 8.1.2269 (Ubuntu 20.04) it doesn't quite work. 😢 The RTL ordering shows up correctly, but unfortunately the joining is broken.
This is RTL in vim not joining properly -> عربي فارسی
EDIT: The joining in Vim can be fixed by either:
- setting termbidi to true (:set termbidi)
- setting arabicshape to false (:set noarabicshape)
It does work in nano though...
@adueck - yup, this would've been my fix in #7190. It looks to me like Vim is controlling its own thing, I don't think there's a real issue here.
How should the command prompt be handled with Right-to-Left support?
| output |
| |
| █ </:C |
____________________________________________________________
Is something like that what would be expected? I only speak English, so I have little experience of what would feel right or comfortable.
English apps will probably still output left to right, unless they specifically add support for RTL.
But if it does add support, would Input be right aligned?
@adueck commented on Aug 20, 2020, 8:51 AM GMT+4:30:
It does work in nano though...
The letters are still disjoint though. Any idea why is that?
The disjoint aspect is going to be because it is splitting GlyphRuns due to scaling -- I expect that this shouldn't happen if you switch to a font that supports Arabic
I'd like to add to the discussion that although RTL text seems to render correctly, modifying or inserting characters doesn't seem to behave correctly. Using version 1.11.3471.0, see the behaviour below:
It looks like character insertion happens in reverse order. That is, inserting a character at the 3rd index from the right instead inserts it at the 3rd index from the left. This behaviour is consistent in Windows Terminal regardless of the application that is open (Vim, bash, CMD, etc).
How should the command prompt be handled with Right-to-Left support?
| output |
| |
| █ </:C |
____________________________________________________________
Is something like that what would be expected? I only speak English, so I have little experience of what would feel right or comfortable.
English apps will probably still output left to right, unless they specifically add support for RTL.
But if it does add support, would Input be right aligned?
As an Arabic speaker, I would prefer when setting my terminal in an RTL mode (so everything is right aligned), to have even LTR language output to be right-aligned. This is because having to move my eyes up and down is much easier than diagonally. It is also generally good UX principles to align things vertically to make it easier to read.
So perhaps something like the layout below would be a reasonable solution.
| output |
| |
| ls طعام*txt.bk █ <Arduino/Program Files/:C |
| طعاام.txt.bk |
| |
__________________________________________________________________________
Mixing in ANSI codes breaks the alignment inside words. For example, take أهلا or שלום and color one of the letters in a different color.
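A minimal way to try the case described above is sketched below. It assumes a terminal that understands the standard SGR color sequences (`ESC[31m` for red foreground, `ESC[0m` to reset); the helper name `color_letter` is made up for illustration.

```python
import re

RED = "\x1b[31m"   # SGR: red foreground
RESET = "\x1b[0m"  # SGR: reset attributes
SGR = re.compile(r"\x1b\[[0-9;]*m")


def color_letter(word: str, index: int) -> str:
    """Wrap the code point at `index` in red, leaving the rest untouched."""
    return word[:index] + RED + word[index] + RESET + word[index + 1:]


word = "أهلا"
colored = color_letter(word, 1)

print(word)     # plain word, for comparison
print(colored)  # same letters, one of them colored
```

Stripping the escape sequences with `SGR.sub("", colored)` gives back the original word, so any visual difference between the two printed lines comes from how the terminal handles the color change in the middle of the word, not from the text itself.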
Related issue: https://github.com/rust-lang/rust/issues/97020
For reference, this seems to be what needs implementing: https://www.unicode.org/reports/tr9/
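The algorithm in that report starts from the bidirectional class of each character. As a rough illustration of that first step only (not the full reordering algorithm), Python's standard `unicodedata.bidirectional` exposes those classes; the helper `bidi_classes` is a name chosen here for the sketch.

```python
import unicodedata


def bidi_classes(text: str) -> list[str]:
    """Return the Unicode bidirectional class of each code point in `text`."""
    return [unicodedata.bidirectional(ch) for ch in text]


# Latin letter, Arabic letter (U+0639), Hebrew letter (U+05E9), European digit
sample = "a\u0639\u05e91"
for ch, cls in zip(sample, bidi_classes(sample)):
    print(f"U+{ord(ch):04X} {cls}")
```

Classes such as L (left-to-right), R and AL (right-to-left, Arabic letter) and EN (European number) are the inputs from which the report's rules derive embedding levels and, finally, the visual order.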
Much of the new Terminal's codebase is directly shared with the old console codebase, and there are lots of places in that code that assume the text is all LTR. Any fixes for the Terminal would need to make sure to not break existing console behavior in this regard.
@zadjii-msft hey, would you be able to expand on this a bit (and link to other issues/responses explaining this): why has the Microsoft team decided to build a newer and better terminal on the "old grounds" when they are so outdated? (Not saying 'broken'; they were built in another age, so they're just obsolete now.) Why not start from scratch? Wouldn't that have made this a non-issue?
Well, that also assumes that there is a right way to do RTL text in terminals, and that's a notoriously unsolved issue. Even if one terminal emulator tried to fix it on their own, it's not a problem that can be solved solely by the terminal emulator - it needs cooperation from the CLI application itself, too.
As @egmontkob wrote here (which is possibly the definitive treatise on the topic):
With graphical applications, it’s the responsibility of one single application to do BiDi rendering, i.e. to convert the external data it handles (e.g. document, web page) along with its own UI to the pixel-by-pixel user-visible representation. In case of the terminal emulator, it’s the joint responsibility of two components: the emulator, and the application inside. The exact responsibility of each party and the interface between them needs to be well thought out.
On the bright side, by reusing the text buffer from conhost, we gain two main benefits:
What you're doing and what's happening: When writing some right-to-left characters (Farsi, Arabic, ...) after some left-to-right characters, the rtl text goes from the cursor position into the left, causing it to be mixed with the ltr text written before.
Steps to reproduce: Open a Windows Terminal window. Write some English characters, like abcdef. Then write some Farsi/Arabic text. Notice that the Farsi text goes inside the English text.