Maximus5 / ConEmu

Customizable Windows terminal with tabs, splits, quake-style, hotkeys and more
https://conemu.github.io/
BSD 3-Clause "New" or "Revised" License

Support BiDirectional text #1665

Open XVilka opened 6 years ago

XVilka commented 6 years ago

https://gist.github.com/XVilka/a0e49e1c65370ba11c17

Related to https://wpdev.uservoice.com/forums/266908-command-prompt-console-windows-subsystem-for-l/suggestions/34937857-bidirectional-text-support

Maximus5 commented 6 years ago

I know about the problem, but since I don't speak any RTL language, it's hard for me to read the text and judge the proper layout and direction.

It would help a lot if you could create a reliable test, plus screenshots of the expected and the wrong output. For example, a simple "hello world" in any programming language.

Also, it's not clear to me what the expected behavior is for English text containing RTL parts. It would help if you could point to a program that can be used as a reference. I saw Konsole in your gist, but it's rather hard to test and compare against.

XVilka commented 6 years ago

OK, will do.

karliss commented 6 years ago

It seems that, the way Unicode describes bidirectional text, printing characters in the correct order requires knowing the full text (or chunk of text) in advance. That is not really practical for a terminal: printing characters incrementally can change the positions of previously printed characters.

Judging from https://www.arabeyes.org/ArabeyesTodo#Terminal_Emulators, no one really knows how bidirectional text should interact with the various terminal control operations, such as moving or reading the cursor position, clearing to the end of the line, and others. What happens if you replace a character in the middle of previously printed text?
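A minimal sketch of that last point, under loud assumptions: `naive_visual_order` and `column_of` are hypothetical helpers, the base direction is fixed to LTR, and only maximal runs of strong RTL characters are reversed. This is a deliberate simplification and not the full Unicode Bidirectional Algorithm. It only shows that a character already "printed" gets a different screen column each time another RTL character arrives.

```python
import unicodedata

RTL_CLASSES = {"R", "AL"}  # strong right-to-left bidi classes


def naive_visual_order(logical: str) -> list[int]:
    """Return logical indices in left-to-right display order.

    Simplification (NOT the full UBA): the base direction is LTR and each
    maximal run of strong-RTL characters is reversed in place.
    """
    order = []
    i = 0
    while i < len(logical):
        if unicodedata.bidirectional(logical[i]) in RTL_CLASSES:
            j = i
            while j < len(logical) and unicodedata.bidirectional(logical[j]) in RTL_CLASSES:
                j += 1
            order.extend(range(j - 1, i - 1, -1))  # reversed run
            i = j
        else:
            order.append(i)
            i += 1
    return order


def column_of(logical: str, index: int) -> int:
    """Screen column (0-based) of the character stored at logical `index`."""
    return naive_visual_order(logical).index(index)


# "ab " followed by Hebrew alef, bet, gimel, arriving one character at a time.
text = "ab \u05d0\u05d1\u05d2"
for n in range(4, len(text) + 1):
    # Column of the alef (logical index 3) after n characters were received.
    print(n, column_of(text[:n], 3))
```

In this model the alef starts in column 3 and is pushed to columns 4 and 5 as the next two letters arrive, so a terminal that draws each character once, at the cursor, cannot produce the final layout without redrawing.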

Maximus5 commented 6 years ago

Exactly. Well, a terminal may treat each line of output as a text chunk (which may be incorrect in editors like Vim) and keep a bimap between each character's physical index in the line and its position on screen. But I don't know how home/end/clear must behave. Also, there may be a problem with formatted output: mc, Far Manager, ls/dir, etc.
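The per-line bimap idea could be sketched roughly as follows. Everything here is an assumption made for illustration and not ConEmu code: `build_bimap` is a hypothetical helper, every character is taken to occupy exactly one cell (no wide or combining characters), the base direction is LTR, and the reordering is a simple reversal of strong-RTL runs in place of the real Unicode algorithm.

```python
import unicodedata


def is_rtl(ch: str) -> bool:
    """True for characters with a strong right-to-left bidi class."""
    return unicodedata.bidirectional(ch) in ("R", "AL")


def build_bimap(line: str) -> tuple[list[int], list[int]]:
    """Return (logical_to_visual, visual_to_logical) for one line of output.

    logical_to_visual[i] is the screen column of the character stored at
    index i; visual_to_logical[c] is the stored index shown in column c.
    """
    visual_to_logical = []
    i = 0
    while i < len(line):
        j = i + 1
        if is_rtl(line[i]):
            while j < len(line) and is_rtl(line[j]):
                j += 1
            visual_to_logical.extend(reversed(range(i, j)))  # RTL run, mirrored
        else:
            visual_to_logical.append(i)
        i = j

    logical_to_visual = [0] * len(line)
    for column, index in enumerate(visual_to_logical):
        logical_to_visual[index] = column
    return logical_to_visual, visual_to_logical


# A file listing entry with a Hebrew name (alef, bet, gimel) and an extension.
line = "dir \u05d0\u05d1\u05d2.txt"
l2v, v2l = build_bimap(line)
print(l2v)  # where each stored character is drawn
print(v2l)  # which stored character a given column (e.g. a mouse click) hits
```

Such a map answers "where is character i drawn" and "what is under column c", but it has to be rebuilt whenever the line changes, and it does not by itself say what home, end, or clear-to-end-of-line should mean.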

karliss commented 6 years ago

It turns out the ECMA-48 standard has a little information about how bidirectional text should interact with terminal control sequences: http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-048.pdf I haven't read it carefully, so I can't comment on whether it's usable or implementable, or how well it matches the bidirectional text logic that was added to Unicode after it was written.

faridcher commented 5 years ago

The bidirectional console (BiCon) might be relevant here: https://github.com/behdad/bicon

Konsole and mlterm (http://mlterm.sourceforge.net) are among the terminals that support bidirectional languages.

XVilka commented 5 years ago

Just an update: a new console BiDi specification was recently implemented in libvte by @egmontkob: https://terminal-wg.pages.freedesktop.org/bidi/implementations.html#vte

See also the issue in the new Windows Terminal: https://github.com/microsoft/terminal/issues/538

ilius commented 1 year ago

Thank you for this project. It's especially nice that it can properly render Arabic words.

(when I say Arabic it also applies to Persian, Urdu and other languages that use Arabic script)

If one or more Arabic words or phrases have a different color, or use any ANSI formatting, the order of words in that paragraph gets messed up. For example, if the logical order is [phrase1] [phrase2] [phrase3] and all phrases are Arabic words, then it is shown as [phrase3] [phrase2] [phrase1], which is correct because Arabic is RTL (right-to-left). But if phrase2 has a color or other formatting, it is shown as [phrase1] [phrase2] [phrase3], which is left-to-right. This gets much worse if the phrases have multiple words, because each phrase (within the same formatting block) is still RTL, so you can't even read the whole paragraph from left to right (let alone normally), and it becomes unreadable.

I suggest you try to disable the existing RTL support if any formatting exists in a paragraph, so that words are shown left to right, just like in Git Bash/Mintty or Windows cmd.exe. This would make it much more bearable.

You can use Python to reproduce this (and compare it with Git Bash):
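The symptom described above can be modeled at word level. This is only a sketch of the reported behavior, not ConEmu's actual rendering code: the function names are hypothetical, `w1`, `w2`, ... are placeholders for Arabic words, and the assumption is that each block of text between ANSI SGR sequences is laid out right to left on its own while the blocks themselves are placed left to right.

```python
import re

# ANSI SGR ("Select Graphic Rendition") sequences such as ESC[38;5;1m or ESC[0m.
SGR = re.compile(r"\x1b\[[0-9;]*m")


def formatting_blocks(line: str) -> list[str]:
    """Split a line into the text blocks that sit between SGR sequences."""
    return [block for block in SGR.split(line) if block.strip()]


def expected_word_order(line: str) -> list[str]:
    """Words as they should appear left to right for an all-RTL paragraph."""
    return SGR.sub("", line).split()[::-1]


def reported_word_order(line: str) -> list[str]:
    """Words left to right when every formatting block is reordered alone."""
    shown = []
    for block in formatting_blocks(line):
        shown.extend(block.split()[::-1])  # RTL inside the block only
    return shown


red, reset = "\x1b[38;5;1m", "\x1b[0;0;0m"
plain = "w1 w2 w3"
colored = f"w1 {red}w2{reset} w3"
multi = f"w1 w2 {red}w3 w4{reset} w5 w6"

print(expected_word_order(plain), reported_word_order(plain))
print(expected_word_order(colored), reported_word_order(colored))
print(expected_word_order(multi), reported_word_order(multi))
```

With no formatting both functions agree. With one colored word the modeled output runs left to right, and with multi-word blocks it mixes both directions, which matches the "unreadable" case in the report.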

>>> red = "\x1b[38;5;1m"  # SGR: 256-color foreground, palette index 1 (red)
>>> reset = "\x1b[0;0;0m"  # SGR: reset all attributes

>>> words = [f"کلمه{i+1}" for i in range(3)]

>>> print(" ".join(words))

>>> print(words[0] + " " + red + words[1] + reset + " " + words[2])

You can see both ConEmu and Git Bash (Mintty) here:

[Screenshot: ConEmu-arabic-bug]