FarGroup / FarManager

File and Archive Manager
https://farmanager.com
BSD 3-Clause "New" or "Revised" License

Far x86 (5803) Windows XP SP3 - artifacts when displaying Japanese filenames #410

Open de77 opened 3 years ago

de77 commented 3 years ago

When displaying long Japanese filenames, Far seems to miscalculate the length of the names, and the pseudo-graphic separators end up in the wrong places. I took a screenshot of such a case, see https://rapidshare.io/U2k/far_artefacts.bmp. This is not a new issue: I saw the same artifacts in the old Far 2.0.1807.

de77 commented 3 years ago

It looks like the miscalculation happens when the selection is not set on such a file. When such a file is selected, the separators are placed where they should be. In the lower panel, where additional information such as file size and creation date and time is displayed, empty boxes are sometimes shown instead of kanji or kana. They become kanji and kana again if you minimize Far to the tray and then maximize it back.

de77 commented 3 years ago

Far 3.0.5809 x86 shows the same behavior.

alabuzhev commented 3 years ago

It's such a long, complex, interesting and sad story. I'd better not write it down here. Maybe another time.

5809 made a few baby steps in that direction. Enable [x] Fullwidth-aware rendering in Interface settings to see the difference.

I'm aware that it doesn't cover everything, that selection is all wrong etc. Bug reports aren't welcome yet. Maybe I'll continue when I have time and inspiration (and forget all the sadness).

If you want something acceptable for daily usage, consider ConEmu: it does an amazing job here.

de77 commented 3 years ago

Thanks. I enabled Fullwidth-aware rendering: now there are no artifacts on the screen and the separators are displayed where they should be, but... all the kana and kanji are doubled. I.e. the file "[R4F] 真宵が音楽に合わせて阿良々木さんの名前を呼ぶ動画 音MAD.mp4" is displayed as "[R4F] 真真宵宵がが音音楽楽にに合合わわせせてて阿阿良良々々木木ささんんのの名名前前をを呼呼ぶぶ動動画画 音音MAD.mp4". When I select that name and press Ctrl+Ins, I get the correct name in the clipboard, without any doubled symbols. It's almost done.

Bug reports aren't welcome yet.

OK, should I close this issue for now?

alabuzhev commented 3 years ago

all the kana and kanji doubled

Told you, it's a sad story, but whatever.

Internally we don't print single strings, but rectangular blocks of text, or 2D matrices of characters. "Rectangular" implies that the number of characters in every row is the same. Obviously, it can't be the same if some characters have double width and occupy two cells. And here comes the clever bit - the "official" (and barely documented) way is doubling such characters and marking the first one as COMMON_LVB_LEADING_BYTE and the second one as COMMON_LVB_TRAILING_BYTE. This way the matrix remains rectangular, but when the console drawing system encounters such a combination, it draws the first character as usual using two cells and then completely skips the second one, thus restoring the balance. And this is what 5809 does.
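To make the doubling trick concrete, here is a rough Python simulation of it. It is only a sketch, not Far's actual code: the helper names `is_fullwidth`, `to_cells` and `render` are made up for illustration, and deciding width via `unicodedata.east_asian_width` is an assumption (the real console may use different rules). Only the two flag values come from the Windows console headers.

```python
import unicodedata

# Attribute flag values as defined in the Windows console headers.
COMMON_LVB_LEADING_BYTE = 0x0100
COMMON_LVB_TRAILING_BYTE = 0x0200


def is_fullwidth(ch):
    """Assumption: East Asian Wide (W) and Fullwidth (F) characters take two cells."""
    return unicodedata.east_asian_width(ch) in ("W", "F")


def to_cells(text, width):
    """Build one fixed-width matrix row of (char, attributes) cells.

    A fullwidth character is stored twice: once flagged as leading,
    once flagged as trailing, so every row keeps exactly `width` cells.
    """
    cells = []
    for ch in text:
        needed = 2 if is_fullwidth(ch) else 1
        if len(cells) + needed > width:
            break  # the character does not fit, truncate the row here
        if needed == 2:
            cells.append((ch, COMMON_LVB_LEADING_BYTE))
            cells.append((ch, COMMON_LVB_TRAILING_BYTE))
        else:
            cells.append((ch, 0))
    # Pad with spaces so the matrix stays rectangular.
    cells.extend((" ", 0) for _ in range(width - len(cells)))
    return cells


def render(cells, fullwidth_aware):
    """Simulate what ends up on screen for one row.

    A fullwidth-aware console skips cells flagged as trailing;
    a console that ignores the flags draws every cell.
    """
    out = []
    for ch, attrs in cells:
        if fullwidth_aware and attrs & COMMON_LVB_TRAILING_BYTE:
            continue
        out.append(ch)
    return "".join(out)


row = to_cells("音MAD", 8)
print(len(row))            # the row always has exactly 8 cells
print(render(row, True))   # flags honoured: the kanji appears once
print(render(row, False))  # flags ignored: the kanji appears twice
```

The last line models the doubled output reported above: the matrix is correct, but a console that does not honour the flags draws both copies.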

However, this witchcraft only works when the stars align properly: you might need a CJK codepage in your console (e.g. chcp 932), which might require a CJK system locale (e.g. Japanese). It can probably be affected by the font as well. These requirements seem to be somewhat relaxed in Windows 10 and the situation is better there, and even better in the new Terminal.
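For anyone who wants to check the codepage part programmatically, a minimal sketch follows. `GetConsoleOutputCP` is a real kernel32 function; the helper names and the choice of which codepages count as "CJK" (932, 936, 949, 950) are assumptions made for this illustration, and the check says nothing about the system locale or the font.

```python
import sys

# Common Windows legacy codepages for East Asian languages (illustrative list).
CJK_CODEPAGES = {
    932: "Japanese (Shift-JIS)",
    936: "Simplified Chinese (GBK)",
    949: "Korean",
    950: "Traditional Chinese (Big5)",
}


def console_output_codepage():
    """Return the console output codepage, or None when not running on Windows."""
    if sys.platform != "win32":
        return None
    import ctypes
    # Returns 0 if the process has no console attached.
    return ctypes.windll.kernel32.GetConsoleOutputCP()


def is_cjk_codepage(cp):
    """True if `cp` is one of the CJK codepages listed above."""
    return cp in CJK_CODEPAGES


cp = console_output_codepage()
print(cp, is_cjk_codepage(cp))
```

On a console set to codepage 866 the check would come out false, and after a successful chcp 932 it would come out true.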

What locale, codepage and font are you using?

OK, should I close this issue for now?

Let's keep it open for a while. "Aren't welcome" means that, in general, any issue in this area is not a minor oversight that is easy to fix, but a rabbit hole.

de77 commented 3 years ago

What locale, codepage and font are you using?

chcp reports that codepage 866 is set in the console. In Far, codepage 1251 is set as ANSI and 866 as OEM. The terminal font is Lucida Console. Support for the Japanese language is installed.

I tried to run "chcp 932" in the console under a Far instance and got the message "Указана недопустимая кодовая страница" ("Invalid code page").

I will try to evaluate ConEmu as you suggested.

Thank you for your hard work.

OCTAGRAM commented 3 years ago

What locale, codepage and font are you using?

Windows 10 changed the way the locale is set up. It is not a single setting now. Instead, it is a list of preferences. But programs need to use the new WinAPI to fetch this list, and the old WinAPI returns something else, something ridiculous, which may not reflect the user's intent.

In my experience, the console locale can be a well-hidden setting called "Codepage for programs not supporting Unicode". I had a program that is indeed a Unicode one, and I have the ANSI=UTF-8 checkbox enabled anyway, so I did not bother much about this setting. But when some program requested the currently active locale, the wrong one was selected:

https://github.com/HeidiSQL/HeidiSQL/issues/1271

alabuzhev commented 3 years ago

@OCTAGRAM, Windows 10 changed a lot of things. Not all for the best, but in this area, as I mentioned earlier, the situation is much better now: conhost is fullwidth-aware by default in any locale, you only need to select a proper font (or change the console codepage to pick a matching font automatically, e.g. chcp 932). And in Windows Terminal, despite all the other broken things, it just works out of the box, which is awesome.

The question was about Windows XP, where drawing fullwidth as fullwidth is impossible in non-East Asian locales.

de77 commented 3 years ago

I tested version 3.0.5814: there are no visible changes from the previous version when turning fullwidth-aware mode on and off. Perhaps these screenshots will be useful (they were made with Far 3.0.5810, but I rechecked with Far 3.0.5814 and the behavior has not changed). See https://rapidshare.io/Ubv/Far.zip

alabuzhev commented 3 years ago

As I said earlier, you won't see any improvements in XP unless you have an East Asian locale.

OCTAGRAM commented 3 years ago

@de77 What is the current setting of "Codepage for non-Unicode programs"?

I wonder whether changing that is enough to get a CJK locale.

de77 commented 3 years ago

@OCTAGRAM, thanks. I tried what you suggested; the behavior changed, but still no luck. Now chcp, run under a Far session, reports that codepage 932 is set. I can still see artifacts, but "Fullwidth-aware rendering" now works differently. See https://rapidshare.io/Uhx/Far.zip