scandum / tintin

TinTin++, aka tt++, is an extensible console MUD client.
https://tintin.mudhalla.net
GNU General Public License v3.0

BIG5TOUTF8 character display error #39

Closed: rarealphacat closed this issue 3 years ago

rarealphacat commented 3 years ago

With BIG5TOUTF8, characters such as "一", "世" and "護" have display problems. Their corresponding Big5 codes are A440, A540 and C540. [screenshot: Screenshot_2021-04-02_00-04-23]
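The reported codes can be cross-checked outside TinTin++. A minimal sketch, assuming Python's built-in `big5` codec is a fair stand-in for the table TinTin++ uses (the two may differ, and the helper name `big5_hex` is made up for illustration):

```python
def big5_hex(ch: str) -> str:
    """Return the Big5 code of a single character as uppercase hex."""
    return ch.encode("big5").hex().upper()


if __name__ == "__main__":
    for ch in "一世護":
        code = big5_hex(ch)
        # Print the full code and its second (trail) byte separately.
        print(ch, code, "trail byte:", code[2:])
```

All three codes in the report end in the trail byte 40, which is also the ASCII code for `@`. That may or may not matter, but it is a shared property worth keeping in mind when reading converted output.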

scandum commented 3 years ago

Thanks, might be a while before I get around to this.

Would be helpful if anyone can confirm this, because I'm not a fan of blindly changing things. Could it also be a server side problem?

rarealphacat commented 3 years ago

Thank you. I've just tried two other BIG5 MUDs and they have the same display problem, so I'd suggest it is either not a server-side problem or it comes from a server setting they all share. BTW, I am using 2.02.05.

I've also tried charset BIG-5 instead, and it shows a different kind of display problem.

charset BIG5TOUTF8: [screenshot: Screenshot_2021-04-02_12-11-42]

charset BIG-5: [screenshot: Screenshot_2021-04-02_12-10-52]

Notice how BIG-5 shows "一", "世" and "護" without problems, but neither charset setting handles the final line correctly.

*edit - sorry idk how to fix the bold XD

scandum commented 3 years ago

Judging from the first screenshot it looks like the problem also manifests when translating BIG5 to UTF8.

Could you provide me with the raw BIG5 string that is causing the display problems before (Board)?

rarealphacat commented 3 years ago

The string before (Board) is "τν交流版τν(Board) [ 30 張留言 ]". The characters "τν" do not seem to be regular English letters; they should be some sort of special symbols (Greek?). [screenshot: Screenshot_2021-04-04_22-38-26]

So it shouldn't be the BIG5 string itself that causes the display problem there, as it displays properly when I repeat it with "say". Anyway, 交流版 should turn into "A5E6" "AC79" "AAA9" respectively. [screenshot: Screenshot_2021-04-04_22-39-10]

scandum commented 3 years ago

Thanks, could you tell me the sequences preceding (Board) as displayed when #config convert on has been enabled?

rarealphacat commented 3 years ago

Hi, Its as below \e[0m\e[0;1;32m廣場\e[0m - (\e[0;1;32mboard\e[0m) - \e[0m \e[0m 這裡是白石村落中唯�@的廣場,同時也是村落中最熱鬧的所在。好奇地\e[0m \e[0m四處張望,不但可以發現地面上的小石磚�@塊塊整齊地排成�@彎的圓形圖案\e[0m \e[0m,還可以看見廣場的周圍分別豎立著幾尊手工精緻的天使雕像.據說這些雕\e[0m \e[0m像都象徵著「守�@」的意涵,讓村落能永恆地受到來自天使們的守�@。此時\e[0m \e[0m來自遙遠的涼爽海風不斷地從廣場的南方吹來,讓過往的旅人都感覺十分的\e[0m \e[0m舒服之外,還能在此欣賞夏日南方那美麗的海景,在視覺上都能感受到�@種\e[0m \e[0m超然脫俗的感受,使得路經此地的行人都不由自主的想多停留�@會。廣場的\e[0m \e[0m中央豎立著�@塊巨大的交流版(board ),是�@處能讓來往的旅客們能在此\e[0m \e[0m留下個人言論意見的�@個場所,而版子的下方也能發現到有白石村落的�@些\e[0m \e[0m公告(sign)。\e[0m \e[0m 太陽斜掛在西方的天空中。\e[0m \e[0m 這裡明顯的出口是 \e[1msouth\e[2;37;0m 和 \e[1mnorth\e[2;37;0m。\e[0m \e[0m \e[1;37mξ\e[30m獨弦\e[37m哀歌\e[1;37mξ\e[2;37;0m(Emil)\e[1;35m <發呆中>\e[2;37;0m\e[0m \e[0m 「\e[37mSCP-CN-1999\e[2;37;0m」說好不掉節操(Tis)\e[1;35m <發呆中>\e[2;37;0m\e[0m \e[0m \e[1;36m耶\e[1;37m耶\e[2;37;0m(Arape)\e[1;35m <發呆中>\e[2;37;0m\e[0m \e[0m 「\e[1;37m哞哞哞哞哞哞哞\e[2;37;0m」\e[1;36m理姿\e[2;37;0m(Arliz)\e[1;35m <發呆中>\e[2;37;0m\e[0m \e[0m 「\e[1;33m 66罷韓光復高雄\e[2;37;0m」\e[1;33mE\e[1;34m冰心\e[1;33m-▽Φ\e[2;37;0m(Icyheart)\e[1;35m <發呆中>\e[2;37;0m\e[0m \e[0m 「line:shadow19850421\e[2;37;0m」\e[1;37mξ\e[30m亡\e[2;37;0m\e[37m魂歌\e[1;37m\e[30m聲\e[1;37mξ\e[2;37;0m(Lunia)\e[1;35m <發呆中>\e[2;37;0m\e[0 m \e[0m 「\e[1;31m七夕愛戀\e[2;37;0m」七夕愛戀天使(Seven night angel)\e[0m \e[0m \e[0;1;34m�\e[1mn\e[1;36m�\e[1mh\e[1;37m�\e[1m�\e[1m�\e[1my\e[1m�\e[1m�\e[1;36m�\e[1mn\e[1;34m�\e[1mh\e[0m(Board) [ 30 張留言 ]\e[0m \e[0m傍晚了, 太陽的餘暉將西方的天空染成�@片火紅。\e[0m

XMLSDK commented 3 years ago

No display problem for me using:

scandum commented 3 years ago

I took a closer look at the utf8 / big5 translation table and everything seems to check out. Maybe double check if the problem is fixed in 2.02.11, since XMLSDK has no issues.

As for the other display issue, it looks like a bunch of color codes are mixed in. Is it possible they're trying to display a Chinese character with the left half in a different color than the right half?

rarealphacat commented 3 years ago

2.02.11 seems to have fixed the first display problem.

[screenshots: Screenshot_2021-05-10_10-06-26, Screenshot_2021-05-10_10-04-46]

The second display problem, with #config convert on, looks like this:

\e[0m \e[0;1;34m�\e[1mn\e[1;36m�\e[1mh\e[1;37m�\e[1m�\e[1m�\e[1my\e[1m� \e[1m�\e[1;36m�\e[1mn\e[1;34m�\e[1mh\e[0m(Board) [ 17 張留言 ]\e[0m

It does not look like they're trying to display two colors for the same character. But afaik those aren't regular Chinese characters; they are some other special symbols (maybe they are not in the library?). I will get back to you if I can find the corresponding BIG5 codes, and thank you for the version update :bow:

scandum commented 3 years ago

In that case I assume the characters aren't in the translation table. If you get me the big5 character and corresponding unicode character I can check, and add them if they're missing.

rarealphacat commented 3 years ago

The corresponding codes appear to be as below:

char   Big5   Unicode
---------------------
τ      A36E   03C4
ν      A36F   03C5

(reference)

Can you check if they are missing? If so, it would be great if you could also add the related characters :D
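Reported pairs like these can be cross-checked before touching the table. A minimal sketch, assuming Python's built-in `big5` codec as a reference (it is not TinTin++'s own translation table and the two may disagree; `describe_big5` is a hypothetical helper name):

```python
import unicodedata


def describe_big5(code: int) -> tuple:
    """Decode a two-byte Big5 code and return (char, hex code point, Unicode name)."""
    raw = code.to_bytes(2, "big")
    ch = raw.decode("big5")
    return ch, f"{ord(ch):04X}", unicodedata.name(ch)


if __name__ == "__main__":
    for code in (0xA36E, 0xA36F):
        ch, cp, name = describe_big5(code)
        print(f"Big5 {code:04X} -> {ch} U+{cp} {name}")
```

Printing the Unicode name helps with look-alike Greek letters such as ν (nu) and υ (upsilon), which are easy to confuse by eye.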

edit: I also see these wandering around in game:

<118>big5_to_utf8: did not find big5 index '0xf9fc'
<118>big5_to_utf8: did not find big5 index '0xf9fd'

scandum commented 3 years ago

I added these definitions to the table:

index decimal    hex  unicode
-------------------------------
18924   63994  f9-fa     9581 ╭
18925   63995  f9-fb     9582 ╮
18926   63996  f9-fc     9584 ╰
18927   63997  f9-fd     9583 ╯
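For readers who want to see what those four rows mean, here is a small sketch of a lookup keyed by the Big5 code. It is only an illustration in Python, not the TinTin++ source; the names `EXTRA_BIG5` and `lookup_extra` are made up, and the values are copied from the rows above.

```python
# Big5 code -> Unicode code point (decimal), copied from the table rows above.
EXTRA_BIG5 = {
    0xF9FA: 9581,
    0xF9FB: 9582,
    0xF9FC: 9584,
    0xF9FD: 9583,
}


def lookup_extra(big5_code: int) -> str:
    """Return the Unicode character for one of the added Big5 codes."""
    code_point = EXTRA_BIG5.get(big5_code)
    if code_point is None:
        # Hypothetical error text, loosely modelled on the log lines quoted earlier.
        raise KeyError(f"did not find big5 index '0x{big5_code:04x}'")
    return chr(code_point)


if __name__ == "__main__":
    for code in sorted(EXTRA_BIG5):
        print(f"{code} {code:04x} {EXTRA_BIG5[code]} {lookup_extra(code)}")
```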

τ and ν are in the table, so I suspect the issue lies elsewhere. Having the raw data as sent by the MUD would be helpful.

My best guess is that it's mixing color codes with Big5 to display a Chinese character with each half in a different color. Maybe ask the MUD to fix it if that's the case?
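If that guess is right, a client-side workaround is at least conceivable. In the converted output posted earlier, the stray `n` and `y` after `\e[1m` match the trail bytes 6E and 79 of τ (A36E) and 流 (AC79), which is consistent with a color code sitting between the two bytes of each character. The sketch below is Python, not TinTin++ code, and rests on assumptions: lead bytes fall in 0x81-0xFE, only simple `ESC [ ... letter` sequences are handled, and `rejoin_split_big5` is a hypothetical name. It moves any escape sequence found between a lead byte and its trail byte to just before the lead byte.

```python
import re

# Simple CSI sequences such as ESC[1m or ESC[0;1;34m (assumption: no other escape types).
CSI = re.compile(rb"\x1b\[[0-9;]*[A-Za-z]")


def rejoin_split_big5(data: bytes) -> bytes:
    """Move escape sequences wedged inside a two-byte character in front of it."""
    out = bytearray()
    i, n = 0, len(data)
    while i < n:
        m = CSI.match(data, i)
        if m:  # an escape sequence between characters: copy it unchanged
            out += m.group()
            i = m.end()
            continue
        b = data[i]
        if 0x81 <= b <= 0xFE:  # assumed Big5 lead byte
            j = i + 1
            moved = bytearray()
            while True:  # collect every escape sequence before the trail byte
                m = CSI.match(data, j)
                if not m:
                    break
                moved += m.group()
                j = m.end()
            if j < n:
                out += moved
                out.append(b)
                out.append(data[j])  # trail byte now directly follows the lead byte
                i = j + 1
            else:  # dangling lead byte at end of input: keep the original order
                out.append(b)
                out += moved
                i = j
            continue
        out.append(b)  # plain single-byte character
        i += 1
    return bytes(out)


if __name__ == "__main__":
    lead, trail = "流".encode("big5")
    raw = bytes([lead]) + b"\x1b[1m" + bytes([trail]) + b"(Board)"
    print(rejoin_split_big5(raw).decode("big5"))
```

The trade-off is that a deliberate two-color character ends up in a single color. Whether that is acceptable is a design choice, and the cleaner fix is still on the MUD side.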

rarealphacat commented 3 years ago

I will try and contact the maintainer. Closing with comment as there's not much we can do from here.