mytbk / fqterm

GNU General Public License v2.0

Supporting wide characters not present in GBK character set #9

Open Elysion-tcfa opened 8 years ago

Elysion-tcfa commented 8 years ago

Currently FQTerm only supports 2-byte GBK characters and uses the byte length (strlen() or string::size()) to determine the display width of wide characters. However, GBK covers only a fraction of the wide characters in Unicode: it lacks many multilingual characters and emoji, which have become much more popular in recent years. Without support for these characters, the terminal may draw the screen incorrectly and leave the display garbled. I believe the correct solution is to store characters instead of bytes and to use a function to calculate their width (on Linux there is the built-in wcwidth(3), but it is not portable). I also expect that changing this logic would be a tough experience, as I found when refactoring the BDWM kernel code. Expanding from GBK to GB18030 without changing the underlying structure might be a little easier.
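To illustrate the difference between byte length and display width, here is a minimal sketch that decodes a string into code points and sums a per-character width. It is not FQTerm code: it assumes UTF-8 input for simplicity, the names `decode_utf8`, `cell_width`, and `display_width` are made up for this example, and the table of wide ranges is a small illustrative subset rather than the full Unicode East Asian Width data. Combining and control characters are not handled.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Decode UTF-8 into code points. Malformed or truncated sequences become
// U+FFFD. Overlong encodings are not rejected (kept simple on purpose).
std::vector<char32_t> decode_utf8(const std::string& s) {
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char b = static_cast<unsigned char>(s[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else { out.push_back(0xFFFD); ++i; continue; }

        bool ok = (i + extra < s.size());
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) ok = false;
            else cp = (cp << 6) | (c & 0x3F);
        }
        if (!ok) { out.push_back(0xFFFD); ++i; continue; }
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}

// Width in terminal cells: 2 for code points in the (partial, illustrative)
// wide table below, 1 for everything else.
int cell_width(char32_t cp) {
    struct Range { char32_t lo, hi; };
    static const Range wide[] = {
        {0x1100, 0x115F},    // Hangul Jamo initial consonants
        {0x3041, 0x30FF},    // Hiragana and Katakana
        {0x3400, 0x4DBF},    // CJK Unified Ideographs Extension A
        {0x4E00, 0x9FFF},    // CJK Unified Ideographs
        {0xAC00, 0xD7A3},    // Hangul Syllables
        {0xF900, 0xFAFF},    // CJK Compatibility Ideographs
        {0xFF01, 0xFF60},    // Fullwidth forms
        {0x1F600, 0x1F64F},  // Emoticons (a small part of the emoji range)
        {0x20000, 0x2FFFD},  // Supplementary Ideographic Plane
    };
    for (const Range& r : wide) {
        if (cp >= r.lo && cp <= r.hi) return 2;
    }
    return 1;
}

// Display width of a UTF-8 string, counted per character, not per byte.
int display_width(const std::string& utf8) {
    int total = 0;
    for (char32_t cp : decode_utf8(utf8)) total += cell_width(cp);
    return total;
}
```

With this approach, a CJK ideograph that takes 3 bytes in UTF-8 and an emoji that takes 4 bytes both count as 2 cells, whereas a byte-length heuristic would report 3 and 4.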

mytbk commented 6 years ago

https://github.com/mytbk/fqterm/blob/2b710519f23faa91c130d25bd5b17f4456e303c5/src/common/fqterm.h#L50-L65

It seems that after Qt 4.7, FQTerm converts characters from GB18030 to Unicode. I don't know yet how FQTerm handles the characters after that. FQTerm also has its own get_str_width(uint32_t) in src/utilities/fqwcwidth.cpp.
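For context on the GB18030 side, here is a rough sketch of how the length of one GB18030 sequence can be determined from its bytes. GB18030 characters are 1, 2, or 4 bytes long, which is why a fixed "2 bytes means 2 cells" rule stops working once 4-byte sequences appear. The function name `gb18030_seq_len` is hypothetical and this is not taken from FQTerm's source.

```cpp
#include <cstddef>
#include <string>

// Returns the byte length (1, 2, or 4) of the GB18030 sequence that starts
// at s[pos], or 0 if the bytes there are malformed or truncated.
std::size_t gb18030_seq_len(const std::string& s, std::size_t pos) {
    if (pos >= s.size()) return 0;
    unsigned char b1 = static_cast<unsigned char>(s[pos]);
    if (b1 <= 0x7F) return 1;                  // ASCII, single byte
    if (b1 < 0x81 || b1 > 0xFE) return 0;      // not a valid lead byte

    if (pos + 1 >= s.size()) return 0;         // truncated
    unsigned char b2 = static_cast<unsigned char>(s[pos + 1]);

    // Two-byte form (the GBK-compatible part): second byte 0x40-0x7E or 0x80-0xFE.
    if ((b2 >= 0x40 && b2 <= 0x7E) || (b2 >= 0x80 && b2 <= 0xFE)) return 2;

    // Four-byte form: second and fourth bytes are 0x30-0x39,
    // third byte is 0x81-0xFE.
    if (b2 >= 0x30 && b2 <= 0x39) {
        if (pos + 3 >= s.size()) return 0;     // truncated
        unsigned char b3 = static_cast<unsigned char>(s[pos + 2]);
        unsigned char b4 = static_cast<unsigned char>(s[pos + 3]);
        if (b3 >= 0x81 && b3 <= 0xFE && b4 >= 0x30 && b4 <= 0x39) return 4;
    }
    return 0;
}
```

Note that the sequence length only tells you where one character ends. The display width still has to come from the decoded code point, for example through a wcwidth-style lookup.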