Alexey-T / CudaText

Cross-platform text editor, written in Free Pascal
Mozilla Public License 2.0

Missing character, cp936 #2344

Closed akizuha closed 4 years ago

akizuha commented 4 years ago

a text.txt

Alexey-T commented 4 years ago

cp936 mapping data: https://github.com/python/pythontestdotnet/blob/master/www/unicode/CP936.TXT

The file has bytes 61 A1 47 62. As I see it, the two bytes A1 47 should convert to LEAD BYTE + LATIN CAPITAL G. How does EmEditor show this? Does it show 2 spaces?
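To make the byte grouping concrete, here is a minimal Python sketch of how a double-byte codepage reader would split those four bytes. It assumes that bytes 0x81..0xFE act as lead bytes that consume the following byte, which is how cp936 is commonly described. The function name `split_dbcs` is made up for this illustration and is not part of CudaText or EncConv.

```python
def split_dbcs(data: bytes) -> list:
    """Split bytes into single-byte and double-byte units.

    Assumption: 0x81..0xFE are lead bytes that take the next byte as a
    trail byte, whatever its value (even an ASCII letter such as 'G').
    """
    units = []
    i = 0
    while i < len(data):
        b = data[i]
        if 0x81 <= b <= 0xFE and i + 1 < len(data):
            units.append(data[i:i + 2])  # lead byte + trail byte
            i += 2
        else:
            units.append(data[i:i + 1])  # ASCII, or a dangling lead byte at the end
            i += 1
    return units


units = split_dbcs(bytes.fromhex("61A14762"))
print([u.hex().upper() for u in units])  # ['61', 'A147', '62']
```

Under that assumption the file is three units: `a`, the pair A1 47, and `b`. What the editor displays then depends only on what the pair A1 47 maps to.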

Alexey-T commented 4 years ago

SynWrite shows a square character. (Screenshot from 2019-12-27 21-38-16)

Alexey-T commented 4 years ago

This doesn't seem to be a bug.

akizuha commented 4 years ago

EmEditor shows a full-width space, the same as the Windows built-in Notepad (if the system is configured for cp936).

If CudaText shows nothing, how can I know there is something there?

Alexey-T commented 4 years ago

@dinkumoil @vhanla Cud needs to show something here. Any thoughts?

Alexey-T commented 4 years ago

https://github.com/Alexey-T/EncConv/blob/master/encconv/encconv_asiancodepages.inc This code has the array cp936CC, which doesn't have an entry for $A147 (the 2 bytes of the missing char). So EncConv returns 0 for $A147, and the char disappears. @vhanla Any thoughts?

Alexey-T commented 4 years ago

OK, fixed in EncConv. The next version of Cud will show "a?b", i.e. "?" for all broken chars in DBCS codepages.
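For readers who want to see the before and after behavior, the following Python sketch illustrates the idea only. It is not the EncConv code (which is Free Pascal), and `TABLE` and `decode_dbcs` are hypothetical names. The table is a one-entry stand-in for the real cp936CC array, and every byte >= 0x80 is treated as a lead byte for simplicity.

```python
# Hypothetical stand-in for a DBCS lookup table. The real cp936CC array
# in EncConv is far larger and is not reproduced here.
TABLE = {0xA1A1: "\u3000"}  # A1 A1 is IDEOGRAPHIC SPACE in cp936


def decode_dbcs(data: bytes, fallback: str = "?") -> str:
    """Decode bytes, substituting `fallback` for unmapped byte pairs."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:  # plain ASCII byte
            out.append(chr(b))
            i += 1
            continue
        if i + 1 < len(data):  # lead byte + trail byte
            code = (b << 8) | data[i + 1]
            ch = TABLE.get(code, "")  # a missing entry yields nothing
            i += 2
        else:  # dangling lead byte at end of input
            ch = ""
            i += 1
        out.append(ch or fallback)
    return "".join(out)


data = bytes.fromhex("61A14762")
print(repr(decode_dbcs(data, fallback="")))  # 'ab'  (char silently dropped)
print(repr(decode_dbcs(data)))               # 'a?b' (broken char made visible)
```

Passing an empty fallback mimics the old behavior, where the unmapped pair vanished. The default "?" mirrors the behavior described for the next version.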