rust9x / rust

Empowering everyone to build reliable and efficient software, even for Windows 9x/Me/NT/2000/XP/Vista.
https://github.com/rust9x/rust/wiki

9x/ME: Implement proper handling for console writes where character lengths differ between utf16 and target codepage #14

Open seritools opened 6 months ago

seritools commented 6 months ago

For #13 I have implemented a kind of crappy workaround. This ticket tracks actually implementing a proper solution.

The workaround: https://github.com/rust9x/rust/blob/cdf0f735b82392dbb7b4cba3ace3dcbc909ec229/library/std/src/sys/windows/stdio.rs#L205-L209

Problem description

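Whenever the number of UTF-16 code units differs from the number of bytes the same text occupies in the target codepage, the "number of characters written" reported on one side no longer matches the other side, so it cannot be mapped back to a count of input bytes.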

Solution idea

Rough idea for the conversion code

Because of the multiple lossy conversions needed on 9x/Me (subset of UTF-8 → WTF-16 → codepage), the only sensible way would be to do the codepage conversion in the stdlib code, then loop until the entire converted buffer has been written, thus confirming that the entire input buffer of up to MAX_BUFFER_SIZE bytes has been written. Only that way can we ensure that all input characters are accounted for in some capacity.
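
A minimal sketch of that idea, not the actual stdlib code. `to_codepage_lossy` and `write_converted` are hypothetical names; the conversion function is a toy stand-in for the real codepage conversion (it maps code units up to 0xFF to one byte and everything else to `?`), and `write_chunk` stands in for whatever console write call is used, assumed to report how many converted bytes it accepted.

```rust
use std::io;

/// Hypothetical stand-in for the UTF-16 -> codepage conversion step.
/// Code units <= 0xFF become a single byte, anything else becomes b'?'.
fn to_codepage_lossy(utf16: &[u16]) -> Vec<u8> {
    utf16
        .iter()
        .map(|&unit| if unit <= 0xFF { unit as u8 } else { b'?' })
        .collect()
}

/// Converts the whole input first, then loops until every converted byte
/// has been accepted by `write_chunk` (which may write only part of a chunk).
/// On success the entire input is accounted for, so `input.len()` is returned.
fn write_converted<F>(input: &str, mut write_chunk: F) -> io::Result<usize>
where
    F: FnMut(&[u8]) -> io::Result<usize>,
{
    let utf16: Vec<u16> = input.encode_utf16().collect();
    let converted = to_codepage_lossy(&utf16);

    let mut rest = &converted[..];
    while !rest.is_empty() {
        match write_chunk(rest)? {
            // A zero-length write would loop forever, so treat it as an error.
            0 => {
                return Err(io::Error::new(
                    io::ErrorKind::WriteZero,
                    "console accepted no bytes",
                ))
            }
            // Clamp in case the writer reports more than it was given.
            n => rest = &rest[n.min(rest.len())..],
        }
    }

    Ok(input.len())
}
```

Because the function only returns once the converted buffer is drained, the caller never has to translate a partial codepage byte count back into UTF-8 input bytes, which is the mapping that breaks when the lengths differ.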