Open byundojin opened 1 month ago
This issue is more complex than it looked.
This assertion is not simply redundant, it is wrong. It is not normally triggered because this code is not normally executed. Actually, I do not know how to trigger this situation. But if comment out && n < wlen
, the assertion will fail when write non-ASCII data.
Using WideCharToMultiByte()
to calculate the offset in bytes from the number of wchars that was written is not correct either. Invalid UTF-8 byte produces \ufffd which will be converted back to 3-bytes UTF-8 sequence \xef\xbf\xbd. So you can get wrong len
, even larger than the initial length.
There are other bugs here, for example, no MemoryError is raised when PyMem_Malloc()
or other memory allocation functions fail.
WriteConsoleW()
writing less characters than provided?Note also that you should run the new test without using libregrtest, because it is guarded by non-standard resource "console":
python.bat -m test.test_winconsoleio -v
Feature or enhancement
Proposal:
If function
_io__WindowsConsoleIO_write_impl
does not produce as expected output from WriteConsoleW, it is the process of recalculating the existing len.At this point, function WideCharToMultiByte finds the byte length of utf-8 and function MultiByteToWideChar finds the letter length of utf-16. And is this comparison valid for comparing the results of these two?
Has this already been discussed elsewhere?
I have already discussed this feature proposal on Discourse
Links to previous discussion of this feature:
https://discuss.python.org/t/is-the-size-comparison-of-utf-8-and-utf-16-valid/63614
Linked PRs