Open tjensen opened 3 months ago
This limitation on Windows is because the error handler of the filesystem encoding is required to be "surrogatepass" instead of "surrogateescape". In principle, builtin nt._path_splitroot_ex() could handle UnicodeDecodeError, or any other ValueError, by using the C API to call ntpath._splitroot_fallback(). This would require enabling the suppress_value_error option of the path_t argument converter.
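To illustrate the difference between the two error handlers, here is a small sketch using plain UTF-8 codec calls. It does not go through the real filesystem-encoding machinery, and the byte b"\xff" is just an arbitrary example of input that is not valid UTF-8.

```python
raw = b"\xff"  # arbitrary example: a lone byte that is not valid UTF-8

# "surrogateescape" maps each undecodable byte to a lone surrogate
# (U+DC80..U+DCFF), so arbitrary bytes round-trip losslessly.
escaped = raw.decode("utf-8", "surrogateescape")
print(ascii(escaped))                                      # '\udcff'
print(escaped.encode("utf-8", "surrogateescape") == raw)   # True

# "surrogatepass" only allows surrogate code points to be encoded and
# decoded in their UTF-8-style form; it does not rescue invalid bytes.
print(escaped.encode("utf-8", "surrogatepass"))            # b'\xed\xb3\xbf'
try:
    raw.decode("utf-8", "surrogatepass")
except UnicodeDecodeError as exc:
    print("surrogatepass cannot decode", raw, "-", exc.reason)
```

Under "surrogatepass" a byte like b"\xff" has no valid decoding, which would be consistent with the UnicodeDecodeError described in this issue.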
> the error handler of the filesystem encoding is required to be "surrogatepass" instead of "surrogateescape".
Why have we never noticed this before? We can just fix that, I believe - the filesystem encoding on Windows is just a compatibility hack to support POSIX developers (I'm pretty sure I wrote something to that effect in PEP 528 or 529 or whichever one it was).
Bug report
Bug description:
The ntpath.splitroot function appears to have changed in Python 3.13 such that it now raises a UnicodeDecodeError when the given pathname is a bytes object containing invalid Unicode characters, but only when running on Windows:
The same code works without raising on Windows when using Python 3.12:
The same code also works without raising on Linux when using Python 3.13 or 3.12:
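A minimal sketch of the kind of call described above. The byte path is an illustrative assumption, not the input from the report, and try_splitroot is a hypothetical helper name. Note that ntpath.splitroot requires Python 3.12 or later.

```python
import ntpath


def try_splitroot(path):
    """Return ntpath.splitroot(path), or None if it raises UnicodeDecodeError."""
    try:
        return ntpath.splitroot(path)
    except UnicodeDecodeError as exc:
        # Reported behaviour on Windows with Python 3.13.
        print("UnicodeDecodeError:", exc)
        return None


if __name__ == "__main__":
    # Hypothetical input: a Windows-style bytes path with a non-UTF-8 byte.
    sample = b"C:\\dir\\\xff.txt"
    print(try_splitroot(sample))
```

Per the report, a call like this would be expected to return a (drive, root, tail) tuple on Linux and on Windows with Python 3.12, and to hit the except branch on Windows with Python 3.13.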
CPython versions tested on:
3.12, 3.13
Operating systems tested on:
Linux, Windows