Open litlighilit opened 5 days ago
I was about to fix this when I realized there is a serious question to settle first: how should errors be handled within the error handler?
One option is simply to re-call `encode` with 'strict' errors on the returned `str`, an approach also used by the `_multibytecodec` module. However, this might lead to an infinite loop (recursion), because `encode` would still invoke `error_handler` on the same `str` data.
(EDIT: using 'strict' as the fallback won't cause an infinite loop.)
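The fallback discussed above can be modeled in pure Python. This is only a rough sketch under stated assumptions: `encode_utf8_with_handler` is a hypothetical name, it covers UTF-8 only, and it is not CPython's actual C implementation. It shows why encoding the replacement with 'strict' cannot recurse: 'strict' never calls the custom handler again, so a bad replacement simply raises.

```python
def encode_utf8_with_handler(text, handler):
    """Hypothetical pure-Python model of UTF-8 encoding with a custom handler.

    Assumes the handler returns (replacement, new_position) with
    new_position pointing past the offending characters.
    """
    out = bytearray()
    pos = 0
    while pos < len(text):
        try:
            # Try to encode everything that is left in one go.
            out += text[pos:].encode("utf-8", "strict")
            break
        except UnicodeEncodeError as exc:
            # exc.start and exc.end are relative to the slice text[pos:].
            out += text[pos:pos + exc.start].encode("utf-8", "strict")
            full_exc = UnicodeEncodeError(
                "utf-8", text, pos + exc.start, pos + exc.end, exc.reason
            )
            replacement, new_pos = handler(full_exc)
            if isinstance(replacement, str):
                # Fallback: encode the replacement with 'strict'. The custom
                # handler is not consulted here, so there is no recursion;
                # an unencodable replacement raises and propagates.
                replacement = replacement.encode("utf-8", "strict")
            out += replacement
            pos = new_pos
    return bytes(out)
```

With this model, a handler that returns an encodable `str` has its replacement appended, while a handler that returns another lone surrogate causes a `UnicodeEncodeError` to propagate to the caller.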
I'll have a look at it in a few days. I think an error handler should not raise errors itself, but I need to verify. If the error handler does fail, the exception should be propagated.
Right, and it does in fact behave as expected. As noted in the edit above, my previous concern was incorrect.
Then let's return to the original issue:
I've figured out where the faulty code lies, and at least three places need to be fixed.
I'm too busy over the next two days to focus on this patch. Sorry in advance, but I'll get to it this week.
Bug report
Bug description:
For `codecs.encode` with a `utf-*` encoding and a custom `errors` handler that returns `str`, if you pass characters that are not valid UTF characters (e.g. surrogates), `UnicodeEncodeError` is simply raised, instead of the expected (and documented) behavior where the returned `str` is appended.
Output:
CPython versions tested on:
3.9, 3.11, 3.12, 3.13, 3.14
Operating systems tested on:
Linux, Windows
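The situation in the bug description can be probed with a short script along these lines. This is a sketch, not the reporter's original snippet: the handler names `sketch_ascii` and `sketch_nonascii`, the helper functions, and the choice of replacement strings are all assumptions made for illustration. The script only reports what each call does, since the outcome for a non-ASCII replacement may differ between CPython versions.

```python
import codecs


def replace_with(text):
    """Build an error handler that substitutes `text` for the unencodable span."""
    def handler(exc):
        if not isinstance(exc, UnicodeEncodeError):
            raise exc
        # Return (replacement str, position at which encoding resumes).
        return text, exc.end
    return handler


# Hypothetical handler names, registered for this sketch only.
codecs.register_error("sketch_ascii", replace_with("?"))
codecs.register_error("sketch_nonascii", replace_with("\u00e9"))


def try_encode(errors):
    """Encode a string with a lone surrogate; return the bytes or the exception."""
    try:
        return codecs.encode("a\ud800b", "utf-8", errors)
    except UnicodeEncodeError as exc:
        return exc


ascii_result = try_encode("sketch_ascii")
nonascii_result = try_encode("sketch_nonascii")
print(repr(ascii_result))
print(repr(nonascii_result))
```

Comparing the two printed results on each interpreter version listed above shows whether the returned `str` is appended or a `UnicodeEncodeError` is raised.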