Error in email module exception handling

naga0001 commented 8 months ago

https://github.com/python/cpython/blob/d0c32ae419316cb0d6f06ec8cb2f6b91a878070f/Lib/email/_header_value_parser.py#L782-L784

This except clause looks incorrect to me:

    except (LookupError, UnicodeEncodeError):

Shouldn't it be the following instead?

    except (LookupError, UnicodeDecodeError):

pochmann3 commented 8 months ago

Why do you say it's incorrect? Because there's only a decode call? I can reproduce a UnicodeEncodeError with a decode:

Traceback (most recent call last):
  File "/ATO/code", line 1, in <module>
    b''.decode('\udce2')
UnicodeEncodeError: 'utf-8' codec can't encode character '\udce2' in position 0: surrogates not allowed

Got that from the commit that added the UnicodeEncodeError catch. And the issue also showed a UnicodeEncodeError.
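To make the reproduction above easier to try, here is a minimal sketch. It assumes the linked lines decode a parameter value with a caller-supplied charset and fall back to ASCII on failure. The helper name decode_param and the 'us-ascii' fallback are illustrative assumptions, not a copy of the email module's code.

```python
def decode_param(value: bytes, charset: str) -> str:
    """Hypothetical helper: decode value with charset, falling back to ASCII."""
    try:
        return value.decode(charset, 'surrogateescape')
    except (LookupError, UnicodeEncodeError):
        # LookupError: the charset name is not a known codec.
        # UnicodeEncodeError: the charset *name* contains a lone surrogate,
        # so it cannot be encoded to UTF-8 before the codec lookup happens.
        return value.decode('us-ascii', 'surrogateescape')  # assumed fallback


# A charset name containing a lone surrogate triggers UnicodeEncodeError,
# even though the only call made is decode().
result_surrogate_name = decode_param(b'abc', '\udce2')

# An unknown charset name triggers LookupError.
result_unknown_name = decode_param(b'abc', 'no-such-charset')
```

Both calls end up in the except branch, so the UnicodeEncodeError here comes from the charset name, not from the bytes being decoded.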

I tried your change anyway, and it didn't make a test fail. Maybe a test should be added for that? (Maybe one was added but doesn't actually test it?)

naga0001 commented 8 months ago

Sorry, I assumed it should be UnicodeDecodeError because the code only calls decode. Is handling UnicodeDecodeError unnecessary here?

pochmann3 commented 8 months ago

I think the surrogateescape error handler indeed makes catching UnicodeDecodeError unnecessary there, but I'm no expert. You could also try asking on Discourse.
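As a rough illustration of that point, the sketch below shows what the surrogateescape handler does with a byte the codec cannot decode. It only demonstrates the ASCII codec with a made-up sample value, so it should not be read as proof that every codec behaves the same way.

```python
raw = b'caf\xe9'  # sample input (assumption): 0xE9 is not valid ASCII

# With the default 'strict' handler, decoding fails.
try:
    raw.decode('ascii')
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# With 'surrogateescape', the undecodable byte 0xE9 is mapped to the
# lone surrogate U+DCE9 instead of raising UnicodeDecodeError.
escaped = raw.decode('ascii', 'surrogateescape')

# Encoding with the same handler restores the original bytes.
restored = escaped.encode('ascii', 'surrogateescape')
```

The handler maps each undecodable byte in the range 0x80 to 0xFF to a code point in U+DC80 to U+DCFF, which is why no decode error surfaces in this case.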

Error in email module exception handling #116705