JoshData / python-email-validator

A robust email syntax and deliverability validation library for Python.
The Unlicense

Validating an email with a bad punycode domain is reported as an internal UnicodeError instead of an EmailSyntaxError #120

Closed. Ronserruya closed this issue 12 months ago

Ronserruya commented 1 year ago

When parsing a domain that contains bad punycode (at least as far as I can understand the error), the package fails to catch the underlying UnicodeError and raises it directly instead of wrapping it in a more understandable EmailSyntaxError:

from email_validator import validate_email

validate_email("ron@xn--ram--t-k1.com", check_deliverability=False)
# UnicodeError: decoding with 'punycode' codec failed (UnicodeError: incomplete punicode string)

This is the relevant code: https://github.com/JoshData/python-email-validator/blob/main/email_validator/syntax.py#L434

try:
    domain_i18n = idna.decode(ascii_domain.encode('ascii'))
except idna.IDNAError as e:
    raise EmailSyntaxError(f"The part after the @-sign is not valid IDNA ({e}).")

I guess a better error would be EmailSyntaxError(f"The part after the @-sign contains invalid punycode ({e}).")
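
A minimal sketch of the wrapping idea, using only Python's built-in punycode codec so it runs without the idna package installed. `DomainSyntaxError` and `decode_punycode_label` are hypothetical names made up for this illustration; they are not part of email_validator. Since `idna.IDNAError` is itself a subclass of `UnicodeError`, catching `UnicodeError` in the real code would presumably cover both cases.

```python
class DomainSyntaxError(ValueError):
    """Hypothetical stand-in for EmailSyntaxError, used only in this sketch."""


def decode_punycode_label(label: str) -> str:
    """Decode a single 'xn--' domain label with the standard-library codec.

    Any UnicodeError raised by the codec is re-raised as DomainSyntaxError,
    so callers see one predictable exception type.
    """
    if not label.lower().startswith("xn--"):
        return label  # not a punycode label, nothing to decode
    try:
        # Strip the 'xn--' prefix and decode the remainder.
        return label[4:].encode("ascii").decode("punycode")
    except UnicodeError as e:
        raise DomainSyntaxError(
            f"The part after the @-sign contains invalid punycode ({e})."
        ) from e


if __name__ == "__main__":
    print(decode_punycode_label("xn--mnchen-3ya"))
    try:
        decode_punycode_label("xn--ram--t-k1")
    except DomainSyntaxError as err:
        print("rejected:", err)
```

Chaining with `from e` keeps the original codec error available on `__cause__` for debugging.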

JoshData commented 1 year ago

Thanks for reporting the problem! I think this was actually a bug in the idna package that was fixed in version 3.3:

Throw IDNAError exception correctly for some malformed input

I get the error you see when I downgrade idna to version 3.2 (pip install idna==3.2) but not with version 3.3 (pip install idna==3.3). So updating idna will probably fix it.
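
To check which side of that boundary an environment is on, something like the following could work. It is a rough sketch: `idna_has_fix` is a hypothetical helper name, and the version parsing is deliberately simple (it compares only the first two numeric components).

```python
import re
from importlib.metadata import PackageNotFoundError, version


def idna_has_fix(installed: str) -> bool:
    """Return True if a version string such as '3.3' or '3.10' is at least 3.3."""
    # Take the first two numeric components, padding with zeros if needed.
    numbers = [int(n) for n in re.findall(r"\d+", installed)[:2]]
    while len(numbers) < 2:
        numbers.append(0)
    return tuple(numbers) >= (3, 3)


try:
    installed = version("idna")
    print("idna", installed, "includes the fix:", idna_has_fix(installed))
except PackageNotFoundError:
    print("idna is not installed")
```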