JoshData / python-email-validator

A robust email syntax and deliverability validation library for Python.
The Unlicense

Unicode surrogates in local part #72

Status: Closed (Pentusha closed 2 years ago)

Pentusha commented 2 years ago

This email address should be rejected as invalid: \udcff@example.com. The local part is a lone Unicode surrogate. The library already performs this check for the domain part.
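To illustrate the report above, here is a minimal sketch of why a lone surrogate in the local part is a problem. The helper name `has_surrogates` is hypothetical and is not part of python-email-validator; the sketch only assumes standard Python string behavior.

```python
def has_surrogates(s: str) -> bool:
    """Return True if s contains a code point in the surrogate range U+D800..U+DFFF."""
    return any(0xD800 <= ord(ch) <= 0xDFFF for ch in s)


address = "\udcff@example.com"
# Split on the last "@" to separate the local part from the domain.
local_part = address.rsplit("@", 1)[0]

print(has_surrogates(local_part))

# A lone surrogate is not a valid Unicode scalar value, so the address
# cannot be encoded as UTF-8 for transmission or storage.
try:
    address.encode("utf-8")
except UnicodeEncodeError as exc:
    print("cannot encode:", exc.reason)
```

Because such a string cannot be encoded as UTF-8, it is unlikely to survive being sent to a mail server or written to most databases, which is one argument for rejecting it during validation.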

JoshData commented 2 years ago

How do you know it is supposed to be invalid?

Pentusha commented 2 years ago

This is a reasonable question. Some other validation libraries mark this kind of address as invalid, so I assumed that was the correct behavior.

JoshData commented 2 years ago

It's a good question. There may be a larger class of Unicode strings that are unsafe in some way. A recent example is the "Trojan Source" issue: https://krebsonsecurity.com/2021/11/trojan-source-bug-threatens-the-security-of-all-code/. For this library's use case, I think we can take a fairly strict approach and try to ensure that the local part of the address is safely displayable.
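One way to approximate "safely displayable" is sketched below. This is an illustration only, not the code from the fix commits: the function name `unsafe_characters` and the choice of rejected Unicode general categories are assumptions. It flags control, format (which includes the bidirectional overrides used in Trojan Source), surrogate, private-use, and unassigned characters, plus line and paragraph separators.

```python
import unicodedata

# Assumed set of general categories treated as not safely displayable:
# Cc control, Cf format, Cs surrogate, Co private use, Cn unassigned,
# Zl line separator, Zp paragraph separator.
UNSAFE_CATEGORIES = {"Cc", "Cf", "Cs", "Co", "Cn", "Zl", "Zp"}


def unsafe_characters(local_part: str) -> list:
    """Return the code points in local_part that fall in an unsafe category."""
    return [
        f"U+{ord(ch):04X}"
        for ch in local_part
        if unicodedata.category(ch) in UNSAFE_CATEGORIES
    ]


print(unsafe_characters("\udcff"))            # lone surrogate
print(unsafe_characters("user\u202ename"))    # right-to-left override
print(unsafe_characters("jos\u00e9"))         # ordinary accented letter
```

A stricter or looser category list is possible. The trade-off is between rejecting addresses that some mail system might accept and allowing strings that can be rendered misleadingly.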

JoshData commented 2 years ago

Fixed by df852f7e380745b5d9612c274a9de4d8cd641c79 and b255f9eb80b1fbe956eeba72e0b4e4fa48c7dfd8. Thanks!