HaveIBeenPwned / EmailAddressExtractor

A project to rapidly extract all email addresses from any files in a given path
BSD 3-Clause "New" or "Revised" License

Support for IDN domain names #23

Open yusefren opened 1 year ago

yusefren commented 1 year ago

IDN domain names should be supported. i❤️.ws is a valid, working domain, and بريد@موقع.شبكة should be a valid email address.
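For anyone unfamiliar with how IDNs work on the wire, here is a minimal Python sketch (not code from this repository) showing that the Arabic domain above maps to an ASCII "punycode" form. The helper name `to_ascii_address` is hypothetical. It relies on Python's built-in `idna` codec, which implements the older IDNA 2003 rules, so emoji domains such as i❤️.ws may not be handled the same way. The local part is left as-is.

```python
# Hypothetical helper: convert only the domain part of an address to its
# ASCII-compatible (punycode) form using Python's built-in "idna" codec.
def to_ascii_address(address: str) -> str:
    # Split on the last "@" so the local part is kept untouched.
    local, _, domain = address.rpartition("@")
    ascii_domain = domain.encode("idna").decode("ascii")
    return f"{local}@{ascii_domain}"


address = "بريد@موقع.شبكة"
converted = to_ascii_address(address)
print(converted)
```

Each label of the converted domain starts with `xn--`, which is the form an extractor might also encounter in breach data.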

troyhunt commented 1 year ago

Hey @yusefren, thanks very much for adding this. I've just committed 2 failing tests for this in 6944aa836de4e24c34729b658888b5ca425b5f7f.

That said (and for anyone who wants to take a stab at implementing this), we're talking about extremely rare edge cases here in terms of the email addresses I see in data breaches. I suspect that anyone attempting to use one of these addresses on most sites would find them rejected, and I'd be surprised if they were in broad use. By all means implement this idea, just make sure it doesn't hurt the performance of parsing out the most common stuff.
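One possible way to keep the common case fast, sketched in Python purely as an illustration (this is not how the project is implemented, and both patterns are simplified assumptions rather than full address grammars): run a strict ASCII pattern on lines that are pure ASCII, and fall back to a Unicode-aware pattern only when a line contains non-ASCII characters.

```python
import re

# Simplified, hypothetical patterns for illustration only.
ASCII_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")
# In Python 3, \w on str patterns matches Unicode letters and digits.
UNICODE_EMAIL = re.compile(r"[\w.%+-]+@[\w-]+(?:\.[\w-]+)+")


def extract(line: str) -> list[str]:
    # Fast path: the vast majority of lines are pure ASCII.
    if line.isascii():
        return ASCII_EMAIL.findall(line)
    # Slow path: only lines with non-ASCII characters reach the broader pattern.
    return UNICODE_EMAIL.findall(line)


print(extract("mail test@example.com, ok"))
print(extract("contact بريد@موقع.شبكة now"))
```

Note that `\w` does not match emoji, so a domain like i❤️.ws would need either its punycode form or a broader character class.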