HaveIBeenPwned / EmailAddressExtractor

A project to rapidly extract all email addresses from any files in a given path
BSD 3-Clause "New" or "Revised" License

Handled cases for initial chars #38

Closed: hiteshbedre closed 1 year ago

hiteshbedre commented 1 year ago

Fixes: https://github.com/HaveIBeenPwned/EmailAddressExtractor/issues/36

hiteshbedre commented 1 year ago

Here is a playground of email cases handled: https://regex101.com/r/TjPEI6/1

KonajuGames commented 1 year ago

I think we have both done the same thing. My PR #39 also fixed those and the other remaining tests, with the exception of the emoji and IDN domain name tests.

hiteshbedre commented 1 year ago

@KonajuGames yeah, we solved the same issue.

hiteshbedre commented 1 year ago

The only difference in approach is that @KonajuGames handled the cases in code, and I handled them in the regex.

KonajuGames commented 1 year ago

> The only difference in approach is that @KonajuGames handled the cases in code, and I handled them in the regex.

I extended the regex to be more generous in what it matches, so that it includes opening and closing quotes, and then filtered them out in code.
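
For illustration only, here is a minimal Python sketch of the "match generously, then filter in code" idea. It is not the code from either PR: the pattern, the name `extract_emails`, and the policy of stripping leading and trailing double quotes from the local part are all assumptions made for this example.

```python
import re

# Hypothetical, deliberately permissive pattern (an assumption for this sketch):
# the local part is allowed to contain double quotes so that quoted candidates
# are captured by the regex instead of being skipped.
LOOSE_EMAIL = re.compile(r'[A-Za-z0-9"._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+')


def extract_emails(text: str) -> list[str]:
    """Return candidate addresses with surrounding quotes removed in code."""
    results = []
    for match in LOOSE_EMAIL.finditer(text):
        candidate = match.group(0)
        # Split on the last '@' so the domain is everything after it.
        local, _, domain = candidate.rpartition("@")
        # Assumed clean-up policy: drop quotes at either end of the local part.
        local = local.strip('"')
        if local:  # skip candidates that were nothing but quotes
            results.append(f"{local}@{domain}")
    return results


if __name__ == "__main__":
    sample = 'contact "john.doe"@example.com or "jane@example.org"'
    print(extract_emails(sample))
```

Whether a quote next to the `@` should be stripped or kept is a policy decision; this sketch strips it, which may not match what either PR does.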

hiteshbedre commented 1 year ago

> closing quotes

hitesh"@abc.com is a valid scenario.