HaveIBeenPwned / EmailAddressExtractor

A project to rapidly extract all email addresses from any files in a given path
BSD 3-Clause "New" or "Revised" License
64 stars 23 forks

Optimised multiple checks into a single pattern + handled some checks around underscore and single quote #34

Closed: hiteshbedre closed this 1 year ago

hiteshbedre commented 1 year ago

Fixes: https://github.com/HaveIBeenPwned/EmailAddressExtractor/issues/5

| Sample | Is valid? |
| --- | --- |
| foobar@c_m.com | Yes |
| foobar@_.com | No |

hiteshbedre commented 1 year ago

Fixes: https://github.com/HaveIBeenPwned/EmailAddressExtractor/issues/33

GStefanowich commented 1 year ago

You should add this as another GeneratedRegex:

[GeneratedRegex(@"[a-z0-9\.\-!#$%&'+/=?^_`{|}~""\\]+(?<!\.)@([a-z0-9\-_]+\.)+[a-z0-9]{2,}\b(?<!\s)", RegexOptions.IgnoreCase | RegexOptions.CultureInvariant | RegexOptions.Compiled)]
public static partial Regex EmailRegex();

[GeneratedRegex(@"\.\.|\*|\.@|^\.|@-", RegexOptions.Compiled)]
public static partial Regex InvalidRegex();
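
For readers following along, here is a minimal sketch of how the two generated patterns above might be combined: find candidates with EmailRegex, then drop any candidate that InvalidRegex flags. The wrapper class `EmailPatterns`, the `Extract` helper, and the sample input are hypothetical names used only for illustration, and this two-pass flow is an assumption about usage rather than the project's actual code. It assumes .NET 7 or later, where `GeneratedRegex` is available.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical wrapper class; the name does not come from the thread.
public static partial class EmailPatterns
{
    // Patterns and options copied verbatim from the suggestion above.
    [GeneratedRegex(@"[a-z0-9\.\-!#$%&'+/=?^_`{|}~""\\]+(?<!\.)@([a-z0-9\-_]+\.)+[a-z0-9]{2,}\b(?<!\s)", RegexOptions.IgnoreCase | RegexOptions.CultureInvariant | RegexOptions.Compiled)]
    public static partial Regex EmailRegex();

    [GeneratedRegex(@"\.\.|\*|\.@|^\.|@-", RegexOptions.Compiled)]
    public static partial Regex InvalidRegex();

    // Assumed two-pass flow: collect candidates, then reject flagged ones.
    public static IEnumerable<string> Extract(string text)
    {
        foreach (Match candidate in EmailRegex().Matches(text))
        {
            // InvalidRegex is applied to the candidate alone, so its ^ anchor
            // refers to the start of the address, not the start of the input.
            if (!InvalidRegex().IsMatch(candidate.Value))
            {
                yield return candidate.Value;
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Illustrative input: the second address contains "@-", which InvalidRegex rejects.
        const string sample = "contact: foobar@c_m.com, bad: foobar@-example.com";

        foreach (string address in EmailPatterns.Extract(sample))
        {
            Console.WriteLine(address);
        }
    }
}
```

Keeping the rejection rules in a second pattern leaves the main expression simpler, at the cost of one extra pass over each candidate.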

hiteshbedre commented 1 year ago

@GStefanowich I will accommodate the suggestion in an upcoming pull request.

GStefanowich commented 1 year ago

@hiteshbedre I made the change already in 99a360c, so it's all set