Closed ggsmith842 closed 2 weeks ago
Subdomains don’t seem to be validated.
I guess we shouldn’t use Re to validate an email address, but rather a library like Emile.
@F-Loyer you could change the regex pattern to account for the special characters. [a-zA-Z0-9.$_!]+@[a-zA-Z0-9-]+.[a-z]{2,3} captures your email on Regex101. I added "-" to the second character set. The example uses a pretty simple pattern but part of why I included the second pattern is to show how you can update the regex pattern and use it without needing to change any other code.
I think using Re is still a valid way to verify email patterns. It would be cool to see how Emile can be used as well! You should add a recipe for that.
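To illustrate the point about updating the pattern without touching any other code, here is a minimal sketch. It assumes the `re` library is installed; `make_validator`, `without_dash` and `with_dash` are hypothetical names, the pattern without "-" is my guess at what the earlier version looked like, and I escaped the dot before the TLD, which is an assumption about the intended pattern.

```ocaml
(* Requires the `re` library (opam install re). *)

(* Hypothetical helper: build a validator from a Perl-style pattern.
   Re.whole_string makes the pattern match the entire input. *)
let make_validator pattern =
  let re = Re.Perl.re pattern |> Re.whole_string |> Re.compile in
  fun address -> Re.execp re address

(* Assumed earlier pattern: no "-" in the domain character set. *)
let without_dash =
  make_validator "[a-zA-Z0-9.$_!]+@[a-zA-Z0-9]+\\.[a-z]{2,3}"

(* Updated pattern: "-" added to the second character set. *)
let with_dash =
  make_validator "[a-zA-Z0-9.$_!]+@[a-zA-Z0-9-]+\\.[a-z]{2,3}"

let () =
  let address = "user@my-domain.com" in
  Printf.printf "without dash: %b\n" (without_dash address);
  Printf.printf "with dash:    %b\n" (with_dash address)
```

Only the pattern string differs between the two validators; the compile and match steps stay the same.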
With a simple regex, we can fix one thing, but it can still remain broken: a forgotten « - », forgotten subdomains, a top-level domain limited to 3 characters (sponsored TLDs can have more)…
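The remaining gaps listed above can be checked with a small sketch. It assumes the `re` library, uses the simple pattern from this thread with the dot before the TLD escaped (my assumption), and matches the whole string; `simple_re` and `accepts` are hypothetical names, and the addresses are made up for illustration.

```ocaml
(* Requires the `re` library (opam install re). *)

(* The simple pattern from the discussion, matched against the whole input. *)
let simple_re =
  Re.Perl.re "[a-zA-Z0-9.$_!]+@[a-zA-Z0-9-]+\\.[a-z]{2,3}"
  |> Re.whole_string
  |> Re.compile

let accepts address = Re.execp simple_re address

let () =
  List.iter
    (fun address -> Printf.printf "%-28s %b\n" address (accepts address))
    [ "user@example.com";        (* plain domain *)
      "first-last@example.com";  (* dash in the local part *)
      "user@mail.example.com";   (* subdomain *)
      "user@example.museum" ]    (* top-level domain longer than 3 characters *)
```

With this pattern, only the first address should be accepted; the other three are rejected even though they are plausible addresses.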
From http://emailregex.com:
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])
(Directly from RFC5322). I will try to use it with the OCaml Re library. Note: it is far shorter than a regex derived directly from RFC822. The emailregex page also proposes shorter but inaccurate regexes.
A bit tricky: Re doesn’t support \xNN escapes, but OCaml strings do.
Then we should use:
(* RFC5322 regular expression, adapted from http://emailregex.com *)
let validate_email_re =
Re.Perl.re "(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|\"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\\\[\x01-\x09\x0b\x0c\x0e-\x7f])*\")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\\])"
|> Re.no_case
|> Re.compile
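The \xNN remark can be seen in isolation with a much smaller pattern. This is a sketch assuming the `re` library; `pattern` and `control_re` are hypothetical names. Written with a single backslash, the escape is resolved by the OCaml lexer, so Re.Perl only ever receives the raw bytes.

```ocaml
(* Requires the `re` library (opam install re). *)

(* Single backslash: OCaml turns \x01 and \x08 into raw bytes
   before Re.Perl parses the pattern. *)
let pattern = "[\x01-\x08]"

let control_re = Re.Perl.re pattern |> Re.compile

let () =
  (* Five bytes: open bracket, 0x01, dash, 0x08, close bracket. *)
  Printf.printf "pattern length: %d\n" (String.length pattern);
  Printf.printf "matches byte 0x05: %b\n" (Re.execp control_re "\x05");
  Printf.printf "matches letter a: %b\n" (Re.execp control_re "a")
```

This is why the long expression above keeps a single backslash in front of each xNN but doubles the backslash for the regex-level escapes such as the literal dot.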
EDIT: the regex finds an email address even if it is prefixed by garbage. A « ^…..$ » may be better for validating an address.
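One way to get the anchored behaviour without editing the pattern string is `Re.whole_string`. A minimal sketch, assuming the `re` library; the short pattern is only a stand-in for the long RFC5322 expression, and `email`, `search_re` and `validate_re` are hypothetical names.

```ocaml
(* Requires the `re` library (opam install re). *)

(* Stand-in pattern; the RFC5322 expression could be plugged in the same way. *)
let email = Re.Perl.re "[a-z0-9.]+@[a-z0-9-]+\\.[a-z]+" |> Re.no_case

(* Unanchored: succeeds if an address occurs anywhere in the input. *)
let search_re = Re.compile email

(* Anchored: the whole input must be an address. *)
let validate_re = Re.compile (Re.whole_string email)

let () =
  let input = "<<garbage>> user@example.com" in
  Printf.printf "search:         %b\n" (Re.execp search_re input);
  Printf.printf "validate:       %b\n" (Re.execp validate_re input);
  Printf.printf "validate clean: %b\n" (Re.execp validate_re "user@example.com")
```

The unanchored version is still useful for extracting addresses from a larger text; the anchored one is the variant suited to validation.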
With let () =, we should use Array.iter, not Array.map.
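A short sketch of the difference, using only the standard library. `addresses`, `looks_like_email` and `describe` are hypothetical names, and the check is a placeholder rather than the recipe's regex.

```ocaml
(* Sample inputs, made up for illustration. *)
let addresses = [| "user@example.com"; "not-an-email" |]

(* Placeholder check standing in for the regex-based validator. *)
let looks_like_email s = String.contains s '@'

let describe address =
  Printf.sprintf "%s: %s" address
    (if looks_like_email address then "valid" else "invalid")

(* Array.iter returns unit, which is what `let () =` expects.
   Array.map would return a `unit array` here and fail to type-check. *)
let () = Array.iter (fun a -> print_endline (describe a)) addresses
```

Array.map is the right choice only when the results are kept, for example to build an array of booleans.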
Validate an email address using a simple email pattern. Examples included show validation with both the Str and Re libraries.