ParabolInc / parabol

Free online agile retrospective meeting tool
https://www.parabol.co/

Improve email regex for account creation #9929

Open mattkrick opened 3 days ago

mattkrick commented 3 days ago

We are very permissive with our email regex. This is a problem because from the email, we derive the User.domain. From User.domain we derive Organization.activeDomain.

In the migrations, I've found activeDomains that are hundreds of chars long 😬

The best thing to do is to enforce email verification on signup, but I got too much pushback on that. The next best thing we can do is fix the regex.

I propose the following:

^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
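For reference, here is a minimal TypeScript sketch of how the proposed pattern might be applied at signup. The helper names `isValidEmail` and `getDomain` are hypothetical and not taken from the Parabol codebase, and lowercasing the domain is an assumption about how User.domain would be normalized.

```typescript
// Proposed pattern from this issue, copied verbatim.
const EMAIL_REGEX = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/

// Hypothetical helper: true when the address matches the proposed pattern.
const isValidEmail = (email: string): boolean => EMAIL_REGEX.test(email)

// Hypothetical helper: returns everything after the '@' for a valid address,
// or null when the address is rejected. Lowercasing is an assumption.
const getDomain = (email: string): string | null => {
  if (!isValidEmail(email)) return null
  return email.slice(email.lastIndexOf('@') + 1).toLowerCase()
}
```

With this sketch, an address such as `jane.doe+retro@example.co` passes, while `a@b` (no TLD) or an address with trailing markup after the TLD is rejected before a domain is ever derived.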

To see the junk we have (plus all the deleted accounts), you can do the following:

SELECT * FROM "User"
WHERE email !~ '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$';

AC

Dschoordsch commented 3 days ago

We have a couple of valid-looking emails containing an apostrophe ('). This seems to be allowed, but it also appears to be the only legitimately used special character missing from the proposed regex.

mattkrick commented 3 days ago

i saw those! they look like security researchers. while it's technically allowed, i think it's safe to say no tlds that we care to accept end in a '.

basically, email specs are a mess. check out https://www.rfc-editor.org/rfc/rfc5321#section-2.4. it says the local part of an email address is case sensitive 🤯

Dschoordsch commented 2 days ago

Yes, it's a mess and most email providers will only allow a subset. But I saw legit uses of ' in emails for names like O'Ferrall. So that's the one character I would allow from the list of special characters. All the rest seem to be security researchers like you said.
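As a rough sketch of that suggestion, the pattern below adds the apostrophe to the local-part character class only, leaving the domain part unchanged. The constant and function names are hypothetical, and this is one possible reading of the proposal rather than an agreed-upon regex.

```typescript
// Variant of the proposed pattern: ' is allowed before the '@' only.
// The '-' stays last in the class so it is treated as a literal character.
const EMAIL_REGEX_WITH_APOSTROPHE = /^[a-zA-Z0-9._%+'-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/

// Hypothetical helper name, used here for illustration only.
const isValidEmailAllowingApostrophe = (email: string): boolean =>
  EMAIL_REGEX_WITH_APOSTROPHE.test(email)
```

Under this variant, `o'ferrall@example.com` would be accepted, while an apostrophe in the domain (for example `user@exam'ple.com`) would still be rejected. If the same pattern is used in the Postgres query, the apostrophe inside the string literal would need to be doubled ('').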