Closed callahad closed 3 years ago
Can we have a simple heuristic to detect possibly bogus emails, and have some JavaScript that intercepts the form submission and asks if they're sure?
Possible heuristics:
- Username, domain, or extension is 1 character
- 3+ repeated characters
- Qwerty runs
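For illustration, the heuristics above could be sketched roughly like this. This is not code from the broker: the function names, the keyboard rows, and the run length of 4 are all assumptions made for the example, and the result is meant for a soft-fail (ask "are you sure?" or show a CAPTCHA), not a hard rejection.

```python
import re

# Assumed keyboard rows and minimum run length; tune both to taste.
KEYBOARD_ROWS = ("qwertyuiop", "asdfghjkl", "zxcvbnm", "1234567890")
RUN_LENGTH = 4


def has_keyboard_run(text: str) -> bool:
    """True if text contains RUN_LENGTH adjacent keys from one keyboard row."""
    text = text.lower()
    for i in range(len(text) - RUN_LENGTH + 1):
        chunk = text[i:i + RUN_LENGTH]
        if any(chunk in row for row in KEYBOARD_ROWS):
            return True
    return False


def suspicious_reasons(email: str) -> list:
    """Return human-readable reasons why an address looks bogus (empty if none)."""
    local, at, domain = email.rpartition("@")
    if not at or not local or not domain:
        return ["not of the form username@domain"]
    # Split "example.com" into name "example" and extension "com".
    name, dot, extension = domain.rpartition(".")
    if not dot:
        name, extension = domain, ""
    reasons = []
    for label, part in (("username", local), ("domain", name), ("extension", extension)):
        if len(part) == 1:
            reasons.append(f"{label} is 1 character")
        if re.search(r"(.)\1\1", part):  # same character three times in a row
            reasons.append(f"{label} repeats a character 3+ times")
        if has_keyboard_run(part):
            reasons.append(f"{label} contains a keyboard run")
    return reasons
```

Note that these rules catch `a@a.com` but not `test@test.com`, so they would only be one signal among several.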
Since we don't control the submission form (in the general case), we'd have to block this at the broker (backend)... but we could do that. And it'd be more robust, too. A reasonable UX seems obvious: when we detect a suspicious email address, we could just display an error along the lines of "sorry, this doesn't look like a real address" and then either hard-fail or require a CAPTCHA before continuing.
...the trick is figuring out those heuristics, and making decisions about when it's OK to hard-fail. The ones you suggested seem sane for soft-failures. We could also check DNS for an MX record before proceeding, and hard fail if one isn't found, though test.com does publish MX records :/
There's a bunch of stuff we can do. Ideas:
Suggestion: treat an HN referer as probably bogus. Use a CAPTCHA, or fall back to SendGrid specifically for probably-bogus users.
SendGrid has an email validation API which should filter out the really obviously bad stuff.
If you're checking MX for the domain, you should also check the A record as a fallback if no MX record exists. I had issues with this before I found out that the standard allows it (if there is no MX, use the A record to deliver mail).
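A minimal sketch of that fallback rule (RFC 5321, section 5.1), assuming the DNS lookups have already been done elsewhere. The function name and the shape of its inputs are made up for the example: `mx_records` is a list of `(preference, exchange)` tuples and `address_records` is a list of A/AAAA address strings.

```python
def delivery_hosts(domain, mx_records, address_records):
    """Return the hosts to try for mail delivery, in order of preference.

    Hypothetical helper: the caller is assumed to have resolved
    mx_records as (preference, exchange) tuples and address_records
    as A/AAAA address strings for `domain`.
    """
    if mx_records:
        # Lower preference values are tried first.
        return [exchange for _, exchange in sorted(mx_records)]
    if address_records:
        # No MX at all: the domain itself acts as an implicit MX.
        return [domain]
    # Neither MX nor address records: nothing can receive mail here.
    return []
```

An empty result is the only case where a hard-fail on DNS grounds looks safe.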
We could probably at least check that the domain component is valid today, and back it up with an MX check:
$ curl -H 'Content-Type: text/plain; charset=utf-8' -d wibble+chEESe@.example2..com https://broker.portier.io/normalize
wibble+cheese@.example2..com
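As a rough idea of what a syntactic check on the domain component might look like, here is a sketch that would reject `.example2..com` from the example above because of its empty labels. It is not the broker's code; the function name is hypothetical, it only handles ASCII domains (an IDN would need converting to its `xn--` form first), and requiring at least two labels is an assumption.

```python
import re

# One DNS label: 1-63 letters, digits, or hyphens, not starting or ending with a hyphen.
LABEL = re.compile(r"(?!-)[a-z0-9-]{1,63}(?<!-)")


def domain_is_well_formed(domain: str) -> bool:
    """Syntactic check only; says nothing about whether the domain exists."""
    domain = domain.lower()
    if len(domain) > 253:
        return False
    labels = domain.split(".")
    # Assumption: a deliverable address needs at least "name.tld".
    if len(labels) < 2:
        return False
    # Empty labels (leading dot, "..") fail the 1-63 length requirement.
    return all(LABEL.fullmatch(label) for label in labels)
```

A check like this is cheap enough to run before any DNS lookup.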
Also it should be noted that the username part of an email address is case-sensitive and /normalize should not really be massaging this bit.
Sorry I cannot provide a PR for this, 99 languages and Rust ain't one for me. :)
If I remember correctly, at that point in time the DNS/MX check was not easy to do with Rust. See https://github.com/portier/portier-broker/pull/97, we finally gave up on that. It might be possible today though.
For the case sensitivity, see https://stackoverflow.com/questions/9807909/are-email-addresses-case-sensitive#9808332. In practice they are case insensitive, and we would potentially run into issues if we did not treat smith@gmail.com the same as Smith@gmail.com.
FYI 'Null MX' records caught me out the other day for my own email validation: https://tools.ietf.org/html/rfc7505
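For anyone else who trips over this: RFC 7505 says a domain that accepts no mail publishes a single MX record with preference 0 and the root (`.`) as its exchange. A small sketch of handling that, assuming the records were already looked up and are passed in as `(preference, exchange)` tuples (the function names and input shape are made up for the example):

```python
def is_null_mx(mx_records) -> bool:
    """True if the MX set is exactly one record: preference 0, exchange "."."""
    return len(mx_records) == 1 and tuple(mx_records[0]) == (0, ".")


def accepts_mail(mx_records, address_records) -> bool:
    """Decide from already-resolved records whether a domain may receive mail."""
    if is_null_mx(mx_records):
        # Explicit "no mail here", even if the domain has A/AAAA records.
        return False
    # Otherwise any MX record, or the A/AAAA fallback, is good enough.
    return bool(mx_records) or bool(address_records)
```

The important bit is that the Null MX check has to come before the A record fallback, otherwise a domain that opts out of mail still looks deliverable.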
On this topic, it would be worth supporting domain blacklists that users can import, such as https://github.com/ivolo/disposable-email-domains
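Something along these lines would do, as a sketch. It assumes the imported list is a JSON array of domain strings; the function names are hypothetical, and subdomain matching is off by default because whether it is wanted is a policy decision.

```python
import json


def load_blocklist(json_text: str) -> set:
    """Parse a JSON array of domain strings into a lowercase set."""
    return {str(domain).strip().lower() for domain in json.loads(json_text)}


def is_blocked(email: str, blocklist: set, match_subdomains: bool = False) -> bool:
    """True if the address's domain (or, optionally, a parent domain) is listed."""
    domain = email.rpartition("@")[2].lower().rstrip(".")
    if domain in blocklist:
        return True
    if match_subdomains:
        labels = domain.split(".")
        # Check each parent, e.g. "a.b.example.com" -> "b.example.com", "example.com".
        return any(".".join(labels[i:]) in blocklist for i in range(1, len(labels) - 1))
    return False
```

Usage would be roughly `is_blocked(address, load_blocklist(text_of_the_imported_file))`.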
We now do rate limiting and DNS checks, plus have the option of blocking specific domains. I'm going to close this, because for now I think we've made a good effort towards the original goal of this issue. If more is needed, we'll revisit.
I'll create specific tickets for:
Postmark is super fantastic and speedy as hell, but as soon as we hit the front page of HN, people started entering fake addresses like a@a.com and test@test.com, which bounced. Because of the rate of bounces, Postmark suspended our account.
Right now we've fallen back to SendGrid, which is taking upwards of 60 seconds to deliver mail, totally unsuitable for Portier's users. I've set up SPF and DKIM records for Mailgun, so we can try jumping over there in an hour or so, but... this seems like we're on a really shaky foundation.
We don't want to screw over the email service providers, but I also don't know how we could further reduce our bounce rate. Any ideas?