Receiving spam user registrations

AVS1508 commented 1 year ago

Issue Description

Dashboard is receiving many spam user registrations. The email addresses range from ones whose domains have no MX records to valid-looking ones on free email services (@gmail.com, @yahoo.com). This is very likely the work of a spam bot.
Most of these addresses are undeliverable (i.e. the mailboxes do not exist on their domains), so HackUMass's SES accumulates a high bounce rate when it attempts to send welcome emails to these users. The SES issue will be documented separately.
Moreover, these user registrations have very inappropriate entries in their first_name and last_name fields, further bolstering the diagnosis that it is a spam bot. An example of the least inappropriate entry: ]Grow your appliance? Surprise your girlfriend!!!\r+\nhttps://vyxiz.page.link/hyxd\r.
Currently, Dashboard validates email addresses only with regular expressions, through Devise, a Rails authentication solution.
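To illustrate why a regex check alone lets these registrations through, here is a minimal sketch. It assumes the Dashboard uses the pattern that Devise's stock initializer ships as `config.email_regexp`; the actual setting in this project may differ, and the helper name `syntactically_valid_email?` is hypothetical.

```ruby
# Assumption: this is the pattern from Devise's generated initializer
# (config.email_regexp). It only checks the shape "something@something".
EMAIL_REGEXP = /\A[^@\s]+@[^@\s]+\z/

# Hypothetical helper: true when the address merely looks like an email.
def syntactically_valid_email?(address)
  EMAIL_REGEXP.match?(address.to_s)
end

# Both of these pass, although the second can never receive mail:
# nothing here consults DNS or the mail server.
syntactically_valid_email?("someone@gmail.com")
syntactically_valid_email?("nobody@domain-without-mx.invalid")

# Only strings without the basic shape are rejected.
syntactically_valid_email?("not an email")
```

A shape-only check cannot tell a deliverable mailbox from an undeliverable one, which is why bounces only surface once the welcome email is sent.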

We can take care of these issues by employing one or more of the following methods:

[REQUIRED] Add a CAPTCHA. (Adding more details about this soon.)
Add additional email validation by checking MX records of the email address domain. A typical solution is to use valid_email2 to check for MX records and for disposable email addresses.
Add name validation using regular expressions that disallow special characters such as ?, /, :, etc. This would require documenting which characters never appear in a person's name.
Add a honeypot trap: create a new field in the user registration form (a plausible one such as middle_name) and apply CSS styles to make it invisible to a human user (avoid using display: none; though). A human never sees the field, so only bots would fill it in; discard any registration that has it filled in, and do not send email to the corresponding address.
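The three optional checks in the list above can be sketched in plain Ruby as follows. This is an illustration under stated assumptions, not the code from the eventual fix: the module name `RegistrationScreen`, its method names, the 50-character limit, and the name pattern are all hypothetical. The MX lookup uses Ruby's standard `resolv` library in place of valid_email2, and the name check uses an allowlist of characters rather than the denylist described above.

```ruby
require "resolv"

# Hypothetical helper module; none of these names come from the Dashboard code.
module RegistrationScreen
  # Assumption: names consist of Unicode letters and combining marks,
  # plus spaces, periods, apostrophes, and hyphens. \A and \z anchor the
  # whole string, so embedded newlines (as in the spam sample) are rejected.
  NAME_PATTERN = /\A[\p{L}\p{M}][\p{L}\p{M} .'\-]*\z/
  MAX_NAME_LENGTH = 50 # arbitrary illustrative limit

  module_function

  # Name validation: reject URLs, punctuation such as ? / :, and control characters.
  def plausible_name?(name)
    name = name.to_s
    name.length <= MAX_NAME_LENGTH && NAME_PATTERN.match?(name)
  end

  # Honeypot: the hidden field must arrive empty; any content suggests a bot.
  def honeypot_tripped?(params, field: "middle_name")
    !params[field].to_s.strip.empty?
  end

  # MX check: true when the email's domain publishes at least one MX record.
  # The resolver is injectable so the check can be exercised without network access.
  def mx_records?(email, resolver: Resolv::DNS.new)
    domain = email.to_s.split("@", 2)[1]
    return false if domain.nil? || domain.empty?

    !resolver.getresources(domain, Resolv::DNS::Resource::IN::MX).empty?
  end
end
```

In a Rails app these would most likely be wired in as model validations or a controller filter. Note that an MX record only shows the domain accepts mail; it does not prove that the individual mailbox exists, so this check reduces bounces without eliminating them.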

AVS1508 commented 1 year ago

Just to provide an update on the issue: the PR with the full fix (#282) has been created. To reiterate its concluding remarks:

Honeypot (invisible CAPTCHA) traps were also considered for resolving the spam issue. However, after implementing the above verification functionalities, we noted that any spam technique able to bypass Google's reCAPTCHA would also be sophisticated enough to overcome honeypot traps.
For our purposes, we have been receiving spam from far less sophisticated spammers: they were attempting to use our AWS SES to send emails to unsuspecting targets (the target addresses were entered in our email field) with malicious HTTP links inserted into the name fields (first_name and last_name), causing our system to send emails of the format Hey http://, ....
Thus, our current measures suffice: they significantly reduce spam emails and, consequently, mitigate the high bounce rate issue with our AWS SES.

AVS1508 commented 1 year ago

Here's the AWS Support Request I just submitted:

The linked attachments: 1_issue_description