Contact form spam - Githubissues

nelsonic commented 6 years ago

We get at least one message like this per day. I usually just delete it, but it takes a few seconds of my time to "pattern match" it as SPAM. And since the contact form sends an email to hello@dwyl.com, it ends up in the "inbox" of 3 people and in our Google Spreadsheet, so it needs to be manually deleted from there too. Overall I'd say that each spam email "costs" us about a minute of time: we don't have the Google Spreadsheet open, so we have to load it in the browser, find the offending row and delete it.

Those minutes add up. 🙄

nelsonic commented 6 years ago

This one appeared last night:

Google should have "flagged" this as Spam, but it can't because it comes from a "trusted" source... 😞

naazy commented 6 years ago

https://ondmarc.com, made by redsift (@jrans), fixes exactly this, I believe.

jrans commented 6 years ago

Not quite, @naazy! This is just the contact form, which should always reach the mailbox, although that does allow for misuse. A spam filter at some point in the chain is advised.

Though if you're interested in preventing email impersonation of your own domain, to protect clients and suppliers, then DMARC is needed, and it also improves the deliverability of your emails. If you're not already set up, DWYL, please try the tool when you have time, as there's a free 2 week trial and I imagine you'll stay on the free tier or will have got the picture of what you need to do.

naazy commented 6 years ago

@jrans Sorry, that was unclear! I was referring to the email impersonation (because they are receiving emails from hello@dwyl.com)

jrans commented 6 years ago

I get you. I'm just wondering, though, whether the emails are sent via a crawler using the website's contact form, in which case the email sender really is genuine. If so, adding some form of validation to the contact form, e.g. message.indexOf('penis') === -1, client or server side before the email flow is probably worth it.
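
A minimal sketch of the keyword check suggested above. The `BLOCKED_TERMS` list and the `containsBlockedTerm` helper are hypothetical names used for illustration, not part of the dwyl code. The message is lowercased first so that capitalisation alone doesn't let a term through:

```javascript
// Hypothetical blocklist; real terms would come from the spam actually received.
const BLOCKED_TERMS = ['viagra', 'casino', 'seo services'];

// Returns true if the message contains any blocked term (case-insensitive).
function containsBlockedTerm(message, terms = BLOCKED_TERMS) {
  const text = String(message || '').toLowerCase();
  return terms.some((term) => text.includes(term.toLowerCase()));
}
```

A plain substring check like this also matches innocent words that happen to contain a blocked term, so it is probably best treated as a blunt first filter rather than a complete answer.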

rub1e commented 6 years ago

This is doing my head in now - I shall add it to my list

I'm thinking at the very least, we should implement a lo-fi honeypot solution like adding in a hidden form field and checking whether it's been filled in

Forgive my ignorance, but can we handle that ourselves, or does it require modifying the contact form script?

rub1e commented 6 years ago

Oh hang on - there already is a honeypot - https://github.com/dwyl/learn-to-send-email-via-google-script-html-no-server/blob/master/form-submission-handler.js#L7

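  // Returns true when the hidden honeypot field has a value (a bot is suspected).
  function validateHuman(honeypot) {
    if (honeypot) {  // hidden form field was filled in
      console.log("Robot Detected!");
      return true;
    } else {
      console.log("Welcome Human!");
      return false;  // explicit, rather than falling through to undefined
    }
  }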

nelsonic commented 6 years ago

This looks like it was submitted by a human... 🙄

rub1e commented 6 years ago

OK I think I've got to the bottom of this (with @Cleop 's help)

The answer seems to be that we don't use the form submission handler for the general contact form - we only use it for the time beta - so the form is just posted to the google apps script.

So the solution would be for me to reproduce this handler for the general contact form, and activate the honeypot function.

Does that sound correct?

nelsonic commented 6 years ago

@rub1e that would be the interim solution, yes. 👍

rub1e commented 6 years ago

@sophielevens assigning to you because you're a professional developer 😀

Should be fairly straightforward: there's a submission handler for the beta testing form (#appform), and it needs reproducing for the normal contact form (#gform) so that one is handled as well.

There is also a line which needs un-commenting - which activates the honeypot 🍯 function:

https://github.com/dwyl/dwyl-site/blob/master/js/app_google_form_handler.js#L44

In my head it's as simple as duplicating js/app_google_form_handler.js, and putting in gform instead of appform 🍰

(Which is why it's best that you deal with this 😉 )
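
A rough sketch of what a handler for the normal contact form could look like, with the honeypot check dropping the submission rather than letting it through. The `honeypot` field name, `isBot`, `makeSubmitHandler`, `getFields` and `send` are all assumptions for illustration; the real handler in js/app_google_form_handler.js may be structured differently:

```javascript
// Returns true when the hidden honeypot field has been filled in (likely a bot).
function isBot(fields) {
  return Boolean(fields.honeypot && String(fields.honeypot).trim());
}

// Builds a submit handler. `getFields` reads the form values and `send` posts
// them to the Google Apps Script endpoint; both are hypothetical helpers.
function makeSubmitHandler(getFields, send) {
  return function handleSubmit(event) {
    event.preventDefault(); // stop the browser posting the form directly
    const fields = getFields();
    if (isBot(fields)) {
      return false; // drop the submission instead of forwarding it
    }
    send(fields);
    return true;
  };
}
```

In the browser this would be attached with something like `document.getElementById('gform').addEventListener('submit', makeSubmitHandler(getFields, send))`, and the same factory could be reused for `appform`.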

rub1e commented 6 years ago

Having updated the form submission handler between myself and Sophie (#457), I've now seen two new spam messages which confirm that my "fix" works only to the extent that spam messages are now submitted with the text "honeypot" prepended

Ultimately I'd say that's not quite job done....

Will try to take another look at it this week

nelsonic commented 6 years ago

@rub1e indeed the "honeypot" is "working", thanks to @sophielevens + @mckennapsean ❤️

However, we need a "nuclear" approach to all kinds of "SPAM" 👾 because we are about to get a huge influx of the stuff! see: https://github.com/dwyl/feedback/issues/96 🙄

If you want to discuss further, e.g: you become a "PM" for a fully-fledged software product, 💡 please let us know and we will set up a call [e.g. next week when things are less hectic!] to discuss. 📆 Thanks! ✨

rub1e commented 6 years ago

I think I see what you mean @nelsonic - the bots just aren't filling in the honeypot field. They're learning

Re PMing - let's chat. This sounds good, but I've also been thinking a bit about making a Meteor app prototype to deal with some of our invoicing/finance issues e.g. https://github.com/dwyl/hq/issues/382 (possibly even https://github.com/dwyl/hq/issues/373)

nelsonic commented 6 years ago

@rub1e invoicing can only be solved if we "crack" automatic/effortless time tracking against client work and thus automatic invoicing is a [lovely/painless] "by product". 😉

It's worth exploring this in detail, agreed, and even considering building a "low-fi" prototype. However we could equally choose to re-focus our entire company on "Products" of which "effortless task & time tracking" was always meant to be the first Product: https://github.com/dwyl/start-here#what

We are finally making in-roads into building our Products with Phoenix (PETE) using well-regarded "best practices" for application design/architecture for reliability/fault-tolerance, accountability, distributed systems, speed of iteration [feature dev time!] and developer happiness.

If you are interested in building any prototypes with (PETE) please consider reading/following our ever-expanding beginner-focussed Elixir, Phoenix, Elm & Co tutorials! 😉

BTW: stoked that you are keen on tackling one of the core/fundamental "challenges" we have in @dwyl! Thank You! ✨

rub1e commented 6 years ago

For some time now, Nelson, I've been aware of the Damoclean Sword of PETE dangling above my head

I think I'm just going to have to bite the bullet and learn the new stack so I can contribute 😨

I'm as excited as I am afraid

mckennapsean commented 6 years ago

People have successfully used recaptcha with our gform approach, and it may be simpler to integrate one of these than build a custom in-house solution. Spam bots can get smart fast. 😉
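
For anyone weighing this up, here is a sketch of the server-side half of a reCAPTCHA check, assuming the widget's token arrives with the form (reCAPTCHA v2 posts it as `g-recaptcha-response`). `buildVerifyRequest` and `verifyRecaptcha` are hypothetical helper names, and `fetchImpl` is injected so the logic can be tried without a network call; a Google Apps Script backend would need its own HTTP client instead of `fetch`:

```javascript
// Google's documented reCAPTCHA verification endpoint.
const VERIFY_URL = 'https://www.google.com/recaptcha/api/siteverify';

// Builds the form-encoded POST request for the verification call.
function buildVerifyRequest(secret, token) {
  return {
    url: VERIFY_URL,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
      body: new URLSearchParams({ secret, response: token }).toString(),
    },
  };
}

// Resolves to true only when Google reports the token as valid.
async function verifyRecaptcha(secret, token, fetchImpl = fetch) {
  if (!token) return false; // no token means the widget was never completed
  const { url, options } = buildVerifyRequest(secret, token);
  const res = await fetchImpl(url, options);
  const data = await res.json();
  return data.success === true;
}
```

The message would only be forwarded to the inbox and spreadsheet when `verifyRecaptcha` resolves to `true`.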

nelsonic commented 6 years ago

@rub1e https://en.wikipedia.org/wiki/Damocles indeed. PETE will set you free! Imagine a world in which making changes/updates/improvements to your app cannot "break" the existing functionality because the compiler won't let you ship the app if anything is "out of place"! That's Elm. And the developer experience is glorious! 😍

Imagine being able to "replay" the [anonymised] user interaction in your "Killer App" ⚽️📱 to know where people are encountering issues with the UX and thus retention is suffering ... 📉

We are building a much more insightful way of creating apps with tight feedback loops. ♻️

Imagine having all the power of MixPanel automatically baked into every feature of your app ... https://youtu.be/MABmQhOlmJA You and your team can be "data driven" for every decision and build features faster and without waste. https://en.wikipedia.org/wiki/Muda_(Japanese_term)

@mckennapsean agreed the reCAPTCHA approach is certainly a good one to use with a gform. 👍

As you rightly say, spam bots searching for "unprotected" forms "learn" to "defeat" captchas fast: https://medium.com/@ageitgey/how-to-break-a-captcha-system-in-15-minutes-with-machine-learning-dbebb035a710

Consider the fact that Google is no longer using Captchas on their consumer-facing services ... e.g: Creating a Google Account: https://accounts.google.com/signup/v2/webcreateaccount

Instead Google requires a means of verifying the person that goes beyond simply "solving a captcha". The Phone Number field might say "(optional)", but when you attempt to submit the form, it's automatically rejected until you input a valid number.

We intend to use email verification ("double-opt-in") for our contact forms. When the person clicks the link in the email we send them (which validates their email address) we .then present them with one more "hoop" to jump through (we will ask them a basic qualitative/qualifying question like: "how did you hear about us?") which if they ignore, we know they aren't really interested in interacting with us and the "contact us" gets marked as "email verified. lead source not qualified".
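
The flow described above can be sketched roughly as follows. This is only an illustration in JavaScript with an in-memory `Map` as the store; the function names and the "not verified" and "qualified" status labels are assumptions (only "email verified. lead source not qualified" comes from the description), and a real version would also have to send the email:

```javascript
const crypto = require('crypto');

// Status labels; only VERIFIED_UNQUALIFIED is quoted from the description.
const STATUS = {
  UNVERIFIED: 'email not verified',
  VERIFIED_UNQUALIFIED: 'email verified. lead source not qualified',
  QUALIFIED: 'email verified. lead source qualified',
};

// Stores a new contact message under a random verification token.
function createContact(store, email, message) {
  const token = crypto.randomBytes(16).toString('hex');
  store.set(token, { email, message, status: STATUS.UNVERIFIED, leadSource: null });
  return token; // would be embedded in the link emailed to the person
}

// Called when the emailed link is clicked.
function verifyEmail(store, token) {
  const contact = store.get(token);
  if (!contact) return null; // unknown or mistyped token
  if (contact.status === STATUS.UNVERIFIED) {
    contact.status = STATUS.VERIFIED_UNQUALIFIED;
  }
  return contact;
}

// Called when the person answers "how did you hear about us?".
function answerLeadSource(store, token, answer) {
  const contact = store.get(token);
  const text = String(answer || '').trim();
  if (!contact || contact.status === STATUS.UNVERIFIED || !text) return null;
  contact.leadSource = text;
  contact.status = STATUS.QUALIFIED;
  return contact;
}
```

A contact that clicks the link but ignores the question simply stays at "email verified. lead source not qualified", which matches the behaviour described.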

We aren't going to even attempt to use any sort of AI/ML on the data until we have a few hundred thousand instances and we can dedicate a team to solving it.

The other reason we need to "wean" ourselves off using Google Forms is that it is not a good "team workflow", and storing personal data in a Google Spreadsheet is not GDPR compliant, as discussed in: https://github.com/dwyl/learn-to-send-email-via-google-script-html-no-server/issues/217

dwyl / contact

Contact form spam #6