Suggested quotes spam

rexim commented 6 years ago

Recently we started to receive spam in suggested quotes, which suggests that spammers are somehow bypassing our CAPTCHA. I heard that some time ago somebody found a way to trick that "I'm not a robot" CAPTCHA (cannot find the link though).

Does anybody have any suggestions to improve our spam protection?

ForNeVeR commented 6 years ago

> somebody found a way to trick that "I'm not a robot" CAPTCHA (cannot find the link though)

Here it is: https://youtu.be/fsF7enQY8uI

ForNeVeR commented 6 years ago

It's possible that we have installed reCAPTCHA improperly and it somehow leaks the captcha data and/or the key. Maybe we're checking it wrong. We need to audit the corresponding code.
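For reference while auditing, here is a minimal sketch of what a server-side check usually looks like. It is not loglist's actual code. The `siteverify` endpoint and its `secret`, `response` and `remoteip` fields come from Google's reCAPTCHA documentation; the function names and the injectable `post` parameter are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def http_post_form(url, fields):
    """POST form-encoded fields and return the decoded JSON response."""
    data = urllib.parse.urlencode(fields).encode("utf-8")
    with urllib.request.urlopen(url, data=data, timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


def verify_captcha(secret, token, remote_ip=None, post=http_post_form):
    """Return True only if the verification service confirms the token."""
    if not token:
        # A missing or empty token must be rejected, never skipped.
        return False
    fields = {"secret": secret, "response": token}
    if remote_ip:
        fields["remoteip"] = remote_ip
    try:
        result = post(VERIFY_URL, fields)
    except (OSError, ValueError):
        # Fail closed: a network or JSON parsing error is not a pass.
        return False
    return result.get("success") is True
```

Things worth checking in an audit along these lines: the secret key never reaches the client, an absent token is rejected rather than ignored, and the form handler refuses the submission whenever this check returns False.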

ForNeVeR commented 6 years ago

The reCAPTCHA admin panel tells me the following:

The "our developer site" link leads to this page (which has no useful info), all the graphs are empty, and the "Download Analytics" button downloads an empty CSV file. Nice.

rexim commented 6 years ago

@ForNeVeR welp, I guess the spammers helped us to find a bug :)

ForNeVeR commented 6 years ago

Do they actually? I don't think we have a usability bug in our integration. My current theories are:

either spammers are actively trying to break our reCAPTCHA, and that leads to high amount of (invalid) recaptcha requests: honestly, we probably had about 2 spam quotes and 1 actual quote in the last month, so that's not a bug, that's just the truth;
or Google's site is just bugged: it thinks we have no traffic, and so we have a 0/0 = Inf% passage rate

rexim commented 6 years ago

> and that leads to high amount of (invalid) recaptcha requests

Can we start counting all of the requests on our side to confirm that theory?
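A rough sketch of what counting on our side could look like: a small thread-safe tally that the submission handler updates once per request. The class name, the outcome labels and the idea of calling `record` from the handler are all assumptions for illustration, not existing loglist code.

```python
from collections import Counter
from threading import Lock


class SubmissionStats:
    """Thread-safe tally of quote submissions by captcha outcome."""

    # Hypothetical outcome labels; adjust to whatever the handler can tell apart.
    OUTCOMES = ("captcha_missing", "captcha_invalid", "accepted")

    def __init__(self):
        self._lock = Lock()
        self._counts = Counter()

    def record(self, outcome):
        if outcome not in self.OUTCOMES:
            raise ValueError(f"unknown outcome: {outcome}")
        with self._lock:
            self._counts[outcome] += 1

    def snapshot(self):
        """Return a plain dict copy of the current counts."""
        with self._lock:
            return dict(self._counts)

    def rejected_share(self):
        """Fraction of submissions that failed the captcha, or None if no traffic."""
        with self._lock:
            total = sum(self._counts.values())
            if total == 0:
                return None
            return (total - self._counts["accepted"]) / total
```

Comparing `snapshot()` with what the reCAPTCHA admin panel reports would show whether the panel is simply empty or whether invalid requests really dominate.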

hagane commented 6 years ago

There's always a possibility that spam quotes are submitted by hand. Do we have raw submission logs or something?

hagane commented 6 years ago

BTW, what is the content of spam quotes? Is it generic "Erlang your Postgres" spam, or something more specific?

ForNeVeR commented 6 years ago

We have two at the moment:

It’s exhausting to find knowledgeable people on this matter, but you sound like you know what you’re speaking about! Thanks [and a link to some site here, I won't visit it]

<a href=link to the actual site producing cranes>Мостовые краны производство и изготовление</a> [Russian for "Bridge cranes: production and manufacturing"]

Looks like they're building up to something, don't you think?

ForNeVeR commented 6 years ago

Alright, I've analyzed our logs a bit, and it looks like these are human-assisted automated requests.

There aren't many requests to POST /quote/new, but various bots visit our site quite often. I've looked up the spammers' IP addresses on Google and found that these IPs are registered in multiple so-called "online spam databases".

Not sure if we should take any action at the moment. One interesting thing is that spammers always use HTTP/1.0. Crazy idea: ban it?
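Before deciding anything about HTTP/1.0, the observation can be double-checked with a short script over the access logs. This is only a sketch: it assumes a common/combined-style log where the request appears as `"METHOD /path HTTP/x.y"`, which may not match our actual log format, and the function name is made up.

```python
import re
from collections import Counter

# Matches the quoted request part of a common/combined-style log line.
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) (?P<proto>HTTP/\d(?:\.\d)?)"'
)


def protocols_for(lines, method="POST", path="/quote/new"):
    """Count HTTP protocol versions among requests matching method and path."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        if match["method"] == method and match["path"] == path:
            counts[match["proto"]] += 1
    return counts
```

Feeding it an open log file (`protocols_for(open(path_to_log))`) gives a per-protocol count. If any legitimate submission turns up under HTTP/1.0, for instance from behind an old proxy, a blanket ban would hit real users too.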

rexim commented 6 years ago

> Crazy idea: ban it?

I don't like this idea.

> these IPs are registered in multiple so-called "online spam databases".

Can we ban everybody in these databases from loglist?

hagane commented 6 years ago

> One interesting thing is that spammers always use HTTP/1.0. Crazy idea: ban it?

I like your way of enforcing progress, herr doctor. But @rexim offered a saner solution, methinks.

ForNeVeR commented 6 years ago

I believe there are some automated ways of banning addresses based on online IP databases. Maybe fail2ban or something like that?
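This is not fail2ban, but one other common automated approach is a DNS-based blocklist (DNSBL) lookup: the IPv4 octets are reversed, the list's zone is appended, and the address counts as listed if that name resolves. A minimal sketch follows; `dnsbl.example.org` in the usage note is a placeholder zone and the function names are invented for illustration.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNSBL hostname for an IPv4 address: reversed octets + zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True if the blocklist zone resolves a name for this address."""
    try:
        resolve(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        # NXDOMAIN (or any other resolution failure) is treated as "not listed".
        return False
    return True
```

For example, `dnsbl_query_name("192.0.2.10", "dnsbl.example.org")` builds `10.2.0.192.dnsbl.example.org`. Which real zone to query, and under what usage terms, would still have to be decided.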

rexim commented 6 years ago

@ForNeVeR I was actually thinking that maybe some of those databases provide some kind of REST API which we could integrate with.
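If one of those databases does offer a REST API, the integration could be a thin cached client in front of the submission handler. The sketch below is deliberately API-agnostic: `fetch` stands for whatever call the chosen service requires, and the `{"listed": ...}` response shape is a made-up placeholder, not any real service's format.

```python
import time


class BlocklistClient:
    """Caches verdicts from a (hypothetical) spam-database REST API."""

    def __init__(self, fetch, ttl_seconds=3600, clock=time.monotonic):
        self._fetch = fetch      # callable: ip -> dict, e.g. {"listed": True}
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}         # ip -> (expires_at, listed)

    def is_listed(self, ip):
        now = self._clock()
        cached = self._cache.get(ip)
        if cached is not None and cached[0] > now:
            return cached[1]
        try:
            listed = bool(self._fetch(ip).get("listed", False))
        except Exception:
            # Fail open: an unreachable database should not block real users.
            return False
        self._cache[ip] = (now + self._ttl, listed)
        return listed
```

Caching keeps us from querying the database on every request, and failing open means an outage on their side only costs us spam protection, not legitimate quotes. Both choices are debatable.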

ForNeVeR commented 6 years ago

Alright, the spam has stopped by itself. No action required for now.

codingteam / loglist

Suggested quotes spam #218