LAION-AI / Open-Assistant

OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
https://open-assistant.io
Apache License 2.0

Early detection of possible bot spam #1878

Open AbdBarho opened 1 year ago

AbdBarho commented 1 year ago

As the number of users grows, we may face malicious actors who use bots to spam our API with "bad" data.

We need some sort of system to protect against such occurrences, or at least detect them.

Possible solutions (non-exclusive):

echo0x22 commented 1 year ago

What about the User Agent (as an external tool to check whether it's the same user)?

ArgiesDario commented 1 year ago

If we track salted and hashed IPs, what should we do when several users come from the same IP? Should we show them a captcha?
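
To make the question concrete, here is a minimal sketch of salted-and-hashed IP tracking with a captcha trigger. The salt value, the `MAX_ACCOUNTS_PER_IP` threshold, and the function names are all assumptions for illustration, not anything that exists in the project.

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical secret salt; in practice it would come from server configuration.
SALT = b"example-secret-salt"
# Hypothetical threshold: distinct accounts allowed per hashed IP before a captcha.
MAX_ACCOUNTS_PER_IP = 3


def hash_ip(ip: str, salt: bytes = SALT) -> str:
    """Return a keyed SHA-256 digest so the raw IP address is never stored."""
    return hmac.new(salt, ip.encode("utf-8"), hashlib.sha256).hexdigest()


# Maps hashed IP -> set of user ids seen from that IP (in-memory for the sketch).
accounts_by_ip = defaultdict(set)


def needs_captcha(ip: str, user_id: str) -> bool:
    """Record the user under the hashed IP and flag IPs shared by many accounts."""
    key = hash_ip(ip)
    accounts_by_ip[key].add(user_id)
    return len(accounts_by_ip[key]) > MAX_ACCOUNTS_PER_IP
```

With this threshold, shared networks such as universities or offices would hit the captcha quickly, which is exactly the trade-off the question raises.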

mashdragon commented 1 year ago

Many of these ideas for solutions (restricting accounts or activity per IP address, tracking browser metrics such as the user agent and punishing users who deviate from some standard, using Cloudflare) become unnecessary barriers to anonymous users and rely on arbitrary assumptions. On top of that, a determined attacker can bypass them trivially at a small cost.

A system for managing spam that uses reputation and history as its foundation is going to be more effective. For instance, such a system might require all users with no reputation to perform captchas, and eventually relax that requirement once a large sample of older accounts has shown them not to be spammy.
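
A rough sketch of what such a reputation gate could look like. The `UserHistory` fields, the scoring weights, and the `TRUSTED_AFTER` threshold are hypothetical; the counts are assumed to come from reviews by established accounts.

```python
from dataclasses import dataclass

# Hypothetical threshold: reputation needed before captchas are relaxed.
TRUSTED_AFTER = 20
# Hypothetical weight: one spam flag cancels several accepted contributions.
FLAG_PENALTY = 5


@dataclass
class UserHistory:
    accepted: int = 0  # contributions that older accounts rated as fine
    flagged: int = 0   # contributions that older accounts reported as spam

    @property
    def reputation(self) -> int:
        return self.accepted - FLAG_PENALTY * self.flagged


def captcha_required(user: UserHistory) -> bool:
    """New or low-reputation users keep solving captchas; trusted users do not."""
    return user.reputation < TRUSTED_AFTER
```

A brand-new account starts at reputation 0 and so always gets the captcha, regardless of IP address or browser.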

echo0x22 commented 1 year ago

Adding some sort of CAPTCHA could help here. A good start is the GPLv3-licensed PoW (Proof of Work) CAPTCHA at https://github.com/sequentialread/pow-captcha which we could probably run every 3–5 tasks to test whether there's a real user.
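
For illustration only, a generic hashcash-style sketch of the idea. This is not the pow-captcha library's API; the function names, the SHA-256 choice, and the default of every 4 tasks are assumptions.

```python
import hashlib
import os


def make_challenge() -> bytes:
    """Server side: issue a random challenge for the client to work on."""
    return os.urandom(16)


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def _digest(challenge: bytes, nonce: int) -> bytes:
    return hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()


def solve(challenge: bytes, difficulty: int) -> int:
    """Client side: brute-force a nonce whose hash has enough leading zero bits."""
    nonce = 0
    while leading_zero_bits(_digest(challenge, nonce)) < difficulty:
        nonce += 1
    return nonce


def verify(challenge: bytes, nonce: int, difficulty: int) -> bool:
    """Server side: checking a solution costs a single hash."""
    return leading_zero_bits(_digest(challenge, nonce)) >= difficulty


def challenge_due(tasks_completed: int, every: int = 4) -> bool:
    """Ask for a new proof after every `every` completed tasks."""
    return tasks_completed > 0 and tasks_completed % every == 0
```

Each extra bit of `difficulty` doubles the expected work for the client, while verification stays at one hash for the server.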

Lo10Th commented 1 year ago

Proof of work wouldn't be a great option for people who got a bad pc

AbdBarho commented 1 year ago

"Proof of work wouldn't be a great option for people who got a bad pc"

I don't think they mean proof of work in the sense of doing computation for the blockchain. I think they mean solving some tasks.