python-discord / meta

Issue tracker for suggestions and other questions relating to our community
https://pythondiscord.com

Feature: A.I. to help moderators moderate rule breakers, scammers and spammers #82

Closed StoneSteel27 closed 3 years ago

StoneSteel27 commented 3 years ago

What?

For moderators, it's difficult to keep up with everything happening on the server. Even with some automation in place, it's still hard to sort through every post.

I am proposing a system that would automatically detect when someone is breaking the rules or scamming and call moderators to the chat where the rule break is happening.
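Roughly, I imagine it hooking into the bot's message handling like this (just a minimal sketch assuming a discord.py bot; the classifier and the channel ID are placeholders I made up, nothing that exists yet):

```python
# Minimal sketch only -- assumes a discord.py bot; the classifier and the
# channel ID below are placeholders, not anything that exists yet.
import discord

MOD_ALERTS_CHANNEL_ID = 0  # placeholder: the private moderator channel


def looks_like_rule_break(text: str) -> bool:
    """Placeholder for whatever model would end up scoring messages."""
    raise NotImplementedError


intents = discord.Intents.default()
intents.message_content = True  # needed on discord.py 2.x to read message content
client = discord.Client(intents=intents)


@client.event
async def on_message(message: discord.Message):
    if message.author.bot:
        return
    if looks_like_rule_break(message.content):
        alerts = client.get_channel(MOD_ALERTS_CHANNEL_ID)
        await alerts.send(
            f"Possible rule break by {message.author} in "
            f"{message.channel.mention}: {message.jump_url}"
        )
```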

Justification

- If you have any questions, concerns, or feedback about this AI system, just add a comment.

Implementation

Resources

lemonsaurus commented 3 years ago

I don't understand what problem this solves. Our moderator team is already detecting bad behavior within seconds or minutes of an incident through our existing mechanisms. This might offer a marginal improvement in response times (in the ideal case where it isn't riddled with false positives), but is that worth the development time, maintenance cost, and added complexity a feature like this would bring?

In my opinion, it is not.

swfarnsworth commented 3 years ago

Are you familiar with our current moderation toolkit? I can't tell if you want us to transition to an entirely new toolkit that leverages gpt-2 throughout, or if you are only suggesting that we leverage gpt-2 to determine when a given user is attempting to scam another user.

It sounds like a lot of the tooling you might be suggesting already exists, but simply isn't as "smart" as one that would leverage gpt-2. To take an example regarding scams: moderators get a ping in a private channel if someone posts a link to a Discord server that isn't on our whitelist, and most of the server links that this feature brings to our attention are in fact unwanted. While we haven't exhaustively read every message sent in our server to ascertain how many nitro scams that feature has identified (which is what we'd ultimately have to do to fully evaluate the performance of a gpt-2 solution with respect to negative samples), I think it is likely identifying a large majority of them.
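Conceptually, that existing check isn't much more than the following (a simplified illustration, not our actual bot code; the regex and whitelist here are made up for the example):

```python
# Simplified illustration of the idea, not our actual bot code.
import re

INVITE_RE = re.compile(r"discord(?:\.gg|(?:app)?\.com/invite)/([A-Za-z0-9-]+)", re.I)
WHITELISTED_INVITES = {"python"}  # made up for the example; the real list is longer


def unwhitelisted_invites(content: str) -> list[str]:
    """Return invite codes found in a message that aren't on the whitelist."""
    return [
        code for code in INVITE_RE.findall(content)
        if code.lower() not in WHITELISTED_INVITES
    ]
```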

We're also able to capture a lot of other unwanted behaviors with a relatively simple ruleset that triggers a ping in private moderator channels. We get pings when certain words are said, several messages are sent rapidly by the same person, or a link to an inappropriate website is posted. Having an AI intended to capture these same behaviors probably wouldn't buy us that much more accuracy than the current tooling, and might even be worse.
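To give a sense of how simple those rules are, they boil down to something like this (again a rough sketch, not the real implementation; the word list and thresholds are placeholders):

```python
# Rough sketch of the kind of ruleset involved, not the real implementation.
import time
from collections import defaultdict, deque

FILTERED_WORDS = {"filtered-word-1", "filtered-word-2"}  # placeholder word list
BURST_WINDOW = 10  # seconds
BURST_LIMIT = 7    # messages in the window before moderators get pinged

_recent = defaultdict(deque)  # author id -> timestamps of their recent messages


def should_ping_mods(author_id, content):
    """Return True if a message trips one of the simple rules."""
    now = time.monotonic()

    # Rule: the message contains a filtered word.
    lowered = content.lower()
    if any(word in lowered for word in FILTERED_WORDS):
        return True

    # Rule: too many messages from the same author in a short window.
    timestamps = _recent[author_id]
    timestamps.append(now)
    while timestamps and now - timestamps[0] > BURST_WINDOW:
        timestamps.popleft()
    return len(timestamps) > BURST_LIMIT
```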

From an academic standpoint, I'd be interested in how we could use AI to capture less obvious unwanted behaviors, such as when some number of users are having a heated argument without saying any words that trip the aforementioned filters. However, this would require a lot of research and development, and would ultimately cost us more time than relying on reports from users and members of our staff, which do not take very long to write.

TLDR: We're able to bring lots of unwanted behaviors to the moderators' collective attention with our current toolkit, and it's very unlikely that developing an effective AI solution for the problems our existing toolkit already solves would be worth the time investment.

Leterax commented 3 years ago

when some number of users are having a heated argument without saying any words that trip the aforementioned filters

This is something AI is actually pretty good at; it's called sentiment analysis, and it has come a long way in the last few years! There are pretty large datasets and pre-trained networks already out there. The usual use case is companies trying to find out how people feel about their products on social media and the like. I'm not too sure how effective it would be at telling the difference between "Python sucks" (perfectly valid, might just be sarcasm/frustration) and "You suck", as the difference can be very subtle. From a research point of view, it would be super interesting to find out how well it works, though.
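For example, with an off-the-shelf pre-trained model (this assumes the Hugging Face transformers library; the default pipeline model is a generic English sentiment classifier, not something tuned for moderation):

```python
# Assumes the Hugging Face `transformers` library (pip install transformers).
# The default sentiment model is a generic English classifier, not something
# tuned for moderation, so treat its labels as a rough signal only.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")

for text in ["Python sucks", "You suck", "Thanks, that fixed it!"]:
    result = classifier(text)[0]
    print(f"{text!r}: {result['label']} ({result['score']:.2f})")
```

Both of the first two would most likely come back NEGATIVE with high confidence, which is exactly the subtlety problem: the model sees negativity, not whether it's aimed at a person.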

mbaruh commented 3 years ago

Hello. As mentioned, the development costs (which go well beyond simply training a model, itself already expensive) and the resources such a feature requires (which, despite what may have been claimed in the #community-meta channel, are significant) would create technical debt that we are not interested in taking on. This feature is not in line with our current moderation philosophy, and whether it would even succeed is up for debate.

Therefore, we have decided not to go ahead with this idea. Your motivation and desire to help are not lost on us, but such a feature does not fit this community at this point, and we have other, higher-priority projects we want to focus on.