Better Moderation Bot - Githubissues

GeekZoneHQ / web

Software to power the Geek.Zone website and apps

http://geek.zone/web

GNU General Public License v3.0


Better Moderation Bot #489

Open jamesgeddes opened 2 years ago

jamesgeddes commented 2 years ago

What's your idea?

Build or implement a bot that can moderate our forum and discord server according to the different levels of offensive language as stipulated by Ofcom; "Mild", "Moderate" and "Strong".

Brownie points

The bot could use human input, such as reactions, to learn whether potentially offensive language is actually offensive within the context of the previous 10 messages.

Joe Bloggs said,

Lorem ipsum dolor sit amet, consectetur adipiscing elit.

I am 78% certain that this comment is mildly offensive.

In this context, how offensive is this?

:smile: not offensive whatsoever
:unamused: mildly offensive
:angry: moderately offensive
:rage: strongly offensive
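A rough sketch of how the reaction poll above might be turned into a training label. The emoji-to-label mapping and the 10-message context size come from the mock-up; the names `REACTION_LABELS`, `majority_label` and `context` are made up for illustration, and treating a tie as "no verdict" is an assumption rather than anything agreed in this issue.

```python
from collections import Counter, deque

# Emoji-to-label mapping copied from the mock-up poll.
REACTION_LABELS = {
    ":smile:": "none",
    ":unamused:": "mild",
    ":angry:": "moderate",
    ":rage:": "strong",
}

CONTEXT_SIZE = 10  # "the previous 10 messages"


def majority_label(reactions):
    """Return the most-voted label, or None when there are no
    recognised votes or the top two labels are tied."""
    votes = Counter(REACTION_LABELS[r] for r in reactions if r in REACTION_LABELS)
    ranked = votes.most_common(2)
    if not ranked:
        return None
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # tie: no clear human verdict (assumed behaviour)
    return ranked[0][0]


# Rolling context window: appending an 11th message drops the oldest one.
context = deque(maxlen=CONTEXT_SIZE)
```

The label from `majority_label`, together with a snapshot of `context`, is what a learning component would presumably store as one training example.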

Once predictions start getting above 90% certainty, we could start letting the bot take action on messages.
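To make the 90% idea concrete, here is a minimal sketch of the gating logic. It assumes a classifier that returns a level and a confidence between 0 and 1; `Prediction`, `decide` and the three outcome strings are hypothetical names, not part of any existing bot.

```python
from dataclasses import dataclass

# The three Ofcom levels named in this issue, plus "none" (assumed label).
LEVELS = ("none", "mild", "moderate", "strong")

# Threshold from the proposal: only act above 90% certainty.
ACTION_THRESHOLD = 0.90


@dataclass
class Prediction:
    level: str         # one of LEVELS
    confidence: float  # 0.0 to 1.0


def decide(prediction):
    """Return "ignore", "ask_humans" or "act" for a prediction."""
    if prediction.level not in LEVELS:
        raise ValueError(f"unknown level: {prediction.level}")
    if prediction.level == "none":
        return "ignore"
    if prediction.confidence > ACTION_THRESHOLD:
        return "act"
    # Below the threshold, fall back to the reaction poll.
    return "ask_humans"
```

With the 78% example from the mock-up, `decide(Prediction("mild", 0.78))` falls through to `"ask_humans"`. What "act" actually means (delete, flag, notify a moderator) is deliberately left open here.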

Impact

High

Urgency

Later

Code of Conduct

[X] I agree to follow this project's Code of Conduct

jamesgeddes commented 2 years ago

I have asked Ofcom to publish their offensive language list in JSON on a permalink.

jamesgeddes commented 2 years ago

I have converted the Ofcom list to JSON myself. Hopefully Ofcom will publish their own JSON in the future so that we can automate this.

offensive_language-json.txt

GitHub does not support uploading .json files, so rename this from offensive_language-json.txt to offensive_language.json.

jamesgeddes commented 2 years ago

On the assumption that Ofcom,

do decide to publish a JSON file
decide to not provide any update notification method

I see two main options that would allow us to get the latest version of the offensive language list without cron.

Monitor for changes

We could use a free change checker such as distill.io to monitor the file for changes and then update our copy when its webhook fires. We should avoid using cron jobs unless absolutely necessary.

Advantages

We get the latest list as soon as it is published
We do not need to pull the list unnecessarily

Disadvantages

We are reliant on an external provider which may go bust.
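A sketch of what the receiving end of the "monitor for changes" option might do once a webhook arrives. It does not assume anything about the change checker's payload; it only assumes we re-download the file ourselves and compare a SHA-256 digest with the one we stored last time. `apply_update` is a hypothetical helper, and the JSON shape in the usage example is a placeholder, since the real file's structure is not shown in this issue.

```python
import hashlib
import json


def apply_update(raw, current_digest):
    """Parse freshly downloaded bytes only if they differ from our copy.

    Returns (parsed_data, new_digest) when the content changed,
    or (None, current_digest) when it is identical.
    """
    digest = hashlib.sha256(raw).hexdigest()
    if digest == current_digest:
        return None, current_digest
    return json.loads(raw), digest


# Usage with placeholder content (not the real Ofcom list):
raw = b'{"mild": ["placeholder"]}'
data, digest = apply_update(raw, None)      # first download: parsed and stored
unchanged, _ = apply_update(raw, digest)    # same bytes again: nothing to do
```

Checking the digest means a spurious or replayed webhook cannot overwrite our copy with identical data.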

Message trigger

We could update the list from Ofcom if all of the following criteria are met.

new message encountered
the list was last updated more than 28 days ago

Advantages

No need for external provider

Disadvantages

Pulls the list even if there have been no changes, albeit infrequently.
Could allow us to be up to 28 days out of date.
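The message-trigger option could look roughly like this. The 28-day limit is from the criteria above; `WordListCache`, `needs_refresh` and the injected `fetch` callable are hypothetical, and the actual download from Ofcom is left out because no URL exists yet.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=28)  # "last updated more than 28 days ago"


def needs_refresh(last_updated, now):
    """True if the list has never been fetched or is older than MAX_AGE."""
    return last_updated is None or now - last_updated > MAX_AGE


class WordListCache:
    def __init__(self, fetch):
        self._fetch = fetch  # hypothetical callable returning the parsed list
        self.words = None
        self.last_updated = None

    def on_message(self, now=None):
        """Call once per new message; refreshes the list only when stale."""
        now = now or datetime.now(timezone.utc)
        if needs_refresh(self.last_updated, now):
            self.words = self._fetch()
            self.last_updated = now
        return self.words
```

This sketch does not handle a failed fetch; a real version would probably keep the old list and retry on a later message.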

jamesgeddes commented 2 years ago

This might be helpful

blackbird7958 commented 2 years ago

Just a few thoughts which can hopefully provide further guidance on this matter:

ML could potentially identify insulting or offensive text, regardless of whether it contains bad language. However, the technology is extremely limited, and I strongly advise that it be used to notify human moderators only.

A good regex filter should be enough to identify explicit language, and so long as it is only used for words which should be prohibited regardless of context, it shouldn't cause problems.
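For illustration, a minimal version of such a filter. It assumes the prohibited terms are plain words or phrases; `build_filter` is a made-up helper and the words in the usage example are harmless placeholders. The lookarounds stop it matching inside longer words, which is the usual source of false positives with naive substring filters.

```python
import re


def build_filter(words):
    """Compile a case-insensitive whole-word pattern for the given terms."""
    if not words:
        raise ValueError("word list must not be empty")
    # Longest first, so a phrase wins over a shorter word it starts with.
    ordered = sorted(words, key=len, reverse=True)
    alternatives = "|".join(re.escape(w) for w in ordered)
    # (?<!\w) and (?!\w) require a non-word character (or string edge)
    # on both sides of the match.
    return re.compile(rf"(?<!\w)(?:{alternatives})(?!\w)", re.IGNORECASE)


# Usage with placeholder terms:
pattern = build_filter(["foo", "foo bar"])
hit = pattern.search("Well, FOO!")        # matches "FOO"
miss = pattern.search("food fight")       # None: "foo" is inside "food"
```

This does nothing about deliberate obfuscation (spacing, character swaps), so it only covers the straightforward case described above.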

What should never be done is to give a bot (ML or not) the ability to timeout, kick or ban people based on text that may not be offensive depending on context. Deep learning-based bots in particular should never have this ability, since it is impossible to provide assurances against false positives.