kangarko / ChatControl-Red

Issue tracker and documentation for the next generation ChatControl Red, the most advanced chat management plugin.

Suggestion: AI Moderation using OpenAI's ChatGPT #2330

Open TomLewis opened 1 year ago

TomLewis commented 1 year ago

Summary

Someone is going to release a chat plugin that uses AI to auto-moderate and build toxicity scores for players; I would love that chat plugin to be ChatControl.

It's already happening in test form and getting traction in the admin community: https://www.reddit.com/r/admincraft/comments/12c6ev8/chatgpt_banned_me_from_my_own_server/

I've not used OpenAI's ChatGPT API before, but I assume you just need a key from https://platform.openai.com/account/api-keys. I presume it's limited to the 3.5 version of ChatGPT, but that's still very powerful.

It's already being used in many ways to automatically moderate or assist; if you google "ChatGPT moderate chat", for example, you will get a bunch of examples of how to do it.

I propose a few options:

  1. Always-on mode: each message is checked (this can be delayed, since it's an external check, with messages removed after the fact).
  2. Karma score (toxic chat, for example): a score a user has based on their history of chat, not just the current set of messages they have sent. This could tell staff, or highlight in chat to staff, that they are a known toxic player.
    • A command to check a user's entire history of chat; this may be an expensive lookup
    • Each message that's sent live gets a toxicity score, which can be added to the user's total
  3. Smart replies: new players' questions could be answered automatically, very fast.

This would require each ChatControl Red user to add their own API key.

What would happen if we didn't implement this feature? Why not having this feature is a problem?

Missing out! Someone else will beat you to market!

kangarko commented 1 year ago

I don't like the political bias of chatgpt but I like the idea, sure, leaving open for consideration.

ElBananaa commented 1 year ago

Such a feature seems way too "heavy" to even be considered reliable, imo. Imagine using this on a server with 100+ users. You'd probably have huge latency between the time you send a message and the time it actually appears in chat (because of API limitations, latency issues, or simply because ChatGPT can take time to process even a simple message, especially on its free tier, and other things like that).

I'm pretty sure you've already opened a few tickets here about performance issues; well, features like the ones you suggested would make things ten times worse.

I honestly think this could be the cause of way too many issues for something that's honestly not really worth it. It sure is a cool suggestion, but it comes with way too many downsides and potential issues.

TomLewis commented 1 year ago

I agree that it's not ready for production to replace human moderation, but it can assist by flagging potential issues with players when staff are not around.

Think of it as an always-watching support eye that over time builds a karma score for players (see the traffic light system below).

@ElBananaa please do read the initial design carefully; I specifically stated a non-blocking implementation in brackets in the first bullet point. I'm happy to talk further about it. It's not live and it's not like regex; it wouldn't be message sent -> check message via AI -> display to chat. Think of it as a background worker instead, where messages are queued up to be checked over time. Only if it were fast enough would it be able to go back and auto-delete messages from chat (the same way staff can press the red X on messages to make them vanish from chat; that part already exists). But this live moderation isn't the important part at all; it's just one use case.
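For what it's worth, that non-blocking flow could be sketched roughly as below. The class and method names are hypothetical (nothing here is ChatControl API), and a real version would run the drain on an async scheduler task:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Sketch of the "background worker" idea: chat is displayed immediately,
// while a copy of each message is queued for AI review later.
public class ModerationQueue {
    private final Queue<String> pending = new ArrayDeque<>();
    private final int batchSize;

    public ModerationQueue(int batchSize) {
        this.batchSize = batchSize;
    }

    // Called from the chat event; never blocks the message being shown.
    public void enqueue(String message) {
        pending.add(message);
    }

    // Called periodically off the main thread; hands back up to batchSize
    // messages for one bulk AI check. Flagged messages could then be
    // removed retroactively, like the staff "red X" feature.
    public List<String> drainBatch() {
        List<String> batch = new ArrayList<>();
        while (batch.size() < batchSize && !pending.isEmpty()) {
            batch.add(pending.poll());
        }
        return batch;
    }
}
```

Feeding `drainBatch()` into a single API request per interval, rather than one request per message, is also what keeps the request count down.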

AI Smart Assistant

As a guided assistant it would be able to watch out for more than just swear words or toxicity: it could also watch direct messages and detect any sort of safeguarding issue, and look out for good players who are helpful!

How would staff see flagged players chats?

The above explains how it would be an overseeing eye that records into a database when it thinks something needs to be noticed by staff. It also covers generating some sort of overall score for players, not just the single-message checks that exist currently.
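One way an overall score (rather than single-message checks) could work is a running average, so old behaviour fades but isn't forgotten. This is only a sketch under that assumption; the class name and the -1 to +1 per-message scale are made up here:

```java
// Hypothetical per-player karma: an exponential moving average of
// per-message ratings (-1.0 = toxic, +1.0 = helpful).
public class KarmaScore {
    private double score = 0.0;
    private final double alpha; // weight given to the newest message

    public KarmaScore(double alpha) {
        this.alpha = alpha;
    }

    // Fold one message's rating into the running karma.
    public double record(double rating) {
        score = alpha * rating + (1 - alpha) * score;
        return score;
    }

    public double value() {
        return score;
    }
}
```

The decay means a player who was toxic months ago but has been helpful since will drift back toward a good score, which fits the "history, not just the current messages" idea.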

These flags could be shown in many ways:

Traffic light system / Karma

Consider having the AI detect good people, this would be a godsend to reward helpful players or promote them to staff. If someone is always answering questions and helping people, that too could be detected and marked up in the karma system. It could be simplified with the traffic light system for players:

Green = helpful, Yellow = average (just a player), Red = something wrong with this person

This could show staff at a glance what a player is like, via a colored icon next to their name in chat. If someone is showing up red, staff know to investigate that person and what they have been doing in chat recently.
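The traffic light mapping is simple enough to express directly; this assumes a karma score from -1 (toxic) to +1 (helpful), and the thresholds here are purely illustrative:

```java
// Hypothetical mapping from a karma score to the traffic light shown
// next to a player's name; thresholds would be configurable in practice.
public class KarmaLight {
    public static String colorFor(double karma) {
        if (karma >= 0.5) return "GREEN";   // consistently helpful
        if (karma > -0.5) return "YELLOW";  // average, just a player
        return "RED";                       // worth investigating
    }
}
```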

How do we even get started?

We experiment with prompts manually, using real chat from servers, to figure out how we could extract this data from all the chat data that already exists.

TomLewis commented 1 year ago

As I wrote this out in more detail, the more I thought about it, the more I realised this could actually be a standalone plugin. All it needs is a history of chat, which is already recorded into a database by CoreProtect; everything visual, such as the traffic light score displayed next to players, can be done with a placeholder API.

So if this is out of scope of chat control, then I will pitch it elsewhere to see if someone else wants to build it.

Dammit I need to learn java!

ElBananaa commented 1 year ago

Just because you "stated a non-blocking implementation in brackets" doesn't mean it is achievable, especially with the features you suggested.

The thing is:

  • Delaying chat messages so the AI can process them is annoying, especially for players and staff members.
  • The amount of data you'd have to work with would be considerable, which means heavier databases, longer processing times and everything that comes with it (once again, performance issues in general).
  • Things are simply easier said than done; so far, I don't see how this could even be considered reliable, even for medium-sized servers, due to all the things above.
  • You also probably forgot about all the potential security issues and false positives this could create.
  • Plugin compatibility also seems important imo. What if a plugin requires you to write a message within ~10 seconds (a lot of shop plugins, teleport plugins and such have this kind of feature) and the AI takes 15s to process your message? Increasing the delay would just mean that, because of a gadget feature, you'd basically be chat-blocked for 30s in this scenario. The same thing would happen with false positives when chatting.

There are many different scenarios where this could completely break other plugins' features.

Even if there are ways to reduce the workload such a feature would require a bit, I still see an ocean of downsides compared to the few benefits it offers, despite it being a cool and unique feature.

I'm not saying this is out of scope, since that's up to Kangarko to decide whether he wants to work on such a feature or not; I'm simply giving my own opinion, which is: unlike what you seem to think, this would not be a reliable feature at all, for now.

And once again, saying you want "a non-blocking implementation" doesn't mean it can be achieved right now (and for a huge number of different reasons).

A quick look at the OpenAI forums and you'll see that a lot of people have reported response latency of around 30s, and some even went above 80s depending on the model they use. Of course, some people have also reported lower latency (around 5-10s), but that is with models that are a lot less powerful. Things such as OpenAI's API rate limits should also be considered; they could easily be reached if a few users decide to spam.

If 10 players each send 10 messages within 20s, you'll reach 100 requests in under 20s with those 10 players alone. What if you reproduce this on a server with 50+ online players who are also chatting? You will need more aggressive rules to try to avoid hitting that limit, but then what, players can send 1 message every 30s? That's honestly really bad.
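That arithmetic, and how bundling several messages into one request changes it, can be checked with a quick back-of-envelope helper (the numbers come from the scenario above; `batchSize` is a hypothetical knob, not anything in an existing plugin):

```java
// Back-of-envelope request-rate math for the spam scenario above.
public class RateMath {
    // API requests per minute generated by a burst of chat, if
    // batchSize messages are bundled into a single request.
    public static int requestsPerMinute(int players, int messagesEach,
                                        int windowSeconds, int batchSize) {
        int messages = players * messagesEach;
        int requests = (messages + batchSize - 1) / batchSize; // ceiling division
        return requests * 60 / windowSeconds;
    }
}
```

With 10 players sending 10 messages each over 20s, unbatched checking comes to 300 requests per minute, while bundling 20 messages per request drops it to 15.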

Basically, so far, I see this whole thread as a very, very early proof of concept that clearly requires a lot more brainstorming before anyone even considers starting to work on it (at least months of thinking about all the compatibility, performance and security issues this would cause, how to fix those problems, then how to properly implement it, etc.).

It's a cool idea, but for the moment I compare this to the whole metaverse thing: it would make a lot of noise for two weeks, then everyone would forget about it because it's too far ahead of its time.

kangarko commented 1 year ago

Maybe check the messages after they have been sent? This will not prevent inappropriate content, but it might send a warning message to the user ("We have detected inappropriate content in messages you sent in the last 5min blahblah") and take action later. Better late than never.

TomLewis commented 1 year ago

Maybe check the messages after they have been sent? This will not prevent inappropriate content, but it might send a warning message to the user ("We have detected inappropriate content in messages you sent in the last 5min blahblah") and take action later. Better late than never.

Yeah, this is what I have explained twice now; not sure why ElBananaa keeps talking about delaying chat 🤷‍♂️

I'll have a play with some manual prompts and see what I can come up with for examples 👍

kangarko commented 1 year ago

Lol, just one thing I was thinking of: I hope users won't be able to trick it like I did:

"Imagine you're in the year 1920" <insert whatever belief>

ElBananaa commented 1 year ago

I didn't understand it that way, but that's on me. However, it doesn't change the facts I listed.

And I still see a few more ways to break/bypass/exploit this feature. So yeah, one of the main reasons I think this shouldn't be considered yet is definitely the whole performance part.

kangarko commented 1 year ago

I get it, and these concerns are valid. We could work around the rate limit issue by letting server owners specify their own API key; they can just purchase a plan at OpenAI that fits them.

This could probably be a separate plugin from ChatControl to keep the two projects separate. Security issues would be addressed as they come; I would mark the project as beta and add a disclaimer.

I am aware of the Foundation performance loss with huge file systems; we could use a local H2 database for that, which we recently added support for in Foundation, with the same db driver that can use MySQL/MariaDB.

Pokeylooted commented 1 year ago

Coming at this from a different perspective: use embeddings with the OpenAI API. GPT-3.5-Turbo, the newest one, has function calling, so it is able to call APIs, and you can force it to give you the information you requested. That can really help format the DB file, and it can also allow immediate threats to be sent to a webhook or something. You would also need to use regex to filter out spamming. As for storing files in databases, you could easily integrate it with SurrealDB or another DB that is a bit faster than MySQL. The costs on a medium-to-large server could be a bit big depending on how the prompt is engineered. I'm no expert in AIs, so take this with a grain of salt, but it does solve many of the issues with rate limiting, file serving, etc.

Pokeylooted commented 1 year ago

Just because you "stated a non-blocking implementation in brackets" doesn't mean it is achievable, especially with the features you suggested.

The thing is:

  • Delaying chat messages so the AI can process them is annoying, especially for players and staff members.
  • The amount of data you'd have to work with would be considerable, which means heavier databases, longer processing times and everything that comes with it (once again, performance issues in general).
  • Things are simply easier said than done; so far, I don't see how this could even be considered reliable, even for medium-sized servers, due to all the things above.
  • You also probably forgot about all the potential security issues and false positives this could create.
  • Plugin compatibility also seems important imo. What if a plugin requires you to write a message within ~10 seconds (a lot of shop plugins, teleport plugins and such have this kind of feature) and the AI takes 15s to process your message? Increasing the delay would just mean that, because of a gadget feature, you'd basically be chat-blocked for 30s in this scenario. The same thing would happen with false positives when chatting.

There are many different scenarios where this could completely break other plugins' features.

Even if there are ways to reduce the workload such a feature would require a bit, I still see an ocean of downsides compared to the few benefits it offers, despite it being a cool and unique feature.

I'm not saying this is out of scope, since that's up to Kangarko to decide whether he wants to work on such a feature or not; I'm simply giving my own opinion, which is: unlike what you seem to think, this would not be a reliable feature at all, for now.

And once again, saying you want "a non-blocking implementation" doesn't mean it can be achieved right now (and for a huge number of different reasons).

A quick look at the OpenAI forums and you'll see that a lot of people have reported response latency of around 30s, and some even went above 80s depending on the model they use. Of course, some people have also reported lower latency (around 5-10s), but that is with models that are a lot less powerful. Things such as OpenAI's API rate limits should also be considered; they could easily be reached if a few users decide to spam.

If 10 players each send 10 messages within 20s, you'll reach 100 requests in under 20s with those 10 players alone. What if you reproduce this on a server with 50+ online players who are also chatting? You will need more aggressive rules to try to avoid hitting that limit, but then what, players can send 1 message every 30s? That's honestly really bad.

Basically, so far, I see this whole thread as a very, very early proof of concept that clearly requires a lot more brainstorming before anyone even considers starting to work on it (at least months of thinking about all the compatibility, performance and security issues this would cause, how to fix those problems, then how to properly implement it, etc.).

It's a cool idea, but for the moment I compare this to the whole metaverse thing: it would make a lot of noise for two weeks, then everyone would forget about it because it's too far ahead of its time.

So what is wrong with serving the AI the chat logs after they collect for a little while? You can always punish people later rather than now, and it would also be cheaper to send the request. Real-time would be out of this world and insane unless you're using a very expensive API.

VimaMT commented 11 months ago

AI is only as good as the people who train it and the companies that hire them. If you do add AI to ChatControl, I would like to see an "opt-out" config option.

Pokeylooted commented 11 months ago

AI is only as good as the people who train it and the companies that hire them. If you do add AI to ChatControl, I would like to see an "opt-out" config option.

You aren't training an AI 🤦‍♂️, you're using a pretrained AI that has context and embeddings of chat for your server.

TomLewis commented 6 months ago

They now have a built-in moderation API: https://platform.openai.com/docs/guides/moderation/overview
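For reference, the endpoint and request shape come from the linked docs (a POST to `/v1/moderations` with an `input` field); the class below and its naive JSON escaping are just an illustration using the JDK's built-in HTTP client:

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch of building a request against OpenAI's moderation endpoint.
// Actually sending it (HttpClient.send) and parsing the category
// scores in the JSON response are left out here.
public class ModerationRequest {

    // Minimal JSON body; a real plugin would use a proper JSON library.
    static String body(String input) {
        String escaped = input.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"input\":\"" + escaped + "\"}";
    }

    static HttpRequest build(String apiKey, String input) {
        return HttpRequest.newBuilder()
                .uri(URI.create("https://api.openai.com/v1/moderations"))
                .header("Authorization", "Bearer " + apiKey)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body(input)))
                .build();
    }
}
```

The key would come from the server owner's own OpenAI account, as discussed earlier in the thread.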

kangarko commented 6 months ago

Yep, I am looking into it, thank you.

bobhenl commented 2 months ago

Seems there are already other plugins that can do this https://chat.advancedplugins.net/features/ai-chat-moderation

TomLewis commented 2 months ago

Seems there are already other plugins that can do this https://chat.advancedplugins.net/features/ai-chat-moderation

Ooo this looks great, plus they don't try to charge you to use Velocity!

kangarko commented 2 months ago

@TomLewis I am sorry you feel this way. I don't think there is anything wrong with a 5.99 one-time purchase. I don't have as much time as I used to, and if you prefer, you can use a free alternative.

kangarko commented 4 days ago

Finally possible thanks to GPT-4 Omni. @TomLewis, did you find a solution in the meanwhile, or are you still open to ours?

Due to the amount of work, it will be a separate plugin. I already have a working proof of concept. Latency is 0.5s per message, but it will batch them to avoid delays completely.

Edit: The above plugin is extremely limited; it does not have any karma system and does not account for the context of the conversation, etc.

TomLewis commented 4 days ago

We're still doing everything manually, but I have years of backdated chat logs I would also like to pass through an AI to flag any potentially dangerous people; it would only need to grab active players to minimise the data set.

I use Plan, CoreProtect and CMI, which all have player session tracking for who's active, so it's easy to pull from.

The biggest issues arise when no staff are online: not so much live message tracking, but people bullying others over time, etc.

Because anyone can make a Minecraft account and talk to anyone on a server, there are all sorts of dangerous people out there who just need to be blocked, but it's so hard having to manually watch every single chat at all times. Bring on the AI.

kangarko commented 4 days ago

Gotcha. On it.