knrd1 / chatgpt

ChatGPT IRC bot
https://github.com/knrd1/chatgpt
MIT License

Flood prevention #28

Closed: mkayhko closed this 9 months ago

mkayhko commented 9 months ago

Hello,

Especially when the bot gives numbered lists, every item goes out on its own line, which can trigger an excess flood kick from the server. It can be a problem especially if the bot is used on several channels. Some kind of anti-flood measure would help.

knrd1 commented 9 months ago

Hey, thanks for your message. I'm running the bot on two IRC networks and have no problems with my bots posting multiple lines in short intervals. Can you please tell me which IRC network you are talking about, so I can have a look and potentially apply a fix?

As a temporary measure, you can change the max_tokens variable in chat.conf. By default it's set to 1000, but you can lower it to whatever you want. That means the bot will give shorter answers but still try to answer your questions. By the way, 1K tokens is a very low limit: current GPT-3.5 models support 4K tokens, while GPT-4 supports up to 128K tokens. Please see for more details: https://platform.openai.com/docs/models/gpt-3-5
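In chat.conf that would look roughly like this (a sketch only; the section name and surrounding options are from memory and may differ in your copy of the config):

```
# chat.conf -- illustrative fragment, not the full file
[chatcompletion]
model = gpt-3.5-turbo
# Lowering max_tokens caps the length of each answer,
# e.g. 150 instead of the default 1000:
max_tokens = 150
```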

mkayhko commented 9 months ago

Hello, thanks for your reply. I'm running on IRCnet. For privacy and cost reasons I'm using a local language model (Zephyr 7B) running on oobabooga's text-generation-webui with the OpenAI API extension. When using a larger model that doesn't fit entirely on my GPU (so inference is slower), the bot doesn't seem to flood itself out, though.
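For context, the wiring is just the standard OpenAI client pointed at the local server. A minimal sketch, assuming the pre-1.0 openai Python library; the endpoint URL, model name, and dummy key below are assumptions for illustration, and the port/path depend on how the extension is configured:

```python
import openai

# Point the client at the local text-generation-webui OpenAI-compatible
# endpoint instead of api.openai.com (assumed URL, check your setup).
openai.api_base = "http://127.0.0.1:5000/v1"
openai.api_key = "sk-dummy"  # local servers typically ignore the key

response = openai.ChatCompletion.create(
    model="zephyr-7b-beta",  # whatever name the local server exposes
    messages=[{"role": "user", "content": "Hello"}],
    max_tokens=150,
)
print(response["choices"][0]["message"]["content"])
```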

I'm actually experimenting with values between 25 and 150 tokens to avoid flood kicks, but that often just cuts the answer short.

Maybe an option to concatenate all lines and account for the maximum message length on the network? A fixed delay between each new line would probably be easy to implement.
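Roughly something like this sketch (the helper name and the 400-character budget are illustrative; the IRC protocol caps a message at 512 bytes including command, prefix, and CRLF, so the usable payload is somewhat less):

```python
import textwrap

# Conservative per-message budget: 512 bytes minus "PRIVMSG #chan :" and
# prefix/CRLF overhead; multibyte UTF-8 means bytes > characters, so stay low.
PAYLOAD_BUDGET = 400

def merge_and_wrap(answer, budget=PAYLOAD_BUDGET):
    """Collapse the model's line breaks into one string, then re-split
    into chunks that each fit inside a single PRIVMSG."""
    flat = " ".join(answer.split())
    return textwrap.wrap(flat, budget)
```

Each chunk would then be sent as its own PRIVMSG, with the fixed delay in between.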

knrd1 commented 9 months ago

Thanks for the clarification. It's easy to implement a delay by adding a "time.sleep(1)" call. Here's an example of how you can modify the code to introduce a 1-second delay between messages: https://pastebin.com/7YEV20p2 Specifically, I added lines 61 and 126. You can change 1 second to whatever you want.
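In case the paste expires, the change is essentially this kind of loop (a sketch; the function and variable names here are illustrative, the actual names in the bot differ):

```python
import time

def send_lines(irc_socket, channel, reply, delay=1):
    """Send each line of the model's reply as its own PRIVMSG,
    sleeping between sends so the server's flood limiter isn't tripped."""
    for line in reply.splitlines():
        if line.strip():
            irc_socket.send(f"PRIVMSG {channel} :{line}\r\n".encode("utf-8"))
            time.sleep(delay)  # increase for stricter networks
```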

It's a rather unusual issue, so I won't add this to the main code at this time.

Out of curiosity, what Transformers models do you recommend using with oobabooga? I have only experimented with Vicuna so far.

mkayhko commented 9 months ago

Thanks, I'll see how it works out. I've been using TheBloke's quantizations of these models, available on huggingface.co:

- Zephyr 7B beta (seems to produce numbered lists on separate rows)
- dolphin-2.5-mixtral-8x7b

These are based on the new models released by Mistral AI, which are supposed to be among the best open-weight models currently available.

knrd1 commented 9 months ago

Thanks, I hope the workaround with the time.sleep() function works for you. I will close this thread.