desentizised opened 5 months ago
I've played with shortening the character limit, but as you've mentioned there are still situations where it goes over. I've typically used the bot in my server for short responses and haven't had an issue, but I'll gladly welcome a pull request.
When I was using GPT-3.5 I had the token limit at 300, and I assume that simply capped the completion length so responses never exceeded 2,000 characters in the first place. I might try lowering it again.
When you say you use the bot for short responses, are you specifying that as an explicit instruction? That's how I currently work around it: I just append "Answer briefly." to every prompt.
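For reference, that workaround is just a prompt suffix, something like this (the wrapper function is hypothetical, not from the bot's code):

```python
def build_prompt(user_message: str) -> str:
    # Nudge the model toward answers that fit in one Discord message.
    return f"{user_message}\n\nAnswer briefly."
```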
While I have a background in software engineering, I'm not sure I could be of much service in terms of contributing here. My assumption was that the reply on line 79 ends up being whatever OpenAI sends back, and that it could theoretically be split up into multiple ctx.send() calls if need be, as sketched below. I take it it's probably not that easy if you've already played around with the problem.
As a workaround, I've truncated the message when it goes over 2,000 characters (3d56510). In the future I may explore either splitting the response into multiple messages or attaching the long reply as a file.
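For the attachment idea, a rough sketch of what I have in mind — this is not the code from 3d56510, just discord.py's `discord.File` wrapped around an in-memory buffer:

```python
import io
import discord

DISCORD_LIMIT = 2000

async def send_reply(ctx, reply: str):
    """Send the reply directly if it fits; otherwise attach it as a text file."""
    if len(reply) <= DISCORD_LIMIT:
        await ctx.send(reply)
    else:
        # Wrap the full reply in an in-memory file so nothing is truncated.
        file = discord.File(io.BytesIO(reply.encode("utf-8")), filename="reply.txt")
        await ctx.send("The reply was too long for one message:", file=file)
```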
As you mentioned, tuning the token limit can help, but since tokens don't map 1:1 to characters, it takes some experimenting to get right.
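For anyone experimenting: English text averages roughly 4 characters per token, so a `max_tokens` cap only loosely bounds the character count and leaves no hard guarantee. A sketch assuming the pre-1.0 `openai` Python client (model name and the 4-chars-per-token figure are approximations):

```python
import openai

DISCORD_LIMIT = 2000
# ~4 characters per token on average, so this is headroom, not a guarantee.
MAX_TOKENS = DISCORD_LIMIT // 4  # 500 tokens

prompt = "Explain how Discord rate limits work."
response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
    max_tokens=MAX_TOKENS,
)
reply = response.choices[0].message.content
```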
I've been experiencing this problem for a while now, and with the switch to GPT-4 it's become more frequent.
What happened: The bot fails to send the response when it exceeds Discord's 2,000-character message limit.
What should've happened: The bot sends the response it gets from OpenAI. Possibly by splitting it up into multiple messages?