Closed: yesbroc closed this issue 1 year ago
How do I use it?
--stream-responses
The same option is available within config.yml too.
Yup, let me know how it works for you! The challenge with this approach is that Discord will rate-limit API requests, so depending on how long your responses are and how fast they're generated, you'll see varying results.
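To illustrate the trade-off being described: streamed tokens have to be batched up so that message edits happen at some throttled interval rather than once per token. The sketch below is purely illustrative and not oobabot's actual code; the function name and parameters are hypothetical, and the clock is injectable so the behavior can be tested.

```python
import time

def throttle_stream(tokens, min_interval=0.5, now=time.monotonic):
    """Accumulate streamed tokens and yield the full partial message
    no more often than `min_interval` seconds apart, so that each
    yield can become one Discord message edit instead of one edit
    per token. Illustrative sketch only; names are hypothetical.
    """
    buffer = []
    last_flush = float("-inf")  # so the first token flushes immediately
    for tok in tokens:
        buffer.append(tok)
        if now() - last_flush >= min_interval:
            last_flush = now()
            yield "".join(buffer)
    # always emit the complete text at the end, even if mid-interval
    if buffer:
        yield "".join(buffer)
```

Each yielded string is the cumulative message content so far, matching Discord's edit semantics where every edit replaces the whole message body.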
Thanks @jmoney7823956789378 for the tech support! :)
No problem! Always got some free time at work. Do you think you'd be able to expose the "streaming rate" to be adjusted in the config? I'm still getting a big pause after about 10-15 tokens, and then a big catch-up following. This is with 65B streaming at ~7t/s. 30B at 12t/s, etc. I think we would be fine with a per-token or a per-second update rate.
Sure thing! Added in https://github.com/chrisrude/oobabot/commit/bf23d741d60f1bcfae1c9b58afa90f017e7af61e
If you try it, please lmk what values you find useful. I would like to have defaults that work well for most users.
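For anyone following along, a `config.yml` fragment using the option discussed here might look like the following. This is an illustrative sketch, not the shipped defaults; the `0.7` value is the one reported to work well later in this thread.

```yaml
# config.yml -- illustrative fragment
stream_responses: true
# minimum seconds between streamed message edits (added in bf23d74);
# 0.7 is a community-reported value, not a confirmed default
stream_responses_speed_limit: 0.7
```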
Awesome stuff! I've been messing with it a little, but it seems like it'll still cause some strange behavior with context if it errors out. When it encounters the error below, it will completely ignore the message for further conversation. Here's the most common error I get consistently.
Tewi says:
Cyber Forensics involves investigating digital devices or data to uncover evidence related to crimes or other incidents.
---TRUNCATED LENGTH---
Offensive cyber capabilities may be employed as part of a nation's military strategy, intelligence gathering efforts, or law enforcement operations.

2023-06-22 18:27:53,819 ERROR Error: 400 Bad Request (error code: 50035): Invalid Form Body
In content: Must be 2000 or fewer in length.
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/oobabot/discord_bot.py", line 382, in _send_text_response_in_channel
last_sent_message = await self._render_streaming_response(
File "/usr/local/lib/python3.9/dist-packages/oobabot/discord_bot.py", line 538, in _render_streaming_response
await last_message.edit(
File "/usr/local/lib/python3.9/dist-packages/discord/message.py", line 2178, in edit
data = await self._state.http.edit_message(self.channel.id, self.id, params=params)
File "/usr/local/lib/python3.9/dist-packages/discord/http.py", line 744, in request
raise HTTPException(response, data)
discord.errors.HTTPException: 400 Bad Request (error code: 50035): Invalid Form Body
In content: Must be 2000 or fewer in length.
This is probably because I have max_new_tokens set to 500, causing way too many tokens to be resent back and forth when editing the message (at roughly 4 characters per token, 500 tokens can easily exceed Discord's 2,000-character limit).
I'm also seeing my oobabot server (an LXC container on a separate machine) just pausing mid-generation.
This is the raw receiving end from oobabooga, using --log-all-the-things.
Not sure what the cause is, or if you're able to reproduce it.
Here are my findings:
It looks like stream_responses_speed_limit: 0.7 works great, without any pauses or stops.
Line-feed and carriage-return characters don't seem to trigger the bot to split responses while streaming is enabled. Splitting on those would help with the oversized HTTP request bodies.
Also, for some reason, when using stream_responses, oobabot does not read the message as context for the next message.
Glad it worked! I'll test out 0.7 and see how it works as a default setting.
"Also, for some reason when using stream_responses oobabot does not read the message as context for the next message."
Thanks for pointing this out! I hadn't noticed this before. I'll see if I can address it.
"Invalid Form Body. In content: Must be 2000 or fewer in length."
Thanks for pointing this out as well; I hadn't seen this before, as my responses tend to be shorter. But it should be possible to split longer streaming responses across multiple messages.
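Splitting a long response across messages could be sketched as below. This is a hypothetical illustration, not oobabot's actual splitting logic; it also prefers breaking at the last newline before the limit, which ties in with the line-feed observation above.

```python
DISCORD_MESSAGE_LIMIT = 2000  # Discord's per-message character cap

def split_for_discord(text, limit=DISCORD_MESSAGE_LIMIT):
    """Split `text` into chunks that each fit in one Discord message,
    preferring to break at the last newline before the limit.
    Illustrative sketch only; the real implementation may differ.
    """
    chunks = []
    while len(text) > limit:
        # break at the last newline within the limit, if any
        cut = text.rfind("\n", 0, limit)
        if cut <= 0:
            cut = limit  # no newline available: hard-split at the limit
        chunks.append(text[:cut])
        text = text[cut:].lstrip("\n")
    if text:
        chunks.append(text)
    return chunks
```

Sending each chunk as its own message (and only ever editing the final chunk while streaming) would keep every edit under the 2000-character cap.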
Can this be pushed through to the "plugin" version? 🙏
I'd love to set stream_responses_speed_limit to something that doesn't hit the API rate limit.
Yup, working on it soon! It's been a minute since the last release, and I have a lot of improvements that need to get out. :)
I'm not planning to give the speed limit itself a UI widget, but on release you'll be able to change it via the advanced yaml editor.
I'm still testing out your latest 0.2.0 changes, and I'm finding that using stream-responses: True still causes the messages to be omitted from chat history.
Yup, I haven't fixed that part yet. I split off bug 63 to help me keep track of it. Going to close this for now.
Already implemented, though a little jank at times.