supabase-community / postgres-new

In-browser Postgres sandbox with AI assistance
https://database.build
Apache License 2.0

feat: limit message context sent to llm #66

Closed gregnr closed 2 months ago

gregnr commented 2 months ago

Problem

In an ideal world, we send as much context to the LLM as possible so that it has the most information available to respond to your questions and tasks. Unfortunately, cumulative token costs grow quadratically with conversation length, since every request must resend all previous messages.
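To make the quadratic growth concrete, here is a small illustrative calculation (the function and its parameters are hypothetical, not part of this PR): if each message averages a fixed number of tokens and every request resends the whole history, the total tokens sent over the conversation follow the triangular-number formula n(n+1)/2.

```typescript
// Hypothetical cost model: request i resends all i messages sent so far,
// so cumulative tokens = tokensPerMessage * n * (n + 1) / 2.
function cumulativeTokensSent(messageCount: number, tokensPerMessage: number): number {
  let total = 0;
  for (let i = 1; i <= messageCount; i++) {
    // Request i carries all i messages in the history
    total += i * tokensPerMessage;
  }
  return total;
}

console.log(cumulativeTokensSent(30, 100)); // 46500
console.log(cumulativeTokensSent(300, 100)); // 4515000 — 10x the messages, ~100x the tokens
```

Doubling the conversation length roughly quadruples the total spend, which is why capping the context window matters.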

Solution

To prevent token costs from blowing out of proportion, we can limit the max number of previous messages to send to the LLM as context each request.

Important: this does not mean the actual message history is limited on the frontend. This is strictly referring to the sliding window of the past X messages being sent to the LLM as context each time.

This PR sets this message context limit to 30 messages. The consequence of this change is that the model will have no memory of messages older than 30 messages, so it will be unable to answer questions or refer to information from more than 30 messages back.

30 messages was chosen as a reasonable amount of context to help the user accomplish whatever task they are working on in that moment, but not so much that irrelevant or stale messages are resent every time, adding large costs with little value.
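The sliding window described above amounts to slicing off the tail of the history before each request. A minimal sketch, assuming a chat-style message shape (the names here are illustrative, not the PR's actual code):

```typescript
// Illustrative message shape, similar to what chat SDKs use
type Message = { role: 'user' | 'assistant'; content: string };

// Hypothetical constant matching the limit described in this PR
const MAX_CONTEXT_MESSAGES = 30;

function trimMessageContext(messages: Message[]): Message[] {
  // Keep only the most recent MAX_CONTEXT_MESSAGES for the LLM request.
  // The full history remains untouched in frontend state; only the
  // payload sent to the model is trimmed.
  return messages.slice(-MAX_CONTEXT_MESSAGES);
}
```

`Array.prototype.slice` with a negative index already handles the short-history case: if there are 30 or fewer messages, the array is returned unchanged.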

Future

In the future, we can consider summarizing old messages before trimming the context so that the model at least has some history to refer to. Rewriting message history can get tricky though, so this will need more thought.
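One speculative shape for that future idea: collapse the stale messages into a single synthetic summary message before trimming. Everything below is a sketch, not implemented code; `summarize` is a placeholder for what would in practice be an asynchronous LLM call.

```typescript
type Message = { role: 'system' | 'user' | 'assistant'; content: string };

function trimWithSummary(
  messages: Message[],
  maxMessages: number,
  // Placeholder: a real summarizer would be an async LLM call
  summarize: (stale: Message[]) => string
): Message[] {
  if (messages.length <= maxMessages) return messages;

  // Messages that fall outside the sliding window
  const stale = messages.slice(0, messages.length - maxMessages);
  const recent = messages.slice(-maxMessages);

  // Prepend one synthetic message carrying the condensed history
  return [
    { role: 'system', content: `Summary of earlier conversation: ${summarize(stale)}` },
    ...recent,
  ];
}
```

The tricky part flagged above is exactly this rewriting step: the summary message is synthetic history, so it has to be regenerated or extended carefully as more messages fall out of the window.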