Filter vector search results in separate temporary context to reduce main context size

@FreePhoenix888 I mean we can make not one, but two requests to GPT-4 for single message. The first request will filter data we got from vector search, and the second request will use that filtered data to answer user's request. This task does not have high priority and can be done in the fork of that repository. The only effect from this task it reduced size of each message in the context. That means we can send history of messages to GPT-4. At the moment we get a lot of results from vector search, and right now bot supports only history/context that contains single message. Completion of this task will enable users to make continuous conversations with a bot. For example, to clarify details of the case in each next message.

deep-foundation / russian-laws-bot

Filter vector search results in separate temporary context to reduce main context size #4