Hey, thanks for taking the time to look through the codebase.
The summarization should be transparent to the LLM because it's an unrecoverable error: there's nothing the LLM can do, by definition of its context window limitations, if you cannot summarize properly (vs. not being able to run a function successfully).
hehe
There is no `finish_reason` in `./memgpt/local_llm`. It is supported by the OpenAI backend, which is also used in this repo, here: https://github.com/cpacker/MemGPT/blob/main/memgpt/agent.py#L148-L150

The exception raised there will eventually lead to `summarize_messages_inplace` being called and `step` being run again: https://github.com/cpacker/MemGPT/blob/main/memgpt/agent.py#L1074-L1080 (see the sketch below). But I think this is just a workaround. It is not perfect, although it does work. This decision should really be made by the LLM. Anyway, let's talk about that later.
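For reference, the recovery path described above boils down to a catch/summarize/retry loop. A minimal sketch follows; `ContextWindowExceededError` and `step_with_recovery` are hypothetical names used only for illustration, and the real code is at the agent.py link above.

```python
# Minimal sketch of the catch/summarize/retry pattern described above.
# ContextWindowExceededError and step_with_recovery are hypothetical names;
# see the linked agent.py lines for the actual implementation.

class ContextWindowExceededError(Exception):
    """Raised when the prompt no longer fits in the model's context window."""


def step_with_recovery(agent, user_message):
    try:
        # Normally the reply is produced in a single step.
        return agent.step(user_message)
    except ContextWindowExceededError:
        # The LLM cannot fix an overflowing context by itself, so handle it
        # transparently: compact the history, then retry the same step.
        agent.summarize_messages_inplace()
        return agent.step(user_message)
```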
Now, let's have a look at why issue https://github.com/cpacker/MemGPT/issues/252 occurs. There is no `finish_reason: "length"` replied by `./memgpt/local_llm` when the prompt exceeds the maximum context length; instead, it raises an exception, which is not caught by `get_ai_reply_async` as we expected, but by `handle_ai_response`.

_`handle_ai_response` should record the function error message in the message history and feed it back to the LLM in the next step. We expect the LLM to make the right decision, but it does not. Why? It is probably a prompt-engineering problem. Let's discuss that later._

It's time to draw a conclusion. To solve issue https://github.com/cpacker/MemGPT/issues/252, there are two approaches:

Approach 1: quick fix. Make `./memgpt/local_llm` return `finish_reason: "length"` instead of raising an exception (a sketch is given below).
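A minimal sketch of what Approach 1 could look like, assuming the local backend signals overflow with an exception. `ContextOverflowError`, `render_prompt`, and `backend.complete` are hypothetical names for illustration, not the actual local_llm API:

```python
# Hypothetical wrapper illustrating Approach 1: translate a context-overflow
# failure from the local backend into an OpenAI-style response with
# finish_reason == "length", so the existing "length" handling upstream
# (get_ai_reply_async) can trigger summarization, just like with OpenAI.

class ContextOverflowError(Exception):
    """Hypothetical error the local backend raises when the prompt is too long."""


def render_prompt(messages):
    # Placeholder prompt rendering; the real local_llm code builds a
    # model-specific prompt from the message list.
    return "\n".join(m["content"] for m in messages)


def get_chat_completion(messages, backend):
    try:
        return backend.complete(render_prompt(messages))
    except ContextOverflowError:
        # Report overflow the way the OpenAI backend does instead of
        # letting the exception escape to handle_ai_response.
        return {
            "choices": [
                {
                    "message": {"role": "assistant", "content": None},
                    "finish_reason": "length",
                }
            ]
        }
```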
Approach 2: prompt engineering. Make `./memgpt/local_llm` raise a more descriptive exception whose message is recorded in the message history and fed back to the LLM in the next round, so the LLM itself can decide how to recover (sketched below).
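A sketch of Approach 2, again with hypothetical names (`LocalLLMContextOverflowError` is not an existing class). The point is that the exception message is informative enough that, once `handle_ai_response` records it in the message history, the LLM can decide what to do on the next step:

```python
# Hypothetical, more descriptive exception for Approach 2. The idea is that
# handle_ai_response records str(exc) as a function error message in the
# message history, so the LLM sees it on the next round and can decide how
# to recover (e.g. compact the history or shorten its output).

class LocalLLMContextOverflowError(Exception):
    def __init__(self, prompt_tokens, max_tokens):
        super().__init__(
            f"Request failed: the rendered prompt is {prompt_tokens} tokens, "
            f"but the model's context window is only {max_tokens} tokens. "
            f"The conversation history must be compacted before retrying."
        )
        self.prompt_tokens = prompt_tokens
        self.max_tokens = max_tokens


# Example raise site inside the local backend wrapper (illustrative only):
#     if prompt_tokens > max_tokens:
#         raise LocalLLMContextOverflowError(prompt_tokens, max_tokens)
```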
Finally, once Approach 2 works, we can remove the `length` catch in `get_ai_reply_async` and remove `summarize_messages_inplace`. Wait a minute: who knows whether the compacted messages are good enough or not? Only experiments can tell us.