hiyaryan / the-cdj

The Cognitive Distortion Journal (CDJ) is a smart journaling tool that helps remedy distorted thinking. It can feel impossible to follow the CBT technique of labeling distorted thinking and finding alternative modes of thought (i.e. reframing) while cognitive distortions are occurring. The CDJ does that work for you. -- The CDJ is in beta testing!!
https://thecdj.app
3 stars 0 forks source link

Summarize messages. #94

Open hiyaryan opened 7 months ago

hiyaryan commented 7 months ago

Long conversations should be summarized. This will do two things, reduce the token cost and reduce the possibility for a hallucination. The more context the LLM has to work with and process, the more likely it's going to make something up.

gpt-4 model has been excellent in terms of chatting. I have personally had very good conversations with it in the context of my own journal entries. This model should still be used, but it is not good with returning json. The 1106 models, which are slower, are significantly better. There has not been a single instance of it returning the wrong data structure.

This means that it most likely require another api call to an 1106 model after some number of chat messages are sent. This can be based on the token count. The goal is to keep the chat completions under some token amount. When it reaches the number, 1106 should be utilized to summarize the chat. This is based on the users config for their journal. Then the context sent to the gpt for chat, will be the summary.

This will require careful prompt engineering to ensure that the LLM summarizes the conversation in a way that still makes the experience with it conversational as opposed to descriptive.

Consider if this is actually going to save on token cost. If the same amount of messages are being sent to gpt-4 with additional calls to 1106 to summarize, it may be using more tokens. Assuming that it does, reducing the hallucinations and maintaining the quality of the conversation is worth the cost.