danny-avila / LibreChat

Enhanced ChatGPT Clone: Features Anthropic, AWS, OpenAI, Assistants API, Azure, Groq, o1, GPT-4o, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message search, langchain, DALL-E-3, ChatGPT Plugins, OpenAI Functions, Secure Multi-User System, Presets, completely open-source for self-hosting. Actively in public development.
https://librechat.ai/
MIT License
19.4k stars, 3.24k forks

[Bug]: Saved conversation structure collapse #4761

Closed quacrobat closed 6 days ago

quacrobat commented 1 week ago

What happened?

I had a long conversation saved last week. It had occasional branching where I edited my questions, never more than 3 or 4 branches wide. Now it shows as if the entire conversation has one message at the trunk and 42 branches on the same level. When I click on the < and > to navigate between these branches, it actually changes the root message, which is sometimes mine and sometimes the LLM's answer!

I tried both locally (running on Docker) and on Hugging Face. Both use the same DB, so I guess there is corruption at the DB level, WHICH IS VERY SCARY.

Steps to Reproduce

I noticed this today when I tried to find a message in that conversation. I searched for a term in the search bar, and it listed the relevant messages and the correct conversation. When I clicked on it, I found that the normal tree structure of the conversation had totally collapsed into a single trunk with 42 branches.

What browsers are you seeing the problem on?

Chrome

Relevant log output

The log shows 200 messages loaded, but I can only see them by navigating with the < 40/42 > paginator.

quacrobat commented 1 week ago

PS: I updated my local install to the latest version a few days ago, but the Hugging Face one hasn't been updated (unless it updates automatically when I rebuild). Both show the same issue. Updating the local/Docker version to the latest version again does not change anything.

quacrobat commented 1 week ago

Here's a sketch of the structure I see. I hope it makes sense!

WhatsApp Image 2024-11-19 at 22 16 13

nickmahdavi commented 1 week ago

Also having this issue — not sure if it's number of messages or chat lengths, but I noticed it kicking in with one of my chats every time it hit 20 messages or so. The collapse seems to happen if I export or fork the chat but not in the original thread. Although what does tend to happen in the original thread (past another length threshold) is #3813, likely related.

I also don't notice any corruption that happens in the actual JSON, say if I export the original chat.

xyqyear commented 1 week ago

I'm also experiencing the same issue. It seems that if you fork when there are more than 16 messages (user + assistant), the newly forked conversation will be messed up.

I've come up with a pretty consistent way to reproduce:

  1. Keep the conversation going for 9 rounds. The easiest way is to tell the model in the system prompt to just repeat whatever the user sent.
  2. Fork the chat at the last response (or at the 9th user query).
  3. The newly created conversation will be messed up.

There would be no problem if you fork at any point before that.

Another thing I found is that if the original conversation has branches, the original conversation might get corrupted as well. Say that after 8 rounds of conversation I change the 8th query to another message and get a new response on that branch, so the total number of messages in the conversation exceeds 16. In this case, if I create a fork at any point, the ORIGINAL conversation will be messed up as well. If the newly created conversation contains more than 16 messages, it too will be messed up.

So here is my summary:

  1. If there are no branches in the original conversation, the original conversation is left alone when forking.
  2. If the newly forked conversation has more than 16 messages, that new conversation will be corrupted.
  3. If the original conversation has any branching and has more than 16 messages, the ORIGINAL conversation will be messed up when forking. Rule 2 applies to the new conversation too.

Edit: After further investigation, I found that when a conversation has more than 16 messages and any branching, refreshing the page is enough to mess it up.

xyqyear commented 1 week ago

Continuing the conversation at https://github.com/danny-avila/LibreChat/pull/4772#issuecomment-2493833729

After some investigation, I found that the message list is sorted by createdAt in https://github.com/danny-avila/LibreChat/blob/7d5be6874790655e9060ed93dc67cf7c9a0fad52/api/models/Message.js#L253-L259

But after forking, all the newly created messages have their createdAt field set to the current time (is this intentional, @danny-avila?), so the order of the message list returned by the query is unpredictable. As a result, the tree built in buildTree.ts is also incorrect.
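To illustrate why list order matters, here is a minimal sketch, not the actual buildTree.ts code. It assumes messages carry `messageId` and `parentMessageId` fields and that the builder links each message to its parent in a single pass; the `Msg` and `TreeNode` types and both function names are hypothetical. Under those assumptions, a child that arrives before its parent is promoted to a root, which would match the "many branches at the trunk" symptom. A two-pass builder that indexes everything first does not depend on input order.

```typescript
// Hypothetical message shape; the field names are assumptions for illustration.
interface Msg {
  messageId: string;
  parentMessageId: string | null;
}

interface TreeNode extends Msg {
  children: TreeNode[];
}

// Single pass: a child seen before its parent cannot find it and becomes a root.
function buildTreeSinglePass(messages: Msg[]): TreeNode[] {
  const byId = new Map<string, TreeNode>();
  const roots: TreeNode[] = [];
  for (const m of messages) {
    const node: TreeNode = { ...m, children: [] };
    byId.set(m.messageId, node);
    const parent = m.parentMessageId ? byId.get(m.parentMessageId) : undefined;
    if (parent) parent.children.push(node);
    else roots.push(node);
  }
  return roots;
}

// Two passes: index every message first, then link, so input order is irrelevant
// for parent/child links (sibling order still follows the input order).
function buildTreeTwoPass(messages: Msg[]): TreeNode[] {
  const byId = new Map<string, TreeNode>();
  for (const m of messages) byId.set(m.messageId, { ...m, children: [] });
  const roots: TreeNode[] = [];
  for (const m of messages) {
    const node = byId.get(m.messageId)!;
    const parent = m.parentMessageId ? byId.get(m.parentMessageId) : undefined;
    if (parent) parent.children.push(node);
    else roots.push(node);
  }
  return roots;
}

const chain: Msg[] = [
  { messageId: "a", parentMessageId: null },
  { messageId: "b", parentMessageId: "a" },
  { messageId: "c", parentMessageId: "b" },
];
const scrambled: Msg[] = [chain[2], chain[0], chain[1]];

console.log(buildTreeSinglePass(chain).length);     // 1
console.log(buildTreeSinglePass(scrambled).length); // 2 ("c" became a root)
console.log(buildTreeTwoPass(scrambled).length);    // 1
```

The two-pass variant costs one extra loop over the list, but it tolerates any ordering returned by the query.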

My guess is that MongoDB only preserves insertion order when querying a small number of documents, hence the magic number 16.
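One way to avoid relying on insertion order at all would be to give the forked copies distinct, increasing timestamps. The sketch below is an assumption about how that could look, not LibreChat's fork code: `ForkMsg` and `cloneWithOrderedTimestamps` are hypothetical names, and a real fork would also regenerate message IDs, which is left out here.

```typescript
// Hypothetical stored message shape; field names are assumptions for illustration.
interface ForkMsg {
  messageId: string;
  parentMessageId: string | null;
  createdAt: Date;
}

// Copy messages that are already in their original order and give each copy a
// strictly increasing createdAt (1 ms apart), so sorting by createdAt
// reproduces that order. The input array and its objects are not modified.
function cloneWithOrderedTimestamps(ordered: ForkMsg[], start: Date = new Date()): ForkMsg[] {
  return ordered.map((m, i) => ({ ...m, createdAt: new Date(start.getTime() + i) }));
}

const forkTime = new Date("2024-11-20T00:00:00Z");
const original: ForkMsg[] = [
  { messageId: "a", parentMessageId: null, createdAt: new Date("2024-11-10T10:00:00Z") },
  { messageId: "b", parentMessageId: "a", createdAt: new Date("2024-11-10T10:01:00Z") },
  { messageId: "c", parentMessageId: "b", createdAt: new Date("2024-11-10T10:02:00Z") },
];

const forked = cloneWithOrderedTimestamps(original, forkTime);
const sortedIds = [...forked]
  .sort((x, y) => x.createdAt.getTime() - y.createdAt.getTime())
  .map((m) => m.messageId);
console.log(sortedIds.join(",")); // a,b,c
```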

quacrobat commented 6 days ago

Fantastic work @xyqyear - and I realize I had a similar bug in a different project due to createdAt in MongoDB... @danny-avila hopefully this means this is a front-end issue only, and once fixed the corrupted conversations will appear correctly again?

danny-avila commented 6 days ago

> Fantastic work @xyqyear - and I realize I had a similar bug in a different project due to createdAt in MongoDB... @danny-avila hopefully this means this is a front-end issue only, and once fixed the corrupted conversations will appear correctly again?

Unfortunately it's a backend issue. The folding conversations can be fixed, but it would be a lot of overhead to fix the issue retroactively. Maybe I can make a script to fix specific conversations?
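As a rough idea of what such a per-conversation repair could do, here is a sketch under stated assumptions; it is not an existing LibreChat script. It assumes the parent links (`parentMessageId`) are still intact and only the timestamps are unreliable, and it reassigns `createdAt` so that every parent sorts before its children. `RepairMsg` and `retimestampParentFirst` are hypothetical names, and reading from or writing to the database is left out.

```typescript
// Hypothetical stored message shape; field names are assumptions for illustration.
interface RepairMsg {
  messageId: string;
  parentMessageId: string | null;
  createdAt: Date;
}

// Walk the conversation breadth-first from its roots and hand out increasing
// timestamps (1 ms apart, starting at the earliest existing createdAt), so a
// createdAt sort always yields parents before children. Siblings keep their
// relative input order. Messages caught in a parent cycle are unreachable from
// any root and are omitted from the result.
function retimestampParentFirst(messages: RepairMsg[]): RepairMsg[] {
  const ids = new Set(messages.map((m) => m.messageId));
  const childrenOf = new Map<string, RepairMsg[]>();
  const roots: RepairMsg[] = [];
  for (const m of messages) {
    if (m.parentMessageId !== null && ids.has(m.parentMessageId)) {
      const siblings = childrenOf.get(m.parentMessageId) ?? [];
      siblings.push(m);
      childrenOf.set(m.parentMessageId, siblings);
    } else {
      roots.push(m); // no parent, or the parent is not part of this conversation
    }
  }
  const base = Math.min(...messages.map((m) => m.createdAt.getTime()));
  const repaired: RepairMsg[] = [];
  const queue: RepairMsg[] = [...roots];
  while (queue.length > 0) {
    const m = queue.shift()!;
    repaired.push({ ...m, createdAt: new Date(base + repaired.length) });
    queue.push(...(childrenOf.get(m.messageId) ?? []));
  }
  return repaired;
}

// All messages share one timestamp, as they would after a fork.
const t = new Date("2024-11-20T00:00:00Z");
const broken: RepairMsg[] = [
  { messageId: "c", parentMessageId: "b", createdAt: t },
  { messageId: "a", parentMessageId: null, createdAt: t },
  { messageId: "b", parentMessageId: "a", createdAt: t },
  { messageId: "b2", parentMessageId: "a", createdAt: t },
];
const fixedIds = retimestampParentFirst(broken).map((m) => m.messageId);
console.log(fixedIds.join(",")); // a,b,b2,c
```

Note that this only restores a parent-before-child ordering. The original chronological order among siblings cannot be recovered if the timestamps were already overwritten.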