Mintplex-Labs / anything-llm

The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, and more.
https://anythingllm.com
MIT License

[BUG]: Vision models don't retain memory of images past one prompt #2585

Open sheneman opened 2 weeks ago

sheneman commented 2 weeks ago

How are you running AnythingLLM?

AnythingLLM desktop app

What happened?

When I upload a file, I can use a vision model like llama3.2-vision:11b to describe it, but then subsequent prompts don't have any memory of the image.

[screenshot]

I would expect to be able to ask repeated questions about the image, and for it to remain in my current context until the context window was exhausted.
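For context on why this expectation is reasonable: multimodal chat backends such as Ollama (which serves llama3.2-vision locally) attach images to individual messages in the conversation, so an image only stays "in context" if the client keeps it attached to the original user turn when resending history. The sketch below is hypothetical, not AnythingLLM's actual code; the `role`/`content`/`images` field names follow Ollama's `/api/chat` message format, and the base64 string is a placeholder.

```python
# Hypothetical sketch of a client that keeps an uploaded image attached to
# the original user turn, so follow-up prompts still "see" it. This is an
# illustration of the Ollama /api/chat message shape, not AnythingLLM code.
import json

def build_history(image_b64, first_question, first_answer, followup_question):
    """Return a message list where the image stays on the first user turn."""
    return [
        # Image is attached here; dropping it on later requests is the
        # failure mode described in this issue.
        {"role": "user", "content": first_question, "images": [image_b64]},
        {"role": "assistant", "content": first_answer},
        {"role": "user", "content": followup_question},
    ]

history = build_history("aGVsbG8=", "Describe this image.",
                        "It shows a bar chart.", "What colors are in it?")
payload = json.dumps({"model": "llama3.2-vision:11b", "messages": history})
# The image is still present in the serialized request for the follow-up turn.
assert "aGVsbG8=" in payload
```

If the client instead rebuilds history without the `images` field on prior turns, the model sees only the text transcript of its earlier description, which would explain the hallucinated details reported below.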

Are there known steps to reproduce?

No response

nekopep commented 1 day ago

Tried on the 1.2.3 Docker version with pixtral 12B, and it worked. <-- Correction: no, in fact I was lucky; see next comment. [screenshot] At least on the Docker version it looked right.

I just tried on the 1.2.4 desktop version with pixtral 12B and it fails: [screenshot]

It looks like if the information is not in the first description, it hallucinates colors (on the desktop version).

nekopep commented 1 day ago

Correction: I've seen the exact same behaviour on Docker :) (1.2.3, pixtral 12B).

[screenshot]