LostRuins / koboldcpp

Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
https://github.com/lostruins/koboldcpp
GNU Affero General Public License v3.0

LLaVA multimodal projection stops working after New Session #821

Closed by iScriptLex 5 months ago

iScriptLex commented 5 months ago

System: Kubuntu 22.04, GeForce 4090
KoboldCPP version: 1.64
Models: kunoichi-7b.Q8_0.gguf, BuRP_7B-Q8_0-imat.gguf, or any other Mistral-based 7B model
Mmproj: mistral-7b-mmproj-v1.5-Q4_1.gguf
Command line: ./koboldcpp-linux-x64 --model models/BuRP_7B-Q8_0-imat.gguf --usecublas --gpulayers 10000 --contextsize 4096 --preloadstory startup.json --mmproj mmproj/mistral-7b-mmproj-v1.5-Q4_1.gguf

  1. Add image for recognition (Add Img -> Upload Image File)
  2. Add text "Describe this image" to chat
  3. Click "Generate More".

The model describes the image successfully: objects, composition, etc.

  1. After that, click "New Session" (this removes the uploaded image and the generated text)
  2. Add a different image (NOT the same one) for recognition
  3. Add text "Describe this image" to chat
  4. Click "Generate More".

The model talks nonsense that has nothing to do with the uploaded image. All subsequently uploaded images are also not recognized. Image recognition starts working again only after KoboldCPP is fully restarted.

KoboldCPP 1.61.2 works well and doesn't have this bug.
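
For anyone who wants to check this without clicking through the UI, here is a rough sketch of the same two-image test against the HTTP API. It rests on assumptions that are not confirmed in this thread: that the server listens on http://localhost:5001, that /api/v1/generate accepts base64 images in an "images" list, and that the reply has the shape results[0].text. The helper names are made up for illustration. It also does not press "New Session", so it may not hit exactly the same code path as the steps above.

```python
import base64
import json
import sys
import urllib.request

# Assumption: local KoboldCpp instance on its default port, started with --mmproj.
BASE_URL = "http://localhost:5001"


def build_payload(image_bytes: bytes, prompt: str = "Describe this image.") -> dict:
    """Build a generate request body with one base64-encoded image attached."""
    return {
        "prompt": prompt,
        "max_length": 120,
        # Assumption: the generate endpoint reads base64 images from "images".
        "images": [base64.b64encode(image_bytes).decode("ascii")],
    }


def describe(image_path: str, base_url: str = BASE_URL) -> str:
    """Send one image to the server and return the generated description."""
    with open(image_path, "rb") as f:
        payload = build_payload(f.read())
    req = urllib.request.Request(
        base_url + "/api/v1/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=300) as resp:
        body = json.load(resp)
    # Assumption: response looks like {"results": [{"text": "..."}]}.
    return body["results"][0]["text"]


if __name__ == "__main__":
    # Pass two or more different image paths; each is sent as its own request.
    for path in sys.argv[1:]:
        print(path, "->", describe(path))
```

Run it with two different image files as arguments and compare the descriptions. If the second one is unrelated to its image, that would match the behaviour described above.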

LostRuins commented 5 months ago

Thanks for reporting. It's a known issue; I'm trying to find the commit that caused it.

Linked: https://github.com/ggerganov/llama.cpp/issues/7060

LostRuins commented 5 months ago

Hi, please try the hotfix 1.64.1 and let me know if that works.

iScriptLex commented 5 months ago

Thank you very much, 1.64.1 works well.

LostRuins commented 5 months ago

Great. We can use my solution until upstream fixes it properly.