Just realized that multimodal support got dropped (temporarily) for llama-server.
Any updates on when this feature will be available again?
Would also love to know.
Do we know in which commit or PR support got dropped?
The latest release to support LLaVA with the server was b2356.
So I cannot use moondream2 or Bunny VLMs with the server? What is the alternative?
Create a directory LMStudio/models/WHATEVER/local/moondream2/, copy moondream2-mmproj-f16.gguf and moondream2-text-model-f16.gguf into it, and select the Alpaca preset.
Profit.
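In shell terms, a minimal sketch of the same steps (WHATEVER is just a placeholder folder name; run this from wherever LM Studio keeps its models directory on your system):

# placeholder paths; adjust to your LM Studio models root
mkdir -p LMStudio/models/WHATEVER/local/moondream2
cp moondream2-mmproj-f16.gguf moondream2-text-model-f16.gguf \
   LMStudio/models/WHATEVER/local/moondream2/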
@ali0une Thanks, but I need this as a server, could that work?
KoboldCPP maintains LLaVA support.
This issue was closed because it has been inactive for 14 days since being marked as stale.
What happened?
Hey everyone, I am currently trying to set up the llama.cpp server with a LLaVA vision model. When using llama-llava-cli, everything works just fine:
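For example, an invocation along these lines works (model, projector, image path, and prompt below are placeholders, not necessarily the exact command used):

# placeholder paths and prompt, for illustration only
./llama-llava-cli -m llava-model.gguf --mmproj mmproj-model-f16.gguf \
    --image some-image.jpg -p "Describe this image."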
However, I can't get the same functionality to work when using the llama.cpp server. Running this command ...
... and trying to use the image with the llama.cpp server frontend results in the error message:
Trying to use the completion or OpenAI-style chat API with images in the payload results in the image simply being ignored entirely (base64 or URL, doesn't matter). The server logs also show no indication of CLIP being used. Does anybody have an idea what might be going on? This Reddit thread from months ago shows this functionality in action.
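For reference, this is the general shape of such a request against the server's OpenAI-compatible endpoint (host, port, prompt, and image data below are placeholders, not the exact payload used):

# sketch of an OpenAI-style chat request with an inline base64 image
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "user",
        "content": [
          { "type": "text", "text": "What is in this image?" },
          { "type": "image_url", "image_url": { "url": "data:image/jpeg;base64,<BASE64_IMAGE>" } }
        ] }
    ]
  }'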
Name and Version
./llama-llava-cli --version
version: 3417 (3d0e4367)
built with cc (Ubuntu 13.2.0-23ubuntu4) 13.2.0 for x86_64-linux-gnu

./llama-server --version
version: 3417 (3d0e4367)
built with cc (Ubuntu 13.2.0-23ubuntu4) 13.2.0 for x86_64-linux-gnu
What operating system are you seeing the problem on?
Linux
Relevant log output