Closed tigerrabbit closed 1 month ago
For now, we don't support a server for multimodal models; you can try nexa run llava1.6-mistral instead.
If you want to host a local server for multimodal models: we will improve our SDK to support a multimodal model server in the future.
+1 for multimodal models server support
I am in love with the product. I was impressed with the code when I dug into it using the editable install option. I am going to keep up with this project and look forward to this option. I am sure there are some gotchas in implementing this feature that I am not even aware of.
we will include this in our roadmap
+1 for multimodal models server support
Thank you, we are prioritizing this and expect it to be delivered in the next 2 weeks.
@tigerrabbit @JGalego Hi,
I'm glad to announce that your feature request has just been fulfilled in #154 . You can now try the VLM API via the /chat/completions route. Note that a VLM model (such as llava1.6-vicuna) must be loaded, and the request body must be properly formatted for multimodal input.
For more details, refer to our CLI docs here: CLI
Keep an eye on our upcoming releases; this update will soon be available through pip install.
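For readers who want a head start on the request format: below is a minimal sketch of building a multimodal request body for the local server's /chat/completions route. The exact schema is an assumption based on the common OpenAI-compatible chat format (text plus base64 `image_url` content parts); the `build_vlm_request` helper, the port, and the payload shape are illustrative, not the official API — check the CLI docs for the authoritative format.

```python
import base64
import json

def build_vlm_request(prompt: str, image_bytes: bytes,
                      model: str = "llava1.6-vicuna") -> dict:
    """Build an OpenAI-style chat/completions body with an inline base64 image.

    NOTE: hypothetical helper; the field names assume the server follows the
    OpenAI-compatible multimodal message format.
    """
    image_b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,  # the VLM must already be loaded by the server
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/png;base64,{image_b64}"},
                    },
                ],
            }
        ],
    }

if __name__ == "__main__":
    body = build_vlm_request("What is in this image?", b"\x89PNG...")
    print(json.dumps(body, indent=2))
    # Then send it to the local server, e.g.:
    #   requests.post("http://localhost:8000/chat/completions", json=body)
```

The port and route are whatever your local `nexa server` instance exposes; only the body construction is shown here so it can be adapted once the official docs confirm the schema.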
Thank you,
Perry Nexa AI
Issue Description
when running server llava1.6-mistral - unable to load model

```
nexa server llava1.6-vicuna
objc[12203]: Class GGMLMetalClass is implemented in both /Applications/Nexa.app/Contents/Frameworks/libggml_llama.dylib (0x12083c260) and /Applications/Nexa.app/Contents/Frameworks/nexa/gguf/lib/libstable-diffusion.dylib (0x1230a0260). One of the two will be used. Which one is undefined.
INFO:     Started server process [12203]
INFO:     Waiting for application startup.
2024-10-02 12:55:30,162 - INFO - Model Path: llava1.6-vicuna
Model llava-v1.6-vicuna-7b:model-q4_0 already exists at /Users/gg/.cache/nexa/hub/official/llava-v1.6-vicuna-7b/model-q4_0.gguf
ERROR:    Traceback (most recent call last):
  File "starlette/routing.py", line 693, in lifespan
  File "starlette/routing.py", line 569, in __aenter__
  File "starlette/routing.py", line 670, in startup
  File "nexa/gguf/server/nexa_service.py", line 343, in startup_event
  File "nexa/gguf/server/nexa_service.py", line 203, in load_model
ValueError: Model llava1.6-vicuna not found in Model Hub
ERROR:    Application startup failed. Exiting.
```
Steps to Reproduce
Run nexaai server llava1.6-mistral and compare to nexaai server llava1.6-vicuna
OS
macOS (latest)
Python Version
3.12.6
Nexa SDK Version
latest
GPU (if using one)
Apple Metal (macOS)