Open namchuai opened 1 month ago
Updates:
mmproj
param is added to /v1/models/start parameters in #1537We should ensure that model.yaml
supports this type of abstraction, cc @hahuyhoang411
@vansangpfiev and @hahuyhoang411 - can I get your thoughts to add to this list from my naive understanding?
To support Vision models on Cortex, we need the following:
@vansangpfiev and @hahuyhoang411 - can I get your thoughts to add to this list from my naive understanding?
To support Vision models on Cortex, we need the following:
- Download model - downloads .gguf and mmproj file -> What is the model pull UX?
- v1/models/start takes in model_path (.gguf) and mmproj parameters ✅
- /chat/completions to take in messages content image_url ✅
- image_url has to be encoded in base64 (via Jan, or link to tool eg https://base64.guru/converter/encode/image)
- model support - (side note: Jan currently supports BakLlava 1, llava 7B, Llava 13B) ..
- I'm not sure about this yet, since 1 folder can have multiple chat model files with 1 mmproj file.
- Yes
- I'm not sure if this is a good UX
- image_url can be a local path to image,
llama-cpp
engine support encoding image to base64 and pass it to model.llama-cpp
engine supports BakLlava 1, llava 7B, llava 13B.llama.cpp
upstream has already supportedMiniCPM-V 2.6
, we can integrate it tollama-cpp
.llama.cpp
upstream does not support llama3.2 vision yet.
We probably need to consider changing the UX for inferencing with vision model, for example:
cortex run llava-7b --image xx.jpg -p "What is in the image?"
Thank you @vansangpfiev and @hahuyhoang411! Quick notes from call:
Problem Statement
To support Vision models on Cortex, we need the following:
v1/models/start
takes inmodel_path
(.gguf) andmmproj
parameters/chat/completions
to take in messages contentimage_url
[ ] 5. model support - (side note: Jan currently supports BakLlava 1, llava 7B, Llava 13B)
1. Downloading model .gguf and mmprog file:
For fully compatible with Jan, cortex should be able to pull mmproj file along with GGUF file.
Let's take the image below for example.
Scenario steps:
.gguf
file) for user to select.mmproj
is also ended with.gguf
, we also listed that in the selection.So, we need to come up with a way so that cortex knows when to download the
mmproj
file along with traditional gguf file.cc @dan-homebrew , @louis-jan , @nguyenhoangthuan99, @vansangpfiev
Feature Idea
Couple of thoughts:
File name based. 1.1. For CLI: Ignore file name contains
mmproj
when presenting selection list. And download it along with selected traditional gguf file. 1.2. For API: Always scan the directory with same level as the URL provided. If there's ammproj
file name, cortex adds it to the download list.mmproj
file, return error with clear error message.Thinking / You tell me