SciSharp / LLamaSharp

A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.
https://scisharp.github.io/LLamaSharp

LLAVA Configuration #737

Closed hswlab closed 4 weeks ago

hswlab commented 1 month ago

Description

I'm having difficulty figuring out how to correctly configure the LLava example.

First, I initialized the backend with the paths to libllama.dll and llava_shared.dll via NativeLibraryConfig.Instance.WithLibrary(llamaPath, llavaPath);

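For reference, a minimal sketch of that initialization (in place of the screenshot; the DLL paths are placeholders for wherever your native builds live):

```cs
// Minimal sketch of the backend initialization described above; the DLL
// paths are placeholders, not real locations.
using LLama.Native;

string llamaPath = @"runtimes\win-x64\libllama.dll";     // placeholder path
string llavaPath = @"runtimes\win-x64\llava_shared.dll"; // placeholder path

// Must run before any other LLamaSharp call loads the native backend.
NativeLibraryConfig.Instance.WithLibrary(llamaPath, llavaPath);
```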

Then I tried to implement something like what is shown in this example. I don't understand where to find the suitable models I need for:

```cs
string multiModalProj = UserSettings.GetMMProjPath();
string modelPath = UserSettings.GetModelPath();
```


modelPath, I believe, is the model I can download here. But what is a clipModel, and where can I get it?

SignalRT commented 1 month ago

You can see the URLs of the models used in the example and the unit tests in LLama.Unittest.csproj:

https://huggingface.co/cjpais/llava-1.6-mistral-7b-gguf/resolve/main/llava-v1.6-mistral-7b.Q3_K_XS.gguf
https://huggingface.co/cjpais/llava-1.6-mistral-7b-gguf/resolve/main/mmproj-model-f16.gguf
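As a hedged aside (this is not how the repository's build does it), the two files could be fetched with plain HttpClient; the URLs are the ones quoted above:

```cs
// Sketch of downloading both GGUF files; the URLs are those quoted above.
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

class ModelDownloader
{
    static async Task Main()
    {
        string[] urls =
        {
            "https://huggingface.co/cjpais/llava-1.6-mistral-7b-gguf/resolve/main/llava-v1.6-mistral-7b.Q3_K_XS.gguf",
            "https://huggingface.co/cjpais/llava-1.6-mistral-7b-gguf/resolve/main/mmproj-model-f16.gguf",
        };

        using var http = new HttpClient();
        foreach (var url in urls)
        {
            var fileName = Path.GetFileName(new Uri(url).LocalPath);
            if (File.Exists(fileName)) continue; // skip files already on disk

            await using var source = await http.GetStreamAsync(url);
            await using var target = File.Create(fileName);
            await source.CopyToAsync(target); // stream to disk instead of buffering
        }
    }
}
```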

You will find both files in the repository of any vision model.

hswlab commented 1 month ago

Ah, thank you. So both models can be found on Hugging Face. That's completely new to me; usually I use just a single model ^^'

SignalRT commented 1 month ago

Yes, you should download both files for the model you choose to use. Normally a repository will contain several quantized models and one projection model.

LLaVA uses CLIP together with a multimodal projector (mmproj).

You can find the details in this paper:

https://arxiv.org/pdf/2310.03744
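
To make the two files concrete, here is a sketch modeled on LLamaSharp's LLava example (exact APIs may differ between versions; the file names are the ones linked above):

```cs
// Sketch based on the LLava example; APIs may vary across LLamaSharp versions.
using LLama;
using LLama.Common;

string modelPath = "llava-v1.6-mistral-7b.Q3_K_XS.gguf"; // a quantized language model
string multiModalProj = "mmproj-model-f16.gguf";         // the CLIP/projection weights

var parameters = new ModelParams(modelPath)
{
    ContextSize = 4096
};

// The language model and the multimodal projector are loaded separately...
using var model = LLamaWeights.LoadFromFile(parameters);
using var context = model.CreateContext(parameters);
using var clipModel = LLavaWeights.LoadFromFile(multiModalProj);

// ...and combined in the executor, which can then accept image input.
var executor = new InteractiveExecutor(context, clipModel);
```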

AsakusaRinne commented 1 month ago

Maybe some documentation is necessary. :D