Additional Support for Vision Based Models Like Llava?

guinmoon / LLMFarm

Run Llama and other large language models offline on iOS and macOS using the GGML library.

https://llmfarm.site

MIT License

1.05k stars 62 forks

Additional Support for Vision Based Models Like Llava? #44

Closed (dataslayermedia closed 3 months ago)

dataslayermedia commented 4 months ago

I've tried to get this working with the out-of-the-box Llava models by providing local URLs, remote URLs, and base64-encoded images, all to no avail. The model runs and chats, but I'm not sure how best to feed images to it.

guinmoon commented 4 months ago

Hi, I plan to add support for multimodal models at a later date.

guinmoon commented 3 months ago

Done.