Hi there. Thank you for this cool binding. I am using it in lollms, and it is the fastest one for running local models without a separate service, on par with ollama but with way less latency.
I just wonder whether there is support for multimodal models like llava. It would be great to have a code example. I could then add it to the lollms multimodal workflow, which would make it really the coolest.
For now it is just for text generation.

![image](https://github.com/turboderp/exllamav2/assets/827993/553d5ba8-6992-4ec9-acbf-67e33949e51a)