turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License

Support for multimodal models #285

Open ParisNeo opened 5 months ago

ParisNeo commented 5 months ago

Hi there. Thank you for this cool binding. I am using it in lollms, and it is the fastest backend for local models that doesn't rely on a separate service, on par with ollama but with much lower latency.

I was just wondering whether there is support for multimodal models like LLaVA. It would be great to have a code example; I could add it to the lollms multimodal workflow, which would really make it the coolest option.

For now it only does text generation.
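
For context, a LLaVA-style pipeline pairs a CLIP vision encoder and a small projector with an ordinary LLM: the projected image embeddings are spliced into the token stream where an `<image>` placeholder sits, and decoding then proceeds as with any text model. Below is a minimal sketch using Hugging Face transformers, not exllamav2's API (which, per this issue, is text-only so far); the checkpoint id, image path, and prompt template are assumptions.

```python
# Sketch of LLaVA-style multimodal inference via Hugging Face transformers.
# NOT exllamav2 code -- for illustration of what multimodal support entails.
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"  # example checkpoint (assumption)
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# The processor tokenizes the text and preprocesses the image; the model
# encodes the image with its vision tower, projects the features, and
# inserts them at the <image> placeholder before generating.
image = Image.open("photo.jpg")  # hypothetical local image
prompt = "USER: <image>\nWhat is shown in this image? ASSISTANT:"
inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```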