eth-sri / lmql

A language for constraint-guided and efficient LLM programming.
https://lmql.ai
Apache License 2.0
3.65k stars 197 forks

Vision Support #266

Open ambroser53 opened 11 months ago

ambroser53 commented 11 months ago

Are there any plans to integrate image input into LMQL? With the new GPT-4V and lightweight open-source vision-language models such as MPlug-Owl, it would be incredibly useful. I work with MPlug-Owl quite a lot, so I would be happy to investigate this if someone could point me in the right direction on where to start. A discussion of how multi-modal prompts are typically constructed for open-source models would also help in getting this working for more than just GPT-4.

lbeurerkellner commented 11 months ago

We definitely want to support this. There has already been some discussion about it on Discord. See the #dev channel for more details.