
eth-sri / lmql

A language for constraint-guided and efficient LLM programming.

https://lmql.ai

Apache License 2.0

3.72k stars 202 forks

Vision Support #266

Open ambroser53 opened 1 year ago

ambroser53 commented 1 year ago

Are there any plans to integrate image input into LMQL? With the new GPT-4V and lightweight open-source vision-language models such as MPlug-Owl, this would be incredibly useful. I work with MPlug-Owl quite a lot, so I would be happy to investigate this if someone could point me in the right direction on where to start. A discussion of how multi-modal prompts are typically constructed for open-source models could also help get this working for more than just GPT-4.

lbeurerkellner commented 1 year ago

We definitely want to support this. There has already been some discussion about this on Discord. See the #dev channel for more details.