Blaizzy / mlx-vlm

MLX-VLM is a package for running Vision LLMs locally on your Mac using MLX.

Support Chat API #111

Open madroidmaq opened 2 weeks ago

madroidmaq commented 2 weeks ago

It would be great to support a chat-style messages API (friendlier for multi-turn conversations); right now only generate-style prompting seems to be supported.

For example, something along the lines of the code sample from the Qwen/Qwen2-VL-2B-Instruct model card:

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg",
            },
            {"type": "text", "text": "Describe this image."},
        ],
    }
]

# apply messages
...
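For reference, the Qwen2-VL model card fills in that "apply messages" step roughly like this (it uses transformers and qwen_vl_utils, not mlx-vlm, so treat it as a sketch of the target behavior):

from transformers import AutoProcessor
from qwen_vl_utils import process_vision_info

processor = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-2B-Instruct")

# Render the chat messages into a single prompt string via the chat template.
text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
# Collect the image/video inputs referenced inside the messages.
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=[text],
    images=image_inputs,
    videos=video_inputs,
    padding=True,
    return_tensors="pt",
)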
Blaizzy commented 2 weeks ago

You can do this manually.

I'll check later whether there's a good way to integrate this. The problem is that not all models support this format.
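Roughly, the manual route looks like this; a minimal sketch, assuming the model's processor ships a chat template (which not all models do), with the model id below used only as an illustration:

from mlx_vlm import load

# Any model whose processor carries a chat template should work here;
# this particular id is just an example.
model, processor = load("mlx-community/Qwen2-VL-2B-Instruct-4bit")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image."},
        ],
    }
]

# Render the chat-style messages into the plain prompt string that
# generate() expects; the image is then passed alongside the prompt.
prompt = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)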

Blaizzy commented 2 weeks ago

Also we have a multi-turn example here:

https://github.com/Blaizzy/mlx-vlm/issues/68#issuecomment-2440233679
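The gist of that example (a sketch of the pattern, not the exact code from #68): keep the messages list across turns, append each model reply as an assistant turn, and re-apply the chat template before every new generation:

# Continuing the sketch above; `output` stands for the text the model
# produced on the previous turn.
messages.append({"role": "assistant", "content": output})
messages.append({"role": "user", "content": "What colors stand out?"})

# Re-render the full history so the model sees the whole conversation.
prompt = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)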

madroidmaq commented 2 weeks ago

> Also we have a multi-turn example here:
>
> #68 (comment)

Thanks, this example worked for me.

Blaizzy commented 2 weeks ago

My pleasure!