matlab-deep-learning / llms-with-matlab

Connect MATLAB to LLM APIs, including OpenAI® Chat Completions, Azure® OpenAI Services, and Ollama™

Ollama images #59

Closed ccreutzi closed 3 months ago

ccreutzi commented 3 months ago

As for the image encoding, Azure expects messages like this:

{ 
    "messages": [ 
        { "role": "system", "content": "You are a helpful assistant." }, 
        { "role": "user", "content": [  
            { 
                "type": "text", 
                "text": "Describe this picture:" 
            },
            { 
                "type": "image_url",
                "image_url": {
                    "url": "<image URL>"
                }
            }
        ] } 
    ], 
    "max_tokens": 2000 
} 

where <image URL> can be a “normal” URL like "https://somewhere.tld/filepath.png" or an inline data URL like "data:image/jpeg;base64,{base64_image}".
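For illustration, the inline form can be built from raw image bytes roughly as follows. This is a minimal Python sketch, not the code in this PR; the helper name `to_data_url` and its `mime_type` parameter are hypothetical.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as an inline data URL (hypothetical helper)."""
    # Base64-encode the bytes and decode to str so the result can go into JSON.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Placeholder bytes stand in for real image file content.
url = to_data_url(b"abc", "image/png")
print(url)
```

In practice the bytes would come from reading the image file in binary mode, and the MIME type should match the actual image format.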

Experiments show that multiple images can go either in the same "content" array or in separate "user" messages. With this change, we send them all in one "user" message.

The image encoding for OpenAI is the same as for Azure.
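As a sketch of the single-message layout described above, assuming the JSON shape shown earlier: the function name `build_user_message` and the second URL are hypothetical and only serve the example; this is not the MATLAB implementation in this PR.

```python
import json


def build_user_message(text, image_urls):
    """Build one "user" message holding the prompt text and all images.

    Hypothetical helper: one "text" part followed by one "image_url"
    part per image, all in the same "content" array.
    """
    content = [{"type": "text", "text": text}]
    for url in image_urls:
        content.append({"type": "image_url", "image_url": {"url": url}})
    return {"role": "user", "content": content}


message = build_user_message(
    "Describe these pictures:",
    [
        "https://somewhere.tld/filepath.png",
        "https://somewhere.tld/other.png",  # hypothetical second image
    ],
)
print(json.dumps(message, indent=2))
```

Each URL can be either a regular URL or an inline data URL; the message structure is the same in both cases.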

Ollama expects images in a different format, as base64 strings in a separate "images" field:

{
  "model": "llava",
  "messages": [
    {
      "role": "user",
      "content": "what is in this image?",
      "images": ["{base64_image}"]
    }
  ]
}
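
A rough Python sketch of building that payload, assuming the shape shown above; `build_ollama_message` is a hypothetical name, and the placeholder bytes stand in for real image data. Note that the "images" entries are the plain base64 strings, without the "data:...;base64," prefix used in the Azure/OpenAI format.

```python
import base64
import json


def build_ollama_message(text, images):
    """Build an Ollama-style "user" message (hypothetical helper).

    `images` is a list of raw image bytes; each one becomes a plain
    base64 string in the "images" array, next to a string "content".
    """
    encoded = [base64.b64encode(img).decode("ascii") for img in images]
    return {"role": "user", "content": text, "images": encoded}


payload = {
    "model": "llava",
    "messages": [build_ollama_message("what is in this image?", [b"abc"])],
}
print(json.dumps(payload, indent=2))
```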