Enable passing images along with text for models that support multimodal input

microsoft / TypeChat

TypeChat is a library that makes it easy to build natural language interfaces using types.

https://microsoft.github.io/TypeChat/

MIT License

Enable passing images along with text for models that support multimodal input #254

Closed: hillary-mutisya closed this issue 3 months ago

hillary-mutisya commented 4 months ago

GPT-4-vision, GPT-4-omni and GPT-4-turbo allow multimodal input, where images and text can be passed together in the prompt. To support this, the `content` field of a prompt message needs to accept an array of objects (one per text or image part) instead of just a string.
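
To make the request concrete, here is a minimal sketch of what such a message shape could look like. The part shapes (`type: "text"` and `type: "image_url"`) follow the OpenAI chat completions format; the names `TextPart`, `ImagePart`, `ContentPart`, `PromptMessage` and `userMessage` are hypothetical and are not TypeChat's actual API, and the image URL is a placeholder.

```typescript
// Hypothetical type names for illustration only; not part of TypeChat.
// The object shapes mirror the OpenAI chat completions content-part format.
type TextPart = { type: "text"; text: string };
type ImagePart = {
    type: "image_url";
    image_url: { url: string; detail?: "auto" | "low" | "high" };
};
type ContentPart = TextPart | ImagePart;

interface PromptMessage {
    role: "system" | "user" | "assistant";
    // Plain string for text-only prompts, array of parts for multimodal prompts.
    content: string | ContentPart[];
}

// Builds a user message; falls back to a plain string when no images are given,
// so text-only callers keep the existing shape.
function userMessage(text: string, imageUrls: string[] = []): PromptMessage {
    if (imageUrls.length === 0) {
        return { role: "user", content: text };
    }
    const parts: ContentPart[] = [
        { type: "text", text },
        ...imageUrls.map((url): ImagePart => ({ type: "image_url", image_url: { url } })),
    ];
    return { role: "user", content: parts };
}

// Placeholder URL, assumed for the example.
const message = userMessage("What is in this image?", ["https://example.com/cat.png"]);
console.log(JSON.stringify(message, null, 2));
```

Keeping `content` as a union of `string` and an array means existing text-only prompts would not need to change, while multimodal callers opt in by passing image URLs.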