matrixgpt / matrix-chatgpt-bot

Talk to ChatGPT via any Matrix client!
GNU Affero General Public License v3.0

Support `gpt-4-vision-preview` #247

Open PaarthShah opened 11 months ago

PaarthShah commented 11 months ago

https://platform.openai.com/docs/guides/vision

Uploading base64-encoded images looks like a viable general strategy for passing images through the API.

Alternatively, for speed, in unencrypted rooms it may be possible (and preferable) to convert the image's mxc URL to an https URL and pass that via the image_url key.
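A rough sketch of both options, in case it helps the discussion. The helper names `mxcToHttps` and `toDataUrl` are made up for illustration and are not part of this bot. The download path assumes the homeserver still serves the unauthenticated `/_matrix/media/v3/download` endpoint; homeservers that require authenticated media would not be reachable by OpenAI this way, so the base64 route may be the only option there.

```typescript
// Hypothetical helpers; the names are illustrative, not part of matrix-chatgpt-bot.

// Convert mxc://<serverName>/<mediaId> into an HTTP(S) download URL.
// Assumption: the homeserver exposes the unauthenticated
// /_matrix/media/v3/download endpoint.
function mxcToHttps(mxcUrl: string, homeserverBaseUrl: string): string {
  const prefix = "mxc://";
  if (!mxcUrl.startsWith(prefix)) {
    throw new Error(`Not an mxc URL: ${mxcUrl}`);
  }
  const rest = mxcUrl.slice(prefix.length);
  const slash = rest.indexOf("/");
  // Both the server name and the media ID must be non-empty.
  if (slash <= 0 || slash === rest.length - 1) {
    throw new Error(`Malformed mxc URL: ${mxcUrl}`);
  }
  const serverName = rest.slice(0, slash);
  const mediaId = rest.slice(slash + 1);
  // Strip trailing slashes so the joined URL has exactly one separator.
  const base = homeserverBaseUrl.replace(/\/+$/, "");
  return `${base}/_matrix/media/v3/download/${encodeURIComponent(serverName)}/${encodeURIComponent(mediaId)}`;
}

// Wrap already base64-encoded image bytes in a data URL.
// In Node, the base64 string can be produced with
// Buffer.from(bytes).toString("base64").
function toDataUrl(mimeType: string, base64Data: string): string {
  return `data:${mimeType};base64,${base64Data}`;
}

console.log(mxcToHttps("mxc://matrix.org/AbCdEf123", "https://matrix.org/"));
console.log(toDataUrl("image/png", "iVBORw0KGgo="));
```

Either resulting string would go into the `url` field under the `image_url` key. For encrypted rooms the media has to be downloaded and decrypted first, so only the data URL variant would apply.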

max298 commented 11 months ago

As far as I can tell, we're limited by the library we use for API communication, which does not yet support vision. That said, I'm very interested and will check what we can do as soon as the library adds support.

Dual-0 commented 11 months ago

I've opened a request.

max298 commented 11 months ago

I think we might consider dropping the third-party SDK and switching to the official Node package from OpenAI: https://github.com/openai/openai-node#readme, which seems to support vision.
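For reference, a minimal sketch of what a vision request body looks like in the Chat Completions format described in the vision guide. The types are declared locally so the snippet runs without the `openai` package installed, and `buildVisionRequest` is a hypothetical helper name, not something the package provides.

```typescript
// Local types mirroring the documented Chat Completions vision message shape.
type TextPart = { type: "text"; text: string };
type ImagePart = { type: "image_url"; image_url: { url: string } };
type UserMessage = { role: "user"; content: Array<TextPart | ImagePart> };

interface VisionRequest {
  model: string;
  max_tokens: number;
  messages: UserMessage[];
}

// Hypothetical helper: build a request body for one prompt plus one image.
// imageUrl may be an https URL or a base64 data URL.
function buildVisionRequest(
  prompt: string,
  imageUrl: string,
  maxTokens: number = 300, // arbitrary default chosen for this sketch
): VisionRequest {
  return {
    model: "gpt-4-vision-preview",
    max_tokens: maxTokens,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          { type: "image_url", image_url: { url: imageUrl } },
        ],
      },
    ],
  };
}

const request = buildVisionRequest(
  "What is in this image?",
  "https://example.org/cat.png",
);
console.log(JSON.stringify(request, null, 2));
```

With the official package, a body of this shape would be passed to `client.chat.completions.create(...)` on an `OpenAI` client instance. That call is left out here so the snippet has no network or API key dependency, and it has not been tested against the live API.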

PaarthShah commented 11 months ago

Going with the official Node library seems like the best option for long-term sustainability and rapid adoption of new features.

bertybuttface commented 11 months ago

Yes, but we would then be responsible for handling conversation context ourselves, which is fine if someone is willing to write the code.
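To give a sense of scope, here is a minimal sketch of what handling context ourselves could look like: an in-memory history per Matrix room that is replayed on every request. `ConversationStore` and its methods are hypothetical names. Trimming by message count is a simplification, since a real implementation would more likely trim by token count and persist the history.

```typescript
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Hypothetical in-memory store: one message history per Matrix room ID.
class ConversationStore {
  private readonly rooms = new Map<string, ChatMessage[]>();
  private readonly systemPrompt: string;
  private readonly maxMessages: number;

  constructor(systemPrompt: string, maxMessages: number) {
    this.systemPrompt = systemPrompt;
    this.maxMessages = maxMessages;
  }

  // Record a message and drop the oldest ones beyond the limit.
  append(roomId: string, message: ChatMessage): void {
    const history = this.rooms.get(roomId) ?? [];
    history.push(message);
    while (history.length > this.maxMessages) {
      history.shift();
    }
    this.rooms.set(roomId, history);
  }

  // Messages to send with the next request: system prompt, then history.
  buildMessages(roomId: string): ChatMessage[] {
    const history = this.rooms.get(roomId) ?? [];
    return [{ role: "system", content: this.systemPrompt }, ...history];
  }
}

const store = new ConversationStore("You are a helpful assistant.", 2);
const roomId = "!room:example.org";
store.append(roomId, { role: "user", content: "Hi" });
store.append(roomId, { role: "assistant", content: "Hello!" });
store.append(roomId, { role: "user", content: "What did I just say?" });
console.log(store.buildMessages(roomId));
```

The array returned by `buildMessages` is what would be sent as `messages` on each request, with the model's reply appended afterwards as an `assistant` message.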