matrixgpt / matrix-chatgpt-bot

Talk to ChatGPT via any Matrix client!
GNU Affero General Public License v3.0

Support `gpt-4-vision-preview` #247

Open PaarthShah opened 1 year ago

PaarthShah commented 1 year ago

https://platform.openai.com/docs/guides/vision

It seems like uploading base64-encoded images may be a viable general strategy for passing images through the API.

Alternatively, for speed, in unencrypted rooms it may be possible (and preferable) to convert the image's mxc URL to an https URL and pass that via the image_url key.
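
To make the two options concrete, here is a rough sketch, not a tested implementation. `mxcToHttps` and `toDataUrl` are hypothetical helper names, and the download path assumes the homeserver serves media without authentication at `/_matrix/media/v3/download`; some homeservers may require authentication for media, in which case only the base64 route would work.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical helper: turn mxc://<server>/<mediaId> into a download URL.
// Assumes unauthenticated access to /_matrix/media/v3/download on the homeserver.
function mxcToHttps(mxcUrl: string, homeserverBaseUrl: string): string {
  const match = /^mxc:\/\/([^/]+)\/([^/?#]+)$/.exec(mxcUrl);
  if (!match) {
    throw new Error(`Not a valid mxc URL: ${mxcUrl}`);
  }
  const serverName = match[1] as string;
  const mediaId = match[2] as string;
  // Drop trailing slashes so the joined URL has exactly one separator.
  const base = homeserverBaseUrl.replace(/\/+$/, "");
  return `${base}/_matrix/media/v3/download/${encodeURIComponent(serverName)}/${encodeURIComponent(mediaId)}`;
}

// Hypothetical helper: wrap raw image bytes in a base64 data URL.
// This route also fits encrypted rooms, once the bot has decrypted the file.
function toDataUrl(bytes: Uint8Array, mimeType: string): string {
  const base64 = Buffer.from(bytes).toString("base64");
  return `data:${mimeType};base64,${base64}`;
}
```

Either return value is a string that could be placed in the `url` field under the image_url key.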

max298 commented 1 year ago

As far as I can tell, we're limited by the library we use for API communication, which does not yet support vision. I'm very interested, though, and will check what we can do as soon as the library adds support.

Dual-0 commented 1 year ago

I opened a request.

max298 commented 1 year ago

I think we might consider dropping the third-party SDK and switching to the official Node package from OpenAI: https://github.com/openai/openai-node#readme, which seems to support vision.
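
For reference, a minimal sketch of what a vision request body could look like, following the message format in the OpenAI vision guide. `buildVisionRequest` and the type names are hypothetical and only for illustration; the sketch builds a plain object so it runs without the package installed.

```typescript
// Content parts as described in the OpenAI vision guide.
type TextPart = { type: "text"; text: string };
type ImagePart = { type: "image_url"; image_url: { url: string } };
type VisionMessage = { role: "user"; content: Array<TextPart | ImagePart> };

interface VisionRequest {
  model: string;
  max_tokens: number;
  messages: VisionMessage[];
}

// Hypothetical helper: combine a text prompt and one image URL
// (https or base64 data URL) into a single user message.
function buildVisionRequest(prompt: string, imageUrl: string, maxTokens = 300): VisionRequest {
  return {
    model: "gpt-4-vision-preview",
    max_tokens: maxTokens,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          { type: "image_url", image_url: { url: imageUrl } },
        ],
      },
    ],
  };
}
```

With the official package, an object like this would presumably be passed to `client.chat.completions.create(...)` on an `OpenAI` client instance.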

PaarthShah commented 1 year ago

Going with the official Node library seems like the best option for long-term sustainability and rapid adoption of new features.

bertybuttface commented 1 year ago

Yes, but we would then be responsible for handling context, which is fine if someone is willing to write the code.
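
As a starting point, context handling could be sketched roughly like this: an in-memory history per conversation, trimmed to the most recent messages. `ConversationStore`, its methods, and the message-count limit are all assumptions for illustration; a real version would likely trim by token count and persist across restarts.

```typescript
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Hypothetical in-memory store keyed by conversation (e.g. a room or thread ID).
class ConversationStore {
  private readonly histories = new Map<string, ChatMessage[]>();
  private readonly systemPrompt: string;
  private readonly maxMessages: number;

  constructor(systemPrompt: string, maxMessages = 20) {
    if (maxMessages < 1) {
      throw new Error("maxMessages must be at least 1");
    }
    this.systemPrompt = systemPrompt;
    this.maxMessages = maxMessages;
  }

  // Record a message and keep only the newest maxMessages entries.
  append(conversationId: string, message: ChatMessage): void {
    const history = this.histories.get(conversationId) ?? [];
    history.push(message);
    this.histories.set(conversationId, history.slice(-this.maxMessages));
  }

  // Messages to send with the next request: system prompt first, then history.
  messagesFor(conversationId: string): ChatMessage[] {
    const history = this.histories.get(conversationId) ?? [];
    return [{ role: "system", content: this.systemPrompt }, ...history];
  }
}
```

The bot would call `append` for each user message and each reply, and send `messagesFor(...)` as the `messages` array on the next request.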