Closed nicolaerusan closed 10 months ago
Is anyone working on this? It shouldn't be that hard, just a modification of the types; the output stream is the same.
Edit: for those searching for a solution, try: // @ts-ignore
I'm definitely interested in this as well. I use a combination of the AI SDK and the OpenAI SDKs and it works well, but things are changing fast and I think it's better to have the community work on a single tool (AI SDK?) to make everything easier to work with. I hope Vercel can allocate more resources to this, as it is definitely the future of app development.
We’re actively investigating supporting vision and the other new features.
> it shouldn't be that hard, just a modification of the types
I’d rather introduce a breaking change when we have more to offer than just a type change. There’s likely more involved changes we can make to improve the developer experience.
Thanks to #725 by @lgrammel, you can now send images in the data field in react/use-chat in order to interact with the vision API. See this example: https://github.com/vercel/ai/blob/main/examples/next-openai/app/api/chat-with-vision/route.ts. Docs will be coming soon.
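For readers who want the shape of that approach without opening the link, here is a minimal sketch of how a route handler might fold an image from the data field into the last user message. It is not the code from the linked example: the `attachImage` helper and the `imageUrl` key are hypothetical names chosen for illustration, and the content-part shapes follow the OpenAI vision message format quoted elsewhere in this thread.

```typescript
// Content parts in the OpenAI vision message format.
type TextPart = { type: "text"; text: string };
type ImagePart = { type: "image_url"; image_url: { url: string } };
type ContentPart = TextPart | ImagePart;

type Role = "system" | "user" | "assistant";

// Plain chat message as sent by the client (text only).
interface ChatMessage {
  role: Role;
  content: string;
}

// Message whose content may also be a list of text/image parts.
interface VisionMessage {
  role: Role;
  content: string | ContentPart[];
}

// Hypothetical helper: `imageUrl` is an assumed key in the `data` payload.
// Returns the messages unchanged when there is no image or the last
// message is not from the user.
function attachImage(
  messages: ChatMessage[],
  data?: { imageUrl?: string },
): VisionMessage[] {
  const last = messages[messages.length - 1];
  if (!last || last.role !== "user" || !data?.imageUrl) {
    return messages;
  }
  return [
    ...messages.slice(0, -1),
    {
      ...last,
      content: [
        { type: "text", text: last.content },
        { type: "image_url", image_url: { url: data.imageUrl } },
      ],
    },
  ];
}
```

The result could then be passed as the `messages` array of a chat completion request to a vision-capable model; the earlier messages stay as plain strings, so only the latest turn carries the image.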
@nicolaerusan, great job; here's an example for this feature: https://github.com/marcusschiesser/ai-chatbot - would be great if you could also add images to the messages; see https://github.com/vercel/ai/issues/768
Hey @MaxLeiter, do you know if keeping the image as part of the returned message is being worked on, as @marcusschiesser asked above? It would be super helpful to keep the image in the message array, especially since more and more multimodal models are going to come out in the coming months. @lgrammel
Technically once the completion of the new message is done, we could do something like this inside the onFinish callback. Only issue is the wrong typing...
```ts
setMessages([
  ...messages,
  {
    ...messages[messages.length - 1],
    content: [
      { type: "text", text: "What’s in this image?" },
      {
        type: "image_url",
        image_url: {
          url: "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
        },
      },
    ],
  },
])
```
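One way around the typing problem, short of `// @ts-ignore`, is to keep a locally widened message type in application code. The sketch below is an assumption about how that could look, not part of the SDK: `MultimodalMessage` and `splitContent` are hypothetical names, and the helper just separates text from image URLs so a UI can render both.

```typescript
// Content parts in the OpenAI vision message format used above.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

// Hypothetical widened message type: content is either a plain string
// or a list of text/image parts.
interface MultimodalMessage {
  id: string;
  role: "system" | "user" | "assistant";
  content: string | ContentPart[];
}

// Splits a message into its text and its image URLs for rendering.
function splitContent(message: MultimodalMessage): {
  text: string;
  imageUrls: string[];
} {
  if (typeof message.content === "string") {
    return { text: message.content, imageUrls: [] };
  }
  const texts: string[] = [];
  const imageUrls: string[] = [];
  for (const part of message.content) {
    if (part.type === "text") {
      texts.push(part.text);
    } else {
      imageUrls.push(part.image_url.url);
    }
  }
  // Multiple text parts are joined with newlines.
  return { text: texts.join("\n"), imageUrls };
}
```

Because `content` is a discriminated union, TypeScript narrows each part inside the loop, so no casts are needed when reading `text` or `image_url`.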
@marclelamy the code generated by create-llama now supports displaying sent images in the messages.
Does Vercel offer OpenAI vision integration?
@taylor-lindores-reeves check out multimodal prompts: https://sdk.vercel.ai/docs/foundations/prompts#multi-modal-messages
Feature Description
GPT vision just came out. We're building a chat experience in which we would like to use these features, and ideally we could use this SDK for it.
https://platform.openai.com/docs/guides/vision
Related to this, would be great to get assistants in the SDK too, but that could be a separate issue :) https://platform.openai.com/docs/assistants/overview
Use Case
Be able to select 3-4 images and send a message including those images
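As a rough illustration of this use case, the sketch below builds one user message content array from a text prompt and several image URLs, in the OpenAI vision format. `buildVisionContent` is a hypothetical helper, and the default cap of four images is an assumption taken from the "3-4 images" wording, not a limit of the API.

```typescript
// Content parts in the OpenAI vision message format.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

// Hypothetical helper: one text part followed by one part per image.
// `maxImages` defaults to 4 as an illustrative cap only.
function buildVisionContent(
  text: string,
  imageUrls: string[],
  maxImages = 4,
): ContentPart[] {
  if (imageUrls.length > maxImages) {
    throw new Error(
      `Expected at most ${maxImages} images, got ${imageUrls.length}`,
    );
  }
  const images = imageUrls.map(
    (url): ContentPart => ({ type: "image_url", image_url: { url } }),
  );
  return [{ type: "text", text }, ...images];
}
```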
Additional context
No response