A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.
If I use the GPT-4o model, is it capable of receiving input as an image? And if there is text in the image, could it be capable of extracting the text inside the picture? #1681
As the topic said, If I use gpt-4o, could it be possible to read the input image from user and generate the response based on the user input?
This issue is for a: (mark with an x)
- [ ] bug report -> please search issues before submitting
- [x ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)
Minimal steps to reproduce
None
Any log messages given by the failure
Expected/desired behavior
OS and Version?
Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)
Minimal steps to reproduce
Any log messages given by the failure
Expected/desired behavior
OS and Version?
azd version?
Versions
Mention any other details that might be useful