microsoft / sample-app-aoai-chatGPT

Sample code for a simple web chat experience through Azure OpenAI, including Azure OpenAI On Your Data.
MIT License
1.43k stars 2.14k forks

Can we handle images in PDFs? #883

Open shikha1970 opened 1 month ago

shikha1970 commented 1 month ago

Is your feature request related to this sample app, or to an Azure service, such as Azure OpenAI or Azure AI Search?
It is related to this sample app.

Is your feature request related to a problem? Please describe.
If a PDF has images, I would want that image content to be vectorized as well.

Describe the solution you'd like
If a PDF contains images, I would want each image to be extracted as a separate chunk. A description of the image would be retrieved using GPT-4o, and that description can be vectorized in the normal way. We want to modify the prompt so that the response includes a placeholder for the image whenever it draws on an image-type knowledge base chunk. When displaying the final response, we put the image in place of the placeholder.

Is this feature specific to your use case or your organization, or would it apply broadly across other uses of this app?
This will make our app multimodal. We can also consider that GPT-4o will have the ability to create images, so we should think about how we would want to display them.
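
To make the request concrete, here is a minimal Python sketch of the flow described above: one chunk per image, a text description that gets embedded like any other chunk, and a placeholder that is swapped for the image at display time. Nothing here is part of the sample app. `ImageChunk`, `build_image_chunks`, `chunk_text_for_index`, `render_answer`, and the `[[image:ID]]` placeholder syntax are all hypothetical names chosen for illustration. The `describe` callback stands in for a GPT-4o call and `url_for` for wherever the extracted image is stored. PDF image extraction and embedding are left out on purpose.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical placeholder syntax that the prompt would ask the model to emit.
PLACEHOLDER = re.compile(r"\[\[image:([A-Za-z0-9_-]+)\]\]")


@dataclass
class ImageChunk:
    image_id: str
    url: str          # where the extracted image is served from (assumption)
    description: str  # text description; this is what would be vectorized


def build_image_chunks(
    images: Dict[str, bytes],
    describe: Callable[[bytes], str],
    url_for: Callable[[str], str],
) -> List[ImageChunk]:
    """Create one chunk per extracted image.

    `describe` stands in for a GPT-4o image-description call and
    `url_for` for the storage location; both are supplied by the caller.
    """
    return [
        ImageChunk(image_id=image_id, url=url_for(image_id), description=describe(data))
        for image_id, data in images.items()
    ]


def chunk_text_for_index(chunk: ImageChunk) -> str:
    """Text to embed and index: the description plus the placeholder,
    so the model can copy the placeholder into its answer."""
    return f"{chunk.description}\n[[image:{chunk.image_id}]]"


def render_answer(answer: str, chunks: List[ImageChunk]) -> str:
    """Replace known placeholders with Markdown image tags.
    Unknown placeholders are left untouched."""
    by_id = {c.image_id: c for c in chunks}

    def swap(match: re.Match) -> str:
        chunk = by_id.get(match.group(1))
        if chunk is None:
            return match.group(0)
        return f"![{chunk.description}]({chunk.url})"

    return PLACEHOLDER.sub(swap, answer)
```

With this shape, the indexing side only ever sees text, so the existing vectorization path would not need to change; only the final rendering step has to know about images.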
