Azure-Samples / azure-search-openai-demo

A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.
https://azure.microsoft.com/products/search
MIT License

Extract text from images embedded in pdfs #399

Open sivi3883 opened 1 year ago

sivi3883 commented 1 year ago

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

First of all, thanks for the awesome demo code. It worked like a charm on the first attempt, and the instructions are super clear even though I am new to Azure and CLI commands. I am trying to extract text from the images embedded in PDFs. I believe I could add an OCR skillset to the search service and rerun the indexing.

I'm looking for your recommendations on how to add this in prepdocs.py. Is there a way to extract the text from images (as well as from the supplied PDFs) through Form Recognizer?
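To make the skillset idea concrete, here is a rough sketch of what an OCR skillset definition for the search service might look like, written as plain Python dictionaries in the REST JSON shape. This is not how prepdocs.py works; it assumes an indexer-based pipeline, and the function names, the skillset name, and the `merged_text` target name are hypothetical placeholders chosen for illustration. The skill types (`OcrSkill`, `MergeSkill`) and the `imageAction` indexer setting are the ones documented for Azure AI Search.

```python
import json


def build_ocr_skillset(name: str, language: str = "en") -> dict:
    """Build a skillset definition that OCRs images embedded in documents.

    Hypothetical helper: the return value mirrors the REST JSON body
    for creating a skillset; nothing is sent to Azure here.
    """
    # Runs OCR on every image the indexer extracted from the document.
    ocr_skill = {
        "@odata.type": "#Microsoft.Skills.Vision.OcrSkill",
        "context": "/document/normalized_images/*",
        "defaultLanguageCode": language,
        "detectOrientation": True,
        "inputs": [{"name": "image", "source": "/document/normalized_images/*"}],
        "outputs": [{"name": "text", "targetName": "text"}],
    }
    # Merges the OCR text back into the document text at each image's offset.
    merge_skill = {
        "@odata.type": "#Microsoft.Skills.Text.MergeSkill",
        "context": "/document",
        "insertPreTag": " ",
        "insertPostTag": " ",
        "inputs": [
            {"name": "text", "source": "/document/content"},
            {"name": "itemsToInsert", "source": "/document/normalized_images/*/text"},
            {"name": "offsets", "source": "/document/normalized_images/*/contentOffset"},
        ],
        # "merged_text" is an assumed name; map it to an index field as needed.
        "outputs": [{"name": "mergedText", "targetName": "merged_text"}],
    }
    return {
        "name": name,
        "description": "OCR text from images embedded in PDFs (illustrative sketch)",
        "skills": [ocr_skill, merge_skill],
    }


def build_indexer_parameters() -> dict:
    """Indexer settings that make embedded images available to the OCR skill."""
    return {
        "configuration": {
            "dataToExtract": "contentAndMetadata",
            "imageAction": "generateNormalizedImages",
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_ocr_skillset("pdf-ocr-skillset"), indent=2))
    print(json.dumps(build_indexer_parameters(), indent=2))
```

The resulting JSON would still need to be sent to the search service and attached to an indexer over the blob container, which is a different ingestion path from the one prepdocs.py uses.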

Any log messages given by the failure

Expected/desired behavior

Use the backend chat application to search the text embedded in the images.

OS and Version?

Mac OS Monterey

azd version?

run azd version and copy paste here.

Versions

azd version 1.0.2

Mention any other details that might be useful


Thanks! We'll be in touch soon.

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this issue will be closed.

pamelafox commented 1 year ago

This is still a valid feature request and I have heard it a few times. I have not looked into this myself, however. Leaving it open.

KellyChesco commented 1 year ago

Thanks for leaving this open as I just received a request for something like this, and apparently the images are microfiche scans in PDFs to further complicate things.

github-actions[bot] commented 9 months ago

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this issue will be closed.

egor-yudkin commented 3 months ago

I think the current DI (Document Intelligence) chunking process already does that. I can see text from embedded images in the index of the app we are building. @pamelafox, is this something you can confirm?