microsoft / autogen

A programming framework for agentic AI 🤖
https://microsoft.github.io/autogen/
Creative Commons Attribution 4.0 International

[Issue]: Agentic Loop over a large document #2542

Closed ismailsimsek closed 1 month ago

ismailsimsek commented 5 months ago

Describe the issue

Is it possible to read a large PDF document in chunks using agents, without a programmatic loop?

Could it be done using task decomposition? Has anyone done something similar?

Steps to reproduce

Something like the following:

  1. An agent uses a tool to read the PDF document 50 pages at a time (300 pages in total)
  2. The agents summarize all the chunks into a 6-page output
  3. The summary is written to a file
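
The three steps above could be sketched roughly as follows. This is a minimal sketch in plain Python, not AutoGen code: `read_pages` and `summarize` are hypothetical stand-ins for a PDF-reading tool and an LLM-backed agent call, and the `summary.txt` output path is made up for illustration.

```python
from __future__ import annotations

from pathlib import Path
from typing import Callable, Iterator


def page_ranges(total_pages: int, chunk_size: int) -> Iterator[tuple[int, int]]:
    """Yield 1-based inclusive (start, end) page ranges covering the document."""
    for start in range(1, total_pages + 1, chunk_size):
        yield start, min(start + chunk_size - 1, total_pages)


def summarize_document(
    total_pages: int,
    chunk_size: int,
    read_pages: Callable[[int, int], str],
    summarize: Callable[[str], str],
) -> str:
    """Summarize each chunk, then summarize the combined chunk summaries."""
    partial = [
        summarize(read_pages(start, end))
        for start, end in page_ranges(total_pages, chunk_size)
    ]
    return summarize("\n\n".join(partial))


# Hypothetical stand-ins so the sketch runs without a PDF library or an LLM.
def fake_read_pages(start: int, end: int) -> str:
    return f"text of pages {start}-{end}"


def fake_summarize(text: str) -> str:
    return text[:40]  # placeholder: a real version would call an agent or model


if __name__ == "__main__":
    summary = summarize_document(300, 50, fake_read_pages, fake_summarize)
    Path("summary.txt").write_text(summary, encoding="utf-8")
```

The loop here is still programmatic; whether the same control flow can be delegated entirely to agents is the open question of this issue.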

Screenshots and logs

No response

Additional Information

No response

WaelKarkoub commented 5 months ago

@ismailsimsek are you asking for an OCR capability or a RAG capability? If OCR, I believe it's planned in the multimodality roadmap: https://github.com/microsoft/autogen/issues/1975

ismailsimsek commented 5 months ago

I'm currently trying to get it to work with RAG (RetrieveUserProxyAgent + GroupChatManager).

I'd appreciate it if anyone could point to similar solutions.

Current code is here: https://github.com/ismailsimsek/aistorybooks/blob/story-book/classic_storiesv2.py (PR: https://github.com/ismailsimsek/aistorybooks/pull/3)

For now I'm just trying to summarize the PDF; later on I plan to add image generation too.

thinkall commented 4 months ago

The current RetrieveUserProxyAgent should support PDF files. Have you tried it?

ismailsimsek commented 4 months ago

@thinkall I will check it. What I am looking into is summarizing the PDF in small chunks, since it's too big. Is it possible to have the agents loop over the chunks and process them one by one?

thinkall commented 4 months ago

The agent will split the PDF into chunks and save them into a vector DB.
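
To illustrate what "split into chunks and save" means conceptually, here is a toy sketch in plain Python. It is not how RetrieveUserProxyAgent is implemented: the word-based splitter, the `ToyStore` class, and its word-overlap scoring are all assumptions made for illustration, standing in for a real tokenizer, embedding model, and vector DB.

```python
from __future__ import annotations


def split_into_chunks(text: str, max_words: int) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


class ToyStore:
    """Hypothetical in-memory stand-in for a vector DB."""

    def __init__(self) -> None:
        self.chunks: list[str] = []

    def add(self, chunks: list[str]) -> None:
        self.chunks.extend(chunks)

    def query(self, question: str, top_k: int = 1) -> list[str]:
        """Return the top_k chunks sharing the most words with the question."""
        q_words = set(question.lower().split())
        ranked = sorted(
            self.chunks,
            key=lambda chunk: len(q_words & set(chunk.lower().split())),
            reverse=True,
        )
        return ranked[:top_k]
```

Note that a store like this returns only the chunks most similar to a query, so on its own it does not visit every chunk in order.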

thinkall commented 1 month ago

Closing, as this issue has not been active for a long time. Please reopen if the issue still persists.