Open InAnYan opened 1 month ago
1) Stuffing: just put the whole contents of the document into the user message.
2) Map-reduce: summarize small chunks of text and combine these summaries into one.
3) Refine (not really sure about this one): summarize a small chunk of text, then give the AI this summary plus the next small chunk and ask it to refine the current summary, and repeat this recursively for the remaining chunks.
4) Abstract only: detect and embed just the "Abstract" in the PDF.
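To make the first three strategies concrete, here is a minimal Python sketch. It is not tied to any particular library: `llm` is a hypothetical callable that takes a prompt string and returns text, the prompt wording is made up for illustration, and `split_into_chunks` is a naive fixed-size character splitter (a real implementation would probably split on sentences or tokens). The refine step is written as a loop rather than literal recursion.

```python
from typing import Callable, List

# Hypothetical stand-in for an LLM call: takes a prompt, returns generated text.
LLM = Callable[[str], str]


def split_into_chunks(text: str, chunk_size: int) -> List[str]:
    """Naively split text into fixed-size character chunks."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def stuff(llm: LLM, document: str) -> str:
    """Stuffing: send the whole document in a single prompt."""
    return llm(f"Summarize the following document:\n{document}")


def map_reduce(llm: LLM, document: str, chunk_size: int) -> str:
    """Map-reduce: summarize each chunk, then combine the partial summaries."""
    partial_summaries = [
        llm(f"Summarize this passage:\n{chunk}")
        for chunk in split_into_chunks(document, chunk_size)
    ]
    joined = "\n".join(partial_summaries)
    return llm(f"Combine these summaries into one:\n{joined}")


def refine(llm: LLM, document: str, chunk_size: int) -> str:
    """Refine: carry a running summary forward and update it chunk by chunk."""
    summary = ""
    for chunk in split_into_chunks(document, chunk_size):
        if not summary:
            # First chunk: there is no summary to refine yet.
            summary = llm(f"Summarize this passage:\n{chunk}")
        else:
            summary = llm(
                f"Current summary:\n{summary}\n\n"
                f"New passage:\n{chunk}\n\n"
                "Refine the current summary using the new passage."
            )
    return summary
```

For a document that splits into N chunks, stuffing makes one call, map-reduce makes N + 1 calls (the N chunk summaries are independent, so they could run in parallel), and refine makes N sequential calls.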
I do not think this is a low-priority issue. It is a normal- to high-priority issue, as summarization is the main goal of this GSoC project. It may just not be important in a particular week. So, instead of using the "priority" labels to decide whether to give it attention right now, use the milestone table to check whether it needs attention this week. Otherwise you will eventually have to keep changing the labels on issues and PRs, assigning them high, normal, or low priority depending on the day, week, month, or year you are in.
Ah, I forgot, sorry, yes.
I used `low-priority` to filter out issues for Week 1, but then I realized there are Milestones on GitHub, so we don't need `low-priority`.
It is OK to differentiate between low, normal, and high priority. The more issues there are, the more necessary it is to prioritize.
https://github.com/GoogleCloudPlatform/generative-ai/blob/main/language/use-cases/document-summarization/summarization_large_documents_langchain.ipynb