TanGentleman / Augmenta

Automate RAG-powered workflows
MIT License

Implement neighboring-chunk context additions #17

Open TanGentleman opened 7 months ago

TanGentleman commented 7 months ago

This is helpful particularly in a use case involving a transcript with small chunk sizes.

Goal: User query/description -> appropriate context from movie script

  1. Chunk the document using chunk_size=256, chunk_overlap=50
  2. User query is "What happens after Johnny hands the knife to the killer?"
  3. k_excerpts = 1
  4. An LLM chain designed for eval checks whether the excerpt is relevant; if not, reply "not found from context" (a sketch of this gate follows the list).
  5. If it is relevant, grab the nearest chunk(s) before and after the excerpt for additional context and append them to the context for the RAG LLM call.
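A minimal sketch of the eval gate in step 4, assuming a LangChain-style chat model and recent `langchain_core` imports; the prompt wording and the `is_relevant` helper are illustrative, not existing code in this repo:

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Hypothetical relevance-check prompt; exact wording is an assumption.
EVAL_PROMPT = ChatPromptTemplate.from_template(
    "Excerpt:\n{excerpt}\n\n"
    "Question: {query}\n\n"
    "Does the excerpt help answer the question? Reply YES or NO."
)

def is_relevant(llm, excerpt: str, query: str) -> bool:
    """Return True if the eval LLM judges the excerpt relevant to the query."""
    chain = EVAL_PROMPT | llm | StrOutputParser()
    verdict = chain.invoke({"excerpt": excerpt, "query": query})
    return verdict.strip().upper().startswith("YES")
```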

The neighboring chunks will require a new metadata entry (e.g. "chunk" or "index") with a simple integer value. Then I should be able to set search_kwargs["filter"] with a check on the chunk index to pull the surrounding context of the PDF from the vectorstore.
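A sketch of what that metadata entry and the filter lookup could look like, assuming a LangChain text splitter and a Chroma-style vectorstore; the "index" key, the helper names, and the exact filter syntax are assumptions and vary by store:

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.documents import Document

def chunk_with_indices(text: str) -> list[Document]:
    """Split the transcript and tag each chunk with its position in the document."""
    splitter = RecursiveCharacterTextSplitter(chunk_size=256, chunk_overlap=50)
    chunks = splitter.split_text(text)
    return [
        Document(page_content=chunk, metadata={"index": i})
        for i, chunk in enumerate(chunks)
    ]

def get_neighbors(vectorstore, index: int, query: str) -> list[Document]:
    """Pull the chunks immediately before and after a given index.
    Filter syntax shown is Chroma-style; other stores differ."""
    neighbors = []
    for offset in (-1, 1):
        hits = vectorstore.similarity_search(
            query, k=1, filter={"index": index + offset}
        )
        neighbors.extend(hits)
    return neighbors
```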

TanGentleman commented 7 months ago

Seems like I can approach this with a separate chain alongside the typical RAG chain. This is what I'm thinking:

retriever + LLM + query

-> retriever.get_relevant_documents with search_kwargs key "k" = 1 to get a single excerpt.
-> Call an evaluation chain on this excerpt to see if it is relevant to the query.
-> If it fails: (1) try again, or (2) abort the process.
-> If eval is successful, grab the index integer from the metadata of the excerpt Document.
-> search_kwargs key "filter" = {"index": index - 1} (or +1, or both, likely depends on the particular case).
-> Consolidate these excerpts into a List[Document] that goes into format_docs in the chain as usual.
-> These excerpts have stricter relevance and are passed as context to give a high-quality user-facing response.
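Putting that flow together, a rough sketch under the same assumptions as above; `is_relevant` and `get_neighbors` are the hypothetical helpers sketched earlier, and `format_docs` is the usual join-page-contents step in the RAG chain:

```python
from langchain_core.documents import Document

def retrieve_with_neighbors(vectorstore, llm, query: str) -> list[Document]:
    """k=1 retrieval, eval gate, then neighbor expansion via the index filter."""
    retriever = vectorstore.as_retriever(search_kwargs={"k": 1})
    excerpts = retriever.get_relevant_documents(query)
    if not excerpts:
        return []
    excerpt = excerpts[0]

    # Eval gate: abort (or retry upstream) if the single excerpt isn't relevant.
    if not is_relevant(llm, excerpt.page_content, query):
        return []

    # Expand with the chunks on either side, then restore transcript order
    # so format_docs produces a coherent passage.
    index = excerpt.metadata["index"]
    docs = get_neighbors(vectorstore, index, query) + [excerpt]
    docs.sort(key=lambda d: d.metadata["index"])
    return docs
```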

Implementation will likely mean writing a few of my own functions that define the search_kwargs dict and call retriever.get_relevant_documents. Before I build it too specifically, I want to find examples (like movie transcripts) where the method significantly improves performance.