superagent-ai / reag

Reasoning Augmented Generation
MIT License
706 stars · 46 forks

Number the segments in your docs and prompt the LLM to output the numbers of the relevant segments instead of the actual text, to increase speed, reduce cost, and improve extraction quality #9

Open shreyas-shinde opened 1 week ago

shreyas-shinde commented 1 week ago

You can see how it's done here: https://www.reddit.com/r/MachineLearning/comments/17k6iha/d_relevance_extraction_in_rag_pipelines/?share_id=j0imtSJRhwS2Jz9gDYplj&utm_content=2&utm_medium=ios_app&utm_name=ioscss&utm_source=share&utm_term=1

shreyas-shinde commented 1 week ago

FYI: I have actually tried the numbered approach and saw cost and latency go down because of the reduction in output tokens, and I also found that the eval score of the downstream task (QA in my case) went up.

homanp commented 1 week ago

@shreyas-shinde feel free to contribute!

homanp commented 1 week ago

@shreyas-shinde So is the process to:

  1. Have the LLM segment the docs
  2. Have the LLM extract the relevant segments
shreyas-shinde commented 1 week ago

@homanp

  1. Segment the docs -> not done by an LLM, just some simple Python code. A reference algorithm: https://github.com/langroid/langroid/blob/main/langroid/parsing/utils.py#L135 . Maybe we can give users the option to use an LLM for this.
  2. The LLM extracts the relevant segment numbers given the context (this could be split across multiple parallel LLM calls, depending on the context length). Reference prompt: https://github.com/langroid/langroid/blob/main/langroid/agent/special/relevance_extractor_agent.py#L75
  3. Retrieve the segment text for the numbers from step 2 with a simple regex. Ref: https://github.com/langroid/langroid/blob/main/langroid/parsing/utils.py#L296
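The three steps above can be sketched in plain Python. This is a minimal illustration, not langroid's actual implementation: the `<#n#>` marker format, the sentence-splitting regex, and the function names are all assumptions made for the example; the LLM call itself is left out (you would send `build_prompt(...)` to your model and pass its reply to `extract_by_numbers`).

```python
import re


def number_segments(text: str) -> tuple[str, list[str]]:
    """Step 1: split text into sentence-level segments and number each one.

    Plain-Python segmentation, no LLM. The <#n#> marker format is an
    illustrative choice, not the exact format langroid uses.
    """
    segments = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    numbered = " ".join(f"<#{i}#> {seg}" for i, seg in enumerate(segments, start=1))
    return numbered, segments


def build_prompt(numbered_doc: str, query: str) -> str:
    """Step 2: ask the LLM for segment numbers only, cutting output tokens."""
    return (
        "Below is a passage whose segments are numbered like <#1#>.\n"
        f"Passage: {numbered_doc}\n\n"
        f"Question: {query}\n"
        "Reply ONLY with the comma-separated numbers of the relevant "
        "segments, e.g. '2,5', or 'NONE' if none are relevant."
    )


def extract_by_numbers(llm_reply: str, segments: list[str]) -> list[str]:
    """Step 3: map the numbers in the LLM's reply back to segment text."""
    numbers = [int(n) for n in re.findall(r"\d+", llm_reply)]
    return [segments[n - 1] for n in numbers if 1 <= n <= len(segments)]
```

Usage: `number_segments("Paris is in France. Berlin is in Germany.")` yields `"<#1#> Paris is in France. <#2#> Berlin is in Germany."` plus the segment list; if the model replies `"2"`, `extract_by_numbers("2", segments)` returns the second sentence. Since the model emits only a few digits instead of whole sentences, output tokens (and therefore latency and cost) drop, which matches the results reported above.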