CsabaConsulting / InspectorGadgetApp

Open Multi-Modal Personal Assistant
MIT License

RAG: deal with semi duplicate retrievals #18

Open MrCsabaToth opened 2 months ago

MrCsabaToth commented 2 months ago

Let's say I ask about the weather every day. The RAG will then retrieve all the earlier weather requests even though they contribute nothing to the current day's query; worse, they may push out and suppress other useful retrievals.

  1. We could compute a classic string similarity (Jaro-Winkler or edit distance) and drop hits that are too similar
  2. Exclude older hits by date
  3. Unfortunately, neither option fixes the main ANN retrieval query itself: the semi-duplicates can still push more useful hits out of the scope
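
Options 1 and 2 could be sketched roughly as below. This is only a minimal illustration, not code from the app: the `Hit` class, its fields, the `filter_hits` function, and the thresholds (`max_age`, `max_similarity`, `top_k`) are all hypothetical names and values picked for the example. It uses a normalized edit distance (Levenshtein); Jaro-Winkler could be swapped in for `similarity`.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Hit:
    """One retrieved entry (hypothetical shape, not the app's real model)."""
    text: str          # the stored request text
    created: datetime  # when the entry was stored
    score: float       # ANN similarity score, higher is better


def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance, computed with two rolling rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Edit distance normalized to [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def filter_hits(hits, now, max_age=timedelta(days=7),
                max_similarity=0.8, top_k=3):
    """Drop stale hits (option 2), then near-duplicates (option 1)."""
    fresh = [h for h in hits if now - h.created <= max_age]
    kept = []
    # Visit the best-scoring hits first so the strongest of each
    # near-duplicate cluster is the one that survives.
    for hit in sorted(fresh, key=lambda h: h.score, reverse=True):
        if all(similarity(hit.text.lower(), k.text.lower()) < max_similarity
               for k in kept):
            kept.append(hit)
    return kept[:top_k]
```

Because this runs after retrieval, it shares the limitation from point 3: the duplicates have already taken up slots in the ANN result. One possible workaround (an assumption, not something proposed above) would be to request more candidates than `top_k` from the index and then trim with `filter_hits`.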