Textualization / the-ragged-edge-box

RAGGED EDGE BOX: Your Personal AI-Powered Document Search System

Answer extraction without LLMs #25

Open DrDub opened 3 months ago

DrDub commented 3 months ago

While this project uses state-of-the-art embeddings and local LLMs, answer extraction is too slow.

It might be possible to replace it with an answer extraction system based on a recurrent-style sequence model (such as Mamba, a selective state space model) trained on the output of a local LLM.

This is a research endeavour.

damian-barsotti commented 2 months ago

Hi. You propose training Mamba on the output of the local LLM. What would the input be for this training?

DrDub commented 2 months ago

The input could be a document set representative of the overall documents being used with RAGged Edge Box. The queries can be generated from the documents themselves.
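
A minimal sketch of how such a training set might be assembled, under stated assumptions: each document is split into passages, a query is generated from each passage, and the local LLM's answer becomes the target for the smaller model. Every name here (`DistillationExample`, `split_into_passages`, `build_dataset`, and the two `toy_*` functions) is hypothetical and not part of RAGged Edge Box. The toy functions only stand in for calls to the local LLM so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class DistillationExample:
    query: str    # question generated from the passage
    context: str  # passage the answer should be extracted from
    answer: str   # teacher (local LLM) output, used as the training target


def split_into_passages(document: str, max_words: int = 100) -> List[str]:
    """Split a document into consecutive passages of at most max_words words."""
    words = document.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def build_dataset(
    documents: Iterable[str],
    generate_query: Callable[[str], str],
    extract_answer: Callable[[str, str], str],
    max_words: int = 100,
) -> List[DistillationExample]:
    """Build (query, context, answer) triples from a representative document set."""
    examples = []
    for document in documents:
        for passage in split_into_passages(document, max_words):
            query = generate_query(passage)          # query derived from the document itself
            answer = extract_answer(query, passage)  # teacher output to imitate
            examples.append(DistillationExample(query, passage, answer))
    return examples


# Hypothetical stand-ins for the local LLM (assumption: a real setup would
# prompt the LLM here instead of using string heuristics).
def toy_generate_query(passage: str) -> str:
    return f"What does the passage say about {passage.split()[0]}?"


def toy_extract_answer(query: str, passage: str) -> str:
    return passage.split(".")[0].strip()


if __name__ == "__main__":
    docs = ["Mamba is a sequence model. It is fast."]
    for example in build_dataset(docs, toy_generate_query, toy_extract_answer):
        print(example)
```

The resulting triples would then serve as supervised training data for the smaller model, with `query` and `context` as input and `answer` as the target.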