mlc-ai / web-llm-chat

Chat with AI large language models running natively in your browser. Enjoy private, server-free, seamless AI conversations.
https://chat.webllm.ai/

[Feature Request]: RAG #52

Open scorpfromhell opened 1 month ago

scorpfromhell commented 1 month ago

Problem Description

If question-answer pairs, FAQs, or documentation are made available in localStorage or IndexedDB, they can be used for retrieval-augmented generation (RAG).

Solution Description

Currently, chat responses are based only on the data the model saw during pre-training. That data may be outdated or insufficient; RAG can address both gaps.

Content can either be stored locally or fetched from search engines or specified sites using tool calling (a sketch of a fetch tool follows).
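
A minimal sketch of what such a tool could look like, assuming an OpenAI-style function-calling format; the `fetch_page` name, its schema, and its wiring into web-llm-chat are all hypothetical:

```ts
// Hypothetical tool schema in the OpenAI function-calling format;
// the name "fetch_page" and its integration into web-llm-chat are assumptions.
const fetchPageTool = {
  type: "function" as const,
  function: {
    name: "fetch_page",
    description: "Fetch the plain text of a web page to ground an answer",
    parameters: {
      type: "object",
      properties: {
        url: { type: "string", description: "Absolute URL to fetch" },
      },
      required: ["url"],
    },
  },
};

// Runs in the browser when the model emits a fetch_page tool call.
// Cross-origin requests are subject to CORS; a proxy may be needed.
async function fetchPage(url: string): Promise<string> {
  const res = await fetch(url);
  if (!res.ok) throw new Error(`fetch_page failed: ${res.status}`);
  return res.text();
}
```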

Content can be stored locally in one of the following (a minimal storage sketch follows the list):

  1. localStorage, if the curated question-answer pairs are small in number
  2. IndexedDB, if the curated question-answer pairs are too numerous to fit in localStorage
  3. Voy, a WASM-based vector database, for content that has been embedded
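
A minimal sketch of the first two options, assuming a simple `QAPair` shape; the interface, key names, and database name are illustrative, not part of the codebase:

```ts
// Illustrative record shape; not part of web-llm-chat.
interface QAPair {
  id: string;
  question: string;
  answer: string;
}

// Option 1: localStorage for a small curated set (roughly 5 MB quota).
function saveSmallCorpus(pairs: QAPair[]): void {
  localStorage.setItem("rag-qa-pairs", JSON.stringify(pairs));
}

// Option 2: IndexedDB for corpora that exceed the localStorage quota.
function openCorpusDb(): Promise<IDBDatabase> {
  return new Promise((resolve, reject) => {
    const req = indexedDB.open("rag-corpus", 1);
    req.onupgradeneeded = () =>
      req.result.createObjectStore("qaPairs", { keyPath: "id" });
    req.onsuccess = () => resolve(req.result);
    req.onerror = () => reject(req.error);
  });
}

async function saveLargeCorpus(pairs: QAPair[]): Promise<void> {
  const db = await openCorpusDb();
  const tx = db.transaction("qaPairs", "readwrite");
  const store = tx.objectStore("qaPairs");
  for (const pair of pairs) store.put(pair);
  await new Promise<void>((resolve, reject) => {
    tx.oncomplete = () => resolve();
    tx.onerror = () => reject(tx.error);
  });
}
```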

The content can either be uploaded from local files or synced via a REST API (configurable in settings or via a button next to the prompts).
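
Sync could then be a thin wrapper over fetch. The endpoint and response shape below are assumptions, and `saveLargeCorpus` is the IndexedDB helper from the storage sketch above:

```ts
// Hypothetical sync: pull the latest curated pairs from a user-configured
// endpoint and persist them locally. Endpoint and payload shape are assumed.
async function syncCorpus(endpoint: string): Promise<void> {
  const res = await fetch(endpoint, {
    headers: { Accept: "application/json" },
  });
  if (!res.ok) throw new Error(`sync failed: ${res.status}`);
  const pairs: QAPair[] = await res.json(); // assumed: an array of QAPair
  await saveLargeCorpus(pairs);
}
```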

Retrieval can be done with elasticlunr.js for plain-text keyword search and with Transformers.js for semantic search over embeddings.
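
A sketch of both retrieval paths, reusing the hypothetical `QAPair` interface from above. The elasticlunr.js and Transformers.js calls follow those libraries' documented APIs; the brute-force cosine scan stands in for Voy, whose WASM index would replace the dot-product loop at scale:

```ts
import elasticlunr from "elasticlunr";
import { pipeline } from "@xenova/transformers";

// --- Keyword retrieval (plain text) with elasticlunr.js ---
function buildKeywordIndex(pairs: QAPair[]) {
  const index = elasticlunr<QAPair>(function () {
    this.setRef("id");
    this.addField("question");
    this.addField("answer");
  });
  pairs.forEach((p) => index.addDoc(p));
  return index;
}
// index.search(query, { fields: { question: { boost: 2 }, answer: { boost: 1 } } })
// returns [{ ref, score }, ...] sorted by relevance.

// --- Semantic retrieval with Transformers.js embeddings ---
const embedder = await pipeline(
  "feature-extraction",
  "Xenova/all-MiniLM-L6-v2" // small sentence-embedding model; runs in-browser
);

async function embed(text: string): Promise<number[]> {
  const out = await embedder(text, { pooling: "mean", normalize: true });
  return Array.from(out.data as Float32Array);
}

// Vectors are L2-normalized, so the dot product equals cosine similarity.
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, v, i) => sum + v * b[i], 0);
}

async function semanticSearch(
  query: string,
  corpus: { pair: QAPair; vec: number[] }[],
  k = 3
): Promise<{ pair: QAPair; score: number }[]> {
  const q = await embed(query);
  return corpus
    .map((c) => ({ pair: c.pair, score: dot(q, c.vec) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```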

Alternatives Considered

Something similar has been done in https://github.com/jacoblee93/fully-local-pdf-chatbot

But it does not allow:

  1. Persisting the content on which RAG is to be performed
  2. Synchronising locally stored content via a REST API
  3. Retrieving content from the Internet (tool calling)

Additional Context

No response