turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License

Incorporate RAG with Exllamav2 #387

Closed: insanesac closed this issue 3 months ago

insanesac commented 3 months ago

Hi,

I want to integrate RAG with exllamav2. I could not find anything relevant online, hence I'm asking here. I have a LangChain-based implementation for RAG which runs fine with hf and llama_cpp_python, but I want to use exllamav2 instead. Any suggestion or link is welcome.

johnwick123f commented 3 months ago

@insanesac You could either use this: https://github.com/langchain-ai/langchain/issues/8385#issuecomment-1953333714

Or implement your own RAG system, which might be better. But I suppose you strictly want to use LangChain, in which case the link above is the way to go.
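For anyone landing here later: the usual way to plug a new backend into LangChain is a custom `LLM` subclass that forwards the prompt to the backend's generator. Below is a minimal sketch of that idea using exllamav2's `ExLlamaV2BaseGenerator` API; the class name `ExLlamaV2LLM`, the helper `load_exllamav2`, and the sampling values are illustrative, not taken from the linked comment.

```python
from typing import Any, List, Optional

from langchain.llms.base import LLM

from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler


class ExLlamaV2LLM(LLM):
    """Minimal LangChain wrapper around an ExLlamaV2 generator (sketch)."""

    generator: Any = None       # ExLlamaV2BaseGenerator
    settings: Any = None        # ExLlamaV2Sampler.Settings
    max_new_tokens: int = 256

    @property
    def _llm_type(self) -> str:
        return "exllamav2"

    def _call(self, prompt: str, stop: Optional[List[str]] = None, **kwargs: Any) -> str:
        # generate_simple returns prompt + completion; stripping by length is
        # approximate (tokenization round-trips can shift a few characters).
        output = self.generator.generate_simple(prompt, self.settings, self.max_new_tokens)
        text = output[len(prompt):]
        if stop:
            for s in stop:
                text = text.split(s)[0]
        return text


def load_exllamav2(model_dir: str) -> ExLlamaV2LLM:
    """Load a local EXL2 model directory and wrap it for LangChain."""
    config = ExLlamaV2Config()
    config.model_dir = model_dir
    config.prepare()

    model = ExLlamaV2(config)
    cache = ExLlamaV2Cache(model, lazy=True)
    model.load_autosplit(cache)

    tokenizer = ExLlamaV2Tokenizer(config)
    generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

    settings = ExLlamaV2Sampler.Settings()
    settings.temperature = 0.7
    settings.top_p = 0.9

    return ExLlamaV2LLM(generator=generator, settings=settings)
```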

insanesac commented 3 months ago

@johnwick123f Well, I don't have any strict requirement to use LangChain; I just want to set up a RAG-based use case. The link above helped a lot and I was able to add RAG on top of it. Thanks a lot for the help, it was a lifesaver.
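For completeness, here is a sketch of how a wrapper like the one above slots into a stock LangChain retrieval chain. The embedding model and document texts are placeholders, and `load_exllamav2` is the illustrative helper from the earlier sketch; the point is that the exllamav2 backend only ever sees the final stuffed prompt, so the retrieval side is unchanged.

```python
from langchain.chains import RetrievalQA
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS

# Wrap exllamav2 as a LangChain LLM (helper from the sketch above).
llm = load_exllamav2("/path/to/exl2-model-dir")

# Any embedding model works here; it is independent of the generation backend.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
store = FAISS.from_texts(["doc one ...", "doc two ..."], embeddings)

# Standard stuff-the-context QA chain over the retriever.
qa = RetrievalQA.from_chain_type(llm=llm, retriever=store.as_retriever())
print(qa.run("What does doc one say?"))
```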