stanford-oval / WikiChat

WikiChat is an improved RAG. It stops the hallucination of large language models by retrieving data from a corpus.
https://wikichat.genie.stanford.edu
Apache License 2.0
972 stars, 93 forks

Extending WikiChat dataset to arXiv + more #9

Closed: emrgnt-cmplxty closed this issue 8 months ago

emrgnt-cmplxty commented 8 months ago

I have been working to build an open source search engine for LLMs. It was the inability to control hallucinations in synthetic data creation that led me to want to work on this. I just stumbled upon this paper today and found the magnitude of the improvement quoted to be very impressive.

The dataset includes 1.6M fully embedded arXiv papers. I'm interested in using it to extend your WikiChat approach to include arXiv and more.

I'm posting here to see if there are any interested collaborators.

Best, Owen

trivikramak commented 8 months ago

It would be great to have a chat interface for scientific knowledge without the worry of hallucinations!

emrgnt-cmplxty commented 8 months ago

I'm pretty impressed with the responses after hooking up the AgentSearch API. I'm adding a video below.

User: "What are some of the most important advances in quantum field theory?"

Chatbot: Some recent developments in quantum field theory include exact results for anomalous dimensions of operators in $\mathcal{N}=4$ super Yang-Mills theories and the discovery of new symmetries in quantum field theories. There have also been advancements in non-perturbative quantum field theory, which have improved our understanding of non-perturbative phenomena in the quantum theory of fields and strings. However, there is still a lack of complete mathematical understanding and a need for practical tools to perform QFT calculations. Additionally, there have been ongoing developments in the use of quantum field theory in quantum computing and quantum information theory.

User: "What did A. Sen have to say about this recently?"

Chatbot: A. Sen has given talks and written papers discussing recent developments in quantum field theories, including exact results for anomalous dimensions of operators in $\mathcal{N}=4$ super Yang-Mills theories. His work has also contributed to our understanding of non-perturbative phenomena in the quantum theory of fields and strings.

https://github.com/stanford-oval/WikiChat/assets/68796651/e084e1c1-67ad-46b7-ba8c-8fa749a1a7c1