nasa-petal / bidara

BIDARA is a GPT-4 chatbot that was instructed to help scientists and engineers understand, learn from, and emulate the strategies used by living things to create sustainable designs and technologies using the Biomimicry Institute's step-by-step design process.
https://bit.ly/bidara-ai

retrieval augmented ChatGPT #2

Open bruffridge opened 1 year ago

bruffridge commented 1 year ago

Use retrieval methods (e.g. vector embedding similarity or BM25) to find information in journal articles that provides useful context and improves ChatGPT's outputs for the Discover and Abstract steps of BDA mode.
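
A minimal sketch of the embedding-similarity route, assuming the pre-1.0 `openai` package (reading `OPENAI_API_KEY` from the environment) and a few placeholder passages standing in for text extracted from journal articles:

```python
# Sketch: rank journal-article passages by embedding similarity to a query.
# Uses the pre-1.0 `openai` package; the passages are placeholders for text
# pulled from real journal articles.
import numpy as np
import openai  # reads OPENAI_API_KEY from the environment

PASSAGES = [
    "Shark skin microtopography inhibits bacterial attachment and biofouling.",
    "Namib desert beetle elytra harvest fog via hydrophilic bumps.",
    "Humpback whale flipper tubercles delay stall at high angles of attack.",
]

def embed(texts):
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=texts)
    return np.array([d["embedding"] for d in resp["data"]])

def top_k(query, k=2):
    doc_vecs = embed(PASSAGES)
    q_vec = embed([query])[0]
    # Cosine similarity between the query and each passage.
    sims = doc_vecs @ q_vec / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec))
    return [PASSAGES[i] for i in np.argsort(sims)[::-1][:k]]

print(top_k("surfaces that resist bacterial attachment"))
```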

Using existing API-based search capabilities is preferred over building one from scratch, provided they work well. Available search APIs are listed here: https://github.com/nasa-petal/PeTaL/wiki/BID-resources
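
As one illustration of leaning on an existing API, here is a hedged sketch against Semantic Scholar's public paper-search endpoint; the APIs actually listed on the BID-resources page may differ:

```python
# Sketch: query an existing literature search API rather than building retrieval
# from scratch. Semantic Scholar's public endpoint is used purely as an example;
# it may or may not be among the APIs listed on the BID-resources page.
import requests

def search_papers(query, limit=5):
    resp = requests.get(
        "https://api.semanticscholar.org/graph/v1/paper/search",
        params={"query": query, "limit": limit, "fields": "title,abstract,url"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("data", [])

for paper in search_papers("drag reduction shark skin denticles"):
    print(paper["title"], paper.get("url"))
```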

Brief description of retrieval augmentation: https://youtu.be/bZQun8Y4L2A?t=1974

Embeddings for all of the papers on arXiv.org: https://alex.macrocosm.so/download
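
If that dump is used, retrieval reduces to a nearest-neighbor lookup. A sketch under the assumption that the download has been preprocessed into a NumPy matrix plus a parallel list of paper IDs (the file names and layout are assumptions, not the actual dump format):

```python
# Sketch: nearest-neighbor lookup over precomputed arXiv embeddings. Assumes the
# dump has been converted to `arxiv_embeddings.npy` (an N x D matrix) plus
# `arxiv_ids.txt` (one arXiv ID per row); the actual download format may differ,
# and the query must be embedded with the same model used to build the dump.
import numpy as np

embeddings = np.load("arxiv_embeddings.npy")            # shape: (num_papers, dim)
paper_ids = open("arxiv_ids.txt").read().splitlines()   # one ID per embedding row

def nearest_papers(query_vec, k=10):
    sims = embeddings @ query_vec / (
        np.linalg.norm(embeddings, axis=1) * np.linalg.norm(query_vec)
    )
    return [paper_ids[i] for i in np.argsort(sims)[::-1][:k]]
```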

Medium article showing how to use some existing tools that can do something similar: https://medium.com/a-academic-librarians-thoughts-on-open-access/using-large-language-models-like-gpt-to-do-q-a-over-papers-ii-using-perplexity-ai-15684629f02b

https://github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_embeddings.ipynb
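
The core move in that notebook is to stuff retrieved text into the prompt before asking the question. A compressed sketch of that pattern, with a placeholder retriever:

```python
# Sketch of the cookbook pattern: retrieve passages, stuff them into the prompt,
# and ask the model to answer only from that context (pre-1.0 `openai` chat API).
import openai

def retrieve(question, k=3):
    # Placeholder retriever; swap in embedding similarity, BM25, or a search API.
    return ["(retrieved journal excerpt 1)", "(retrieved journal excerpt 2)"]

def answer_with_context(question):
    context = "\n\n".join(retrieve(question))
    messages = [
        {"role": "system",
         "content": "Answer using only the provided journal excerpts and cite them."},
        {"role": "user",
         "content": f"Excerpts:\n{context}\n\nQuestion: {question}"},
    ]
    resp = openai.ChatCompletion.create(model="gpt-4", messages=messages)
    return resp["choices"][0]["message"]["content"]
```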

This may be helpful: https://twitter.com/danshipper/status/1615901860786749440 (he used https://github.com/jerryjliu/gpt_index to load his reference library).
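
A minimal gpt_index sketch along those lines, assuming a local `papers/` folder of reference documents; the library's interface has changed across versions (it is now maintained as llama_index), so treat this as the early quickstart shape rather than the current API:

```python
# Sketch: index a local folder of papers with gpt_index (early-style API; the
# project has since been renamed llama_index and its interface has changed).
from gpt_index import GPTSimpleVectorIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("papers").load_data()  # assumed folder of local papers
index = GPTSimpleVectorIndex(documents)
print(index.query("Which organisms reduce drag passively?"))
```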

This is another example that may be useful: https://every.to/superorganizers/i-trained-a-gpt-3-chatbot-on-every-episode-of-my-favorite-podcast

bruffridge commented 11 months ago

Three uses for retrieval augmentation to help improve BDA mode (a rough glue sketch follows the list):

  1. Discover additional relevant biological models that ChatGPT didn't generate on its own.
  2. Find accurate and relevant sources to include for each biological strategy, rather than relying on ChatGPT to generate them (since it has been shown to get this wrong sometimes).
  3. Provide context to help ChatGPT generate more accurate and detailed descriptions of how the biological strategies work.
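
A rough glue sketch of how those three uses could plug into the BDA-mode prompt; every helper below is a stub whose real implementation would be one of the retrievers or search APIs discussed above:

```python
# Rough glue sketch of the three uses in BDA mode. The retrieval helpers are
# stubs standing in for whichever retriever or search API ends up being used.
def retrieve_biological_models(challenge):   # use 1: surface additional candidate models
    return ["namib beetle fog harvesting", "termite mound passive cooling"]

def retrieve_sources(strategy):              # use 2: real, checkable citations per strategy
    return [{"title": "placeholder paper", "url": "https://example.org/placeholder"}]

def retrieve_context(strategy):              # use 3: passages grounding the mechanism description
    return ["placeholder excerpt describing how the strategy works"]

def augment_bda_prompt(challenge, chatgpt_strategies):
    strategies = chatgpt_strategies + retrieve_biological_models(challenge)
    blocks = []
    for s in strategies:
        cites = "; ".join(p["url"] for p in retrieve_sources(s))
        context = "\n".join(retrieve_context(s))
        blocks.append(f"Strategy: {s}\nSources: {cites}\nContext:\n{context}")
    return "\n\n".join(blocks)
```
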
bruffridge commented 11 months ago

from @hschilling

For Better Answers, Generate Reference Text

If you want a model to answer questions correctly, then enriching the input with reference text retrieved from the web is a reliable way to increase the accuracy of its output. But the web isn't necessarily the best source of reference text.

What's new: Wenhao Yu at University of Notre Dame and colleagues at Microsoft and University of Southern California used a pretrained language model to generate reference text. They fed that material, along with a question, to a second pretrained language model that answered more accurately than a comparable model that was able to retrieve relevant text from the web.

Key insight: Given a question, documents retrieved from the web, even if they're relevant, often contain information that doesn't help to answer it. For instance, considering the question "How tall is Mount Everest?," the Wikipedia page on Mount Everest contains the answer but also a lot of confusing information such as elevations attained in various attempts to reach the summit and irrelevant information that might distract the model. A language model pretrained on web pages can generate a document that draws on the web but focuses on the question at hand. When fed to a separate language model along with the question, this model-generated reference text can make it easier for that model to answer questions correctly.

How it works: The authors used a pretrained InstructGPT (175 billion parameters) to generate reference text related to questions in trivia question-answer datasets such as TriviaQA. They generated answers using FiD (3 billion parameters), which they had fine-tuned on the dataset plus the reference text. (A given question may have more than one valid answer.) InstructGPT generated reference text for each question in the dataset based upon a prompt such as, "Generate a background document to answer the given question," followed by the question. The authors embedded each question-reference pair using GPT-3 and clustered the embeddings via k-means. At inference, the system randomly selected five question-reference pairs from each cluster — think of them as guide questions and answers. For each cluster, given an input question (such as, "What type of music did Mozart compose?") and the question-reference pairs, InstructGPT generated a document — information related to the question. Given the question and documents, FiD generated an answer. (Valid answers to the Mozart question include, "classical music," "opera," and "ballet.")

Results: The authors evaluated their fine-tuned FiD on TriviaQA according to the percentage of answers that exactly matched one of a list of correct answers. Provided with generated documents, FiD answered 71.6 percent of the questions correctly, compared to 66.3 percent for FiD fine-tuned on TriviaQA and provided with text retrieved from Wikipedia using DPR.

Yes, but: The authors' approach performed best (74.3 percent) when it had access to both Wikipedia and the generated documents. While generated documents may be better than retrieved documents alone, they worked best together.

Why it matters: Good reference text substantially improves a language model's question-answering ability. While a relevant Wikipedia entry is helpful, a document that's directly related to the question is better — even if that document is a product of text generation.

We're thinking: Your teachers were right — Wikipedia isn't the best source
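
A compressed sketch of that generate-then-read recipe, substituting a single chat model for both the InstructGPT generator and the fine-tuned FiD reader (the model names and prompts here are illustrative, not the paper's setup):

```python
# Sketch: generate a background document first, then answer conditioned on it
# (generate-then-read). A single chat model stands in for both the InstructGPT
# generator and the fine-tuned FiD reader used in the paper (pre-1.0 `openai` API).
import openai

def chat(prompt):
    resp = openai.ChatCompletion.create(
        model="gpt-4", messages=[{"role": "user", "content": prompt}]
    )
    return resp["choices"][0]["message"]["content"]

def generate_then_read(question):
    background = chat(
        f"Generate a background document to answer the given question.\n\nQuestion: {question}"
    )
    return chat(
        f"Background document:\n{background}\n\n"
        f"Using the background document, answer the question.\nQuestion: {question}"
    )

print(generate_then_read("What type of music did Mozart compose?"))
```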