Closed xxSpencer closed 1 year ago
That example is a quick demo to show what the model output looks like, and for simplicity we don't show the full implementation of Self-RAG, as noted in the README :) Update: I made the note bold to avoid confusion.
Currently, the full implementation of multi-step generation for pre-given retrieved documents is in run_long_form_static.py. We are also planning to release refactored implementations with memory-efficient dynamic retrieval -- our original implementation requires 100+GB of RAM at inference time, and we want to reduce the computational requirements.
Great! I will study the py file. Thank you for your reply.
Dear author, I think this project is great and has done some very interesting work, but I have a point of confusion.
I am confused about how to implement "Critique outputs and select best segment". In the code:
it seems that the highest-relevance document is selected by taking the [0] element of the retriever's return value, and the LLM then generates an answer directly from that document, rather than generating segments in parallel with the LLM and then having the LLM critique them and select the best one.
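For reference, the "critique and select" step described in the Self-RAG paper could be sketched roughly as below. This is only an illustrative sketch, not the repo's actual code (which lives in run_long_form_static.py): the function names, the `p_rel`/`p_sup`/`p_use` fields (standing in for probabilities of reflection tokens like [IsRel], [IsSup], [IsUse]), and the weights are all assumptions for illustration.

```python
# Hypothetical sketch of segment-level critique-and-select in Self-RAG style.
# All names and weights below are illustrative assumptions, not the repo's API.

def critique_score(segment):
    """Combine (hypothetical) critique-token probabilities into one score.

    In Self-RAG, these would come from the model's own reflection tokens
    (e.g. [IsRel], [IsSup], [IsUse]) rather than from the retriever.
    """
    w_rel, w_sup, w_use = 1.0, 1.0, 0.5  # weights are assumptions
    return (w_rel * segment["p_rel"]
            + w_sup * segment["p_sup"]
            + w_use * segment["p_use"])

def select_best_segment(candidates):
    """Pick the candidate segment with the highest critique score,
    instead of trusting the retriever's top-1 document directly."""
    return max(candidates, key=critique_score)

# One candidate segment generated per retrieved document, in parallel.
candidates = [
    {"text": "answer based on doc 0", "p_rel": 0.9, "p_sup": 0.4, "p_use": 0.6},
    {"text": "answer based on doc 1", "p_rel": 0.8, "p_sup": 0.9, "p_use": 0.7},
]
best = select_best_segment(candidates)
```

Here the second candidate wins despite doc 0 ranking first in retrieval, because its answer is better supported, which is the behavior the full implementation aims for.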