Hi, for OpenAI, you can just replace the generation function with the OpenAI chat API call.
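In case it helps, here is a minimal sketch of that swap, assuming the openai v1 Python client; the `generate_answer` name and prompt format are my own placeholders, not the repo's actual interface:

```python
# A rough drop-in replacement for the local-model generation function,
# using OpenAI's chat completions API. The function name and prompt
# format are assumptions, not the repository's actual code.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_answer(question: str, context: str, model: str = "gpt-4") -> str:
    """Answer a question from retrieved context via the chat API."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": "Answer the question using only the given context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
        temperature=0,
    )
    return response.choices[0].message.content
```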
OK, can you tell me why qa_llama.py doesn't generate responses using the transformers pipeline? For example: `pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)`. qa_llama.py doesn't seem to extend to other models if we use the model.generate method.
Yes, you can also use that method; qa_llama.py is just one example of how to generate answers.
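For reference, a minimal sketch of what the pipeline-based variant could look like; the checkpoint name is just an example, not a recommendation from the repo:

```python
# A sketch of pipeline-based generation as an alternative to calling
# model.generate directly. The model checkpoint here is an assumption.
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_name = "meta-llama/Llama-2-7b-chat-hf"  # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

prompt = "Context: ...\n\nQuestion: ...\n\nAnswer:"
outputs = pipe(prompt, max_new_tokens=256, do_sample=False, return_full_text=False)
print(outputs[0]["generated_text"])
```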
I see. Lastly, it would be good if I could do retrieval and generation at once. I need to retrieve first, output the results, and then run generation, right?
Actually, you can adapt the process to suit your needs. Since different RAG frameworks employ different retrieval methods, this repository is merely a basic demonstration of how to use the dataset; it is also the setup featured in the paper to showcase the performance of a simple RAG framework.
In summary, you can combine the two processes if applicable. And I am also looking forward to seeing the new RAG framework. 😊
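To make the combination concrete, here is a minimal sketch that fuses the two steps into one call; `retrieve` is a hypothetical stub for whatever retriever you plug in, and `generate_answer` is the kind of generation function sketched above:

```python
# A sketch of retrieval and generation in a single call. retrieve() is a
# hypothetical stub; replace it with your actual retriever (BM25, dense, ...).
def retrieve(question: str, top_k: int = 5) -> list[str]:
    """Placeholder retriever: return the top-k passages for the question."""
    return ["(retrieved passage)"] * top_k  # stub output for illustration

def rag_answer(question: str, top_k: int = 5) -> str:
    passages = retrieve(question, top_k=top_k)   # retrieval step
    context = "\n\n".join(passages)              # concatenate passages
    return generate_answer(question, context)    # generation step
```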
I want to reproduce Table 6 (generation accuracy). Can you tell me how to do this with the current code?
I'm particularly interested in the OpenAI GPT results. GPT-4 already seems to be quite a strong baseline for "inference query" type questions, even without a RAG approach.
To do this, would you need to provide an OpenAI version of qa_llama.py?