rht opened this issue 1 year ago
You are skipping a step in the middle of the paper-qa implementation: the real paper-qa calls OpenAI multiple times.
Can these calls be summarized in a few sentences? That should be sufficient for me to build a "babyagi"-style implementation of Paper QA.
Really cool to be testing this! Very nice to see the side-by-side results.
I believe the difference may be the tree_summarize step in LlamaIndex: you need to set the prompt for that step to match the paper-qa summarize prompt. I'm not sure if or how it can be customized, though.
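For what it's worth, here is a minimal sketch of one way that prompt might be overridden in a recent LlamaIndex. The `summary_template` keyword is an assumption to check against your installed version, and the prompt text below is a stand-in, not paper-qa's actual summarize prompt:

```python
# Hypothetical sketch: override the tree_summarize prompt so it behaves
# like paper-qa's summarize step. The prompt text is a stand-in; see
# paperqa/prompts.py for the real one.
from llama_index.core import PromptTemplate, get_response_synthesizer

SUMMARIZE_PROMPT = PromptTemplate(
    "Summarize the text below to help answer the question. "
    "If the text is irrelevant, reply 'Not applicable'.\n"
    "Question: {query_str}\n"
    "Text: {context_str}\n"
    "Summary:"
)

synthesizer = get_response_synthesizer(
    response_mode="tree_summarize",
    summary_template=SUMMARIZE_PROMPT,  # assumed keyword; verify in your LlamaIndex version
)
```

A synthesizer like this can then be passed into a query engine (e.g. via `RetrieverQueryEngine`), so the per-chunk summarization uses the custom prompt instead of the default.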
Thanks Andrew for providing some more thorough context :)
Yeah, it's the summarizing step that I think is helpful there. If you feed irrelevant sources into the final prompt, you'll find that GPT sometimes does unexpected things. By using an intermediate filtering step, you can summarize the relevant facts from the citations and weed out any irrelevant ones. By putting only summaries of the relevant facts into the final prompt, you substantially improve performance.
Now, there's some added cost to this (you're making several extra API calls), but for lots of use cases that added performance is worth it.
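To make that flow concrete, here is a rough sketch of the summarize-then-filter step described above. The prompts and model name are illustrative assumptions, not paper-qa's actual code:

```python
# Illustrative sketch of an intermediate summarize/filter step:
# one extra API call per retrieved chunk, then a final call over
# only the relevant summaries. Prompts and model are assumptions.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # assumed model; use whatever you have access to


def summarize_chunk(question: str, chunk: str) -> str | None:
    """Summarize a chunk relative to the question, or drop it if irrelevant."""
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{
            "role": "user",
            "content": (
                "Summarize the following text only as it relates to the question. "
                "If it is irrelevant, answer exactly 'Not applicable'.\n\n"
                f"Question: {question}\n\nText: {chunk}"
            ),
        }],
    )
    summary = resp.choices[0].message.content.strip()
    return None if summary.startswith("Not applicable") else summary


def answer(question: str, chunks: list[str]) -> str:
    # Keep only the relevant summaries, then build the final prompt from them.
    summaries = [s for c in chunks if (s := summarize_chunk(question, c))]
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{
            "role": "user",
            "content": "Answer the question using only these summaries:\n\n"
            + "\n\n".join(summaries)
            + f"\n\nQuestion: {question}",
        }],
    )
    return resp.choices[0].message.content
```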
Hi @rfuisz, these tools are now exposed in v5 (released today), here: https://github.com/Future-House/paper-qa/blob/v5.0.0/paperqa/agents/tools.py
We expose our innards as essentially native Python objects, so you can import them into LlamaIndex, LangChain, or whatever else and use them. A performance discrepancy should not really exist anymore.
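For reference, a minimal v5 usage sketch along the lines of the project README; the question string is made up, and the exact API is worth checking against the current docs:

```python
# Minimal paper-qa v5 sketch, following the pattern shown in the README.
# The question below is just an example.
from paperqa import Settings, ask

answer = ask(
    "How does the summarize step filter out irrelevant sources?",
    settings=Settings(temperature=0.5),
)
print(answer)
```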
I am going to leave this open in case you have any other questions. Thanks for digging in deeply here!
big congrats on the release!
This question is for pedagogical purposes. I tried to reproduce Paper QA's capability from scratch with LlamaIndex, without bells and whistles, as described in the FAQ:
But the answers produced by Paper QA are still better than those from the LlamaIndex code I wrote based on that statement:
Are there more components that are missing? My code:
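A bare-bones LlamaIndex baseline of the kind described might look like the following minimal sketch; the directory path, question, and settings are all assumptions for illustration:

```python
# Minimal LlamaIndex baseline sketch: embed the papers, retrieve, and
# answer with tree_summarize. The folder path and question are placeholders.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("papers/").load_data()  # hypothetical folder of PDFs
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine(response_mode="tree_summarize")
print(query_engine.query("What is the main finding of these papers?"))
```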