Closed: barsuna closed this issue 3 months ago
Hey @barsuna. I was searching how to use llama with gpt researcher and stumbled upon this post. If possible, could you tell me how to get gpt researcher to work with llama 3?
@Dilip-17 there was the same question on another issue; I added some pointers there:
https://github.com/assafelovic/gpt-researcher/issues/520
The challenge is mostly not how to run it, but having the GPU memory necessary to run llama3. Even the borderline-usable (imo; opinions are divided on this) 4-bit quantized 70B model takes about 43GB; I'd recommend Q6, which is closer to 60GB.
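Those figures line up with a simple back-of-envelope estimate: weight bytes ≈ parameters × bits-per-weight / 8. The bits-per-weight values below are my approximations for the llama.cpp Q4_K_M and Q6_K quants, and KV cache plus runtime overhead come on top:

```python
def quant_weight_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Rough size of quantized weights in GB (ignores KV cache and runtime overhead)."""
    return n_params_billion * 1e9 * bits_per_weight / 8 / 1e9

# approximate effective bits-per-weight for common llama.cpp quants (assumption)
print(quant_weight_gb(70, 4.85))  # Q4_K_M: ~42 GB
print(quant_weight_gb(70, 6.56))  # Q6_K:   ~57 GB
```

With a few extra GB of KV cache and buffers, that matches the ~43GB and ~60GB numbers above.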
Hey, working with LLMs other than the default OpenAI ones currently requires extra manual tweaking. Would love to learn from your experience if you find ways to make the code more generic!
To its credit, llama3 worked pretty much out of the box with gpt-researcher (the only tweak needed was the prompt change above). It also seems possible to stretch the context window to 16k without fine-tuning, though I've done only very limited testing of that.
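For anyone trying to reproduce this, the usual route is to point gpt-researcher at a local OpenAI-compatible endpoint (e.g. an Ollama or llama.cpp server). The variable names below are an assumption on my part and vary between gpt-researcher versions, so check config/config.py in your checkout:

```shell
# assumption: an Ollama server is running locally with llama3 pulled
export LLM_PROVIDER=ollama
export OLLAMA_BASE_URL=http://localhost:11434
export FAST_LLM_MODEL=llama3
export SMART_LLM_MODEL=llama3:70b
```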
So far, progress with llama3 has been difficult for things requiring function calling and in-prompt memory, i.e. autonomous agents; with single-shot or one-by-one prompting agents, things seem better.
Of course the main challenge remains the quality of the reports. I'm currently trying to compare llama3 vs gpt4; both seem somewhat challenged, and my belief is that the likely direction to solve this is to balance automation and augmentation, letting the user do more if they wish.
I haven't measured the quality of the embeddings or its impact on report quality much either.
Great, thank you for the feedback @barsuna! Closing for now, but feel free to open new threads if needed.
Testing gpt-researcher with llama3, I found that 3 times out of 4, llama3 responds to generate_search_queries_prompt with JSON plus some extra verbiage.
Not sure it is worth changing the prompt for the sake of llama3 alone, but for documentation purposes, here is the updated prompt that seems to work every time.
before:
after:
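Even with the updated prompt, a defensive parser is cheap insurance against the "JSON plus verbiage" replies; a minimal sketch (the helper name is mine, not from gpt-researcher, and the greedy regex can over-capture if the prose itself contains brackets):

```python
import json
import re

def extract_json(reply: str):
    """Pull the first JSON array/object out of a reply that may wrap it in prose."""
    match = re.search(r"(\[.*\]|\{.*\})", reply, re.DOTALL)
    if match is None:
        raise ValueError("no JSON found in model reply")
    return json.loads(match.group(1))

reply = 'Sure! Here are the search queries:\n["llama3 gpt-researcher", "llama3 context window"]\nHope this helps.'
print(extract_json(reply))  # ['llama3 gpt-researcher', 'llama3 context window']
```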