Arvid-pku / Godel_Agent

Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement

I want to run it with a GROQ LLM; what and where do I change the code? #1

jbdatascience opened this issue 6 days ago

jbdatascience commented 6 days ago

I want to run it with a GROQ LLM; what and where do I change the code?

Besides the Python code, I think the prompts would probably also need to be changed (src/goal_prompt.md).

In particular, I would like to experiment with the "Game of 24" example!

Does the LLM need to be capable of tool use / function calling? Which GROQ models could be used?

I think this GROQ model is a good candidate: Llama 3 Groq 70B Tool Use (Preview)

- Model ID: llama3-groq-70b-8192-tool-use-preview
- Developer: Groq
- Context window: 8,192 tokens

Is the context window (8,192 tokens) large enough?

I hope you can answer these questions, because I really love the GROQ LLMs!

Arvid-pku commented 5 days ago

Thank you for your interest in our project! To modify the code for running it with a GROQ LLM, you'll primarily need to update the API calls to switch from OpenAI's API to the model of your choice. This should be the main adjustment needed (function calling isn't needed, but you may need to implement the execute part manually). We definitely appreciate contributions from the community to make the Gödel agent more versatile, so feel free to submit a pull request if you'd like to help us improve this functionality.
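As a rough illustration (a sketch only, not the exact call sites in our code; the environment variable name is an assumption), Groq exposes an OpenAI-compatible endpoint, so the switch can be as small as changing the client's base URL and the model name:

```python
import os
from openai import OpenAI

# Point the standard OpenAI client at Groq's OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key=os.environ["GROQ_API_KEY"],  # assumed env var name
)

response = client.chat.completions.create(
    model="llama3-groq-70b-8192-tool-use-preview",
    messages=[{"role": "user", "content": "Use 3, 3, 8, 8 to make 24."}],
)
print(response.choices[0].message.content)
```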

However, it's important to note that self-improvement requires a highly capable LLM, one that can understand its own code and suggest meaningful enhancements. While models like Llama might be able to perform certain tasks, they may not be powerful enough for this type of self-referential improvement. Additionally, a context window of 8192 tokens might not be sufficient for more complex tasks, though you could try reducing the length of history to better fit within this limit.
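For example, one simple way to reduce the history length is to keep only the most recent messages that fit in the budget (a minimal sketch; the message format and the ~4-characters-per-token estimate are assumptions, and a real implementation would use the model's tokenizer):

```python
def count_tokens(text: str) -> int:
    # Crude estimate (~4 characters per token); swap in the model's
    # actual tokenizer for accurate budgeting.
    return max(1, len(text) // 4)

def truncate_history(messages: list[dict], max_tokens: int = 8192,
                     reserved_for_reply: int = 1024) -> list[dict]:
    # Walk the history from newest to oldest, keeping messages
    # until the remaining token budget is exhausted.
    budget = max_tokens - reserved_for_reply
    kept, total = [], 0
    for msg in reversed(messages):
        cost = count_tokens(msg["content"])
        if total + cost > budget:
            break
        kept.append(msg)
        total += cost
    return list(reversed(kept))
```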

As for modifying the goal_prompt, it is indeed possible to allow the agent to change it. That said, preventing it from doing so is also straightforward if you have security concerns. So far, we haven't observed any behavior from the current LLM that suggests modifying the goal_prompt, but security remains a key focus for our future work. We welcome anyone interested in contributing to this area as well.

Regarding the "Game of 24" experiment, running it would simply require implementing task inputs and an evaluator. The same approach can be applied to any other task you wish to experiment with. However, due to our current workload, we haven't yet organized that portion of the code.
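In the meantime, an evaluator for the Game of 24 can be quite small: check that the proposed expression uses exactly the four given numbers and evaluates to 24. Here is an illustrative sketch (not code from this repository):

```python
import ast
import operator
import re

# Binary operators allowed when safely evaluating arithmetic expressions.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def _eval(node: ast.AST) -> float:
    if isinstance(node, ast.Expression):
        return _eval(node.body)
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    raise ValueError("disallowed expression")

def evaluate_24(expression: str, numbers: list[int]) -> bool:
    # The expression must use exactly the given numbers...
    used = sorted(int(n) for n in re.findall(r"\d+", expression))
    if used != sorted(numbers):
        return False
    # ...and evaluate to 24.
    try:
        value = _eval(ast.parse(expression, mode="eval"))
    except (ValueError, SyntaxError, ZeroDivisionError):
        return False
    return abs(value - 24) < 1e-6

# Example: 8 / (3 - 8 / 3) == 24
assert evaluate_24("8 / (3 - 8 / 3)", [3, 3, 8, 8])
```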

jbdatascience commented 5 days ago

Thank you for your quick response! I am very interested in this project, so I will certainly look into it. But it may take some time... I will keep you informed if I get results with GROQ LLMs!