Closed by paehal 5 months ago
I don't know what happens if you use GPT-3.5, but I believe I mostly observed the agents arguing with each other in this example.
We are hoping to be able to add example outputs for specific language models in the future.
Hi again, we just added an example of the experiment output here:
Thank you for the wonderful project and for making the code publicly available.
As I am unfamiliar with projects involving LLM multiagents, this may be a simple question, but I would like to ask nonetheless.
I am testing with GPT-3.5 using 'three_key_questions.ipynb'. In this setup, Alice is supposed to have a conversation in which she apologizes to Bob. However, in my three trials, they (the 4 agents) just keep playing various games lol. Even extending 'episode_length' to 10 didn't change this outcome.
The technical report (https://arxiv.org/abs/2312.03664) didn't include the results of this experiment, so could you please let me know whether this matches your results, including the version of GPT you used?
I understand that, unlike typical multiagent reinforcement learning code, it might be difficult to replicate results in this case.
Sincerely,