Closed · tju01 closed this issue 1 year ago
Sorry for the confusion. When we performed the evaluation, we started a local server for all API-based models and used src.agents.HTTPAgent as the client. src.agents.api_agents were created for convenience just before we released the code.
I see. I understand that the temperature should be 0 then. What would be the correct value for max_new_tokens? 128, 256, or something else?
For the card game task, 512; for everything else, 128.
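Putting the answers above together, an explicit agent config might look like the following sketch. The key names (`parameters`, `temperature`, `max_new_tokens`) are borrowed from the fastchat_client.yaml convention and are an assumption here; the actual schema used by the api_agents configs may differ.

```yaml
# Hypothetical sketch — key names assumed from fastchat_client.yaml,
# not confirmed for the api_agents schema.
parameters:
  temperature: 0        # 0 for all tasks, as stated in the paper
  max_new_tokens: 128   # 512 for the card game task, 128 otherwise
```

For the card game task, you would override `max_new_tokens: 512` in that task's agent config.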
I am trying to make AgentBench work with some other models. However, it's not clear to me what temperature should be used for the agents. I can see that the fastchat agents use a temperature of 0:
https://github.com/THUDM/AgentBench/blob/d7dd9aefd28c40a1b4562dbd6f6e659a81cb7a94/configs/agents/fastchat_client.yaml#L7
However, other agents, such as the OpenAI agents, don't seem to set the temperature, so it would just default to 1:
https://github.com/THUDM/AgentBench/blob/d7dd9aefd28c40a1b4562dbd6f6e659a81cb7a94/configs/agents/api_agents/gpt-3.5-turbo.yaml#L4
I saw that in your paper you wrote that you used a temperature of 0 for all tasks, but I can't actually find this in your code.
The same is true for max_new_tokens, which seems to be set to 128 for the fastchat models, while no value is specified for the OpenAI chat models. A value is specified for some other models, but it is 256 rather than 128, which confuses me.