YifeiZhou02 / ArCHer

Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL"
https://yifeizhou02.github.io/archer.io/
84 stars 10 forks source link

这个工作可以直接用于LLama2么 #2

Closed xiaxiaxiatengxi closed 6 months ago

YifeiZhou02 commented 6 months ago

Hi,

Thanks for your interest in our work. We have existing scripts for running Mistral-7B. To use LLama2, simply make a different prompt template as in here, and change the policy lm here. We found the performance of LLama2-7B to be much worse than Mistral-7B. We did not test LLama2 with more parameters.