YifeiZhou02 / ArCHer

Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL"
https://yifeizhou02.github.io/archer.io/

Pre-trained weights for reproducing results #1

Closed sufengniu closed 6 months ago

sufengniu commented 6 months ago

Thank you for the great work! I was wondering whether you have released pre-trained GPT-2 or Mistral weights so that I could reproduce the results without re-training the models. I know the training scripts are already there; it is fine if you don't want to provide the weights. Thank you!

YifeiZhou02 commented 6 months ago

Hi, thanks for your interest in our work! Pre-trained BC checkpoints are provided at the Google Drive link https://drive.google.com/drive/folders/1pRocQI0Jv479G4vNMtQn1JOq8Shf2B6U?usp=sharing (also in the README). Unfortunately I was not able to save all the checkpoints for the experiments, but most experiments can be replicated within 2-3 days with parallelism on 4 GPUs.