YifeiZhou02 / ArCHer

Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL"
https://yifeizhou02.github.io/archer.io/

In this work, was an SFT model trained beforehand for the webshop task? #4

Closed xiaxiaxiatengxi closed 5 months ago

xiaxiaxiatengxi commented 5 months ago

In this work, was an SFT model trained beforehand for the webshop task?

YifeiZhou02 commented 5 months ago

Yes, it is available via the Google Drive link provided in the README.
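For reference, here is a minimal sketch of loading a downloaded `.pt` checkpoint into a GPT-2 policy. The local path, and the assumption that the file is a plain PyTorch state dict rather than a full agent object, are hypothetical; check the README and the checkpoint contents for the actual format used by the repo.

```python
# Hedged sketch: load a downloaded checkpoint into a GPT-2 causal LM.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")

# "path/to/downloaded_checkpoint.pt" is a placeholder for the file
# obtained from the Google Drive link in the README.
state_dict = torch.load("path/to/downloaded_checkpoint.pt", map_location="cpu")

# strict=False in case the checkpoint also contains extra heads
# (e.g. critic parameters) beyond the base LM weights.
model.load_state_dict(state_dict, strict=False)
```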

DZ9 commented 2 weeks ago

Is `_gpt2_bc_workshophistory.pt` the SFT model for webshop? Does that mean I need to load this model when training ArCHer in the RL step? How can I train a new SFT model with a different base LLM, such as Llama2? Thanks.
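On the last question, a minimal behavior-cloning (SFT) sketch with a different base LLM is shown below. This is not the repo's actual data pipeline or training script: the model name `meta-llama/Llama-2-7b-hf`, the toy trajectory format, the hyperparameters, and the output filename are all assumptions for illustration only.

```python
# Hedged sketch: behavior cloning (SFT) on webshop-style (observation, action)
# pairs with a HuggingFace causal LM as the base model.
import torch
from torch.optim import AdamW
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # hypothetical; swap in any causal LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
model.cuda().train()

# Toy stand-in for expert trajectories collected on webshop.
trajectories = [
    ("Instruction: buy a red t-shirt under $20. Observation: search page.",
     "search[red t-shirt]"),
]

optimizer = AdamW(model.parameters(), lr=1e-5)
for obs, action in trajectories:
    # Simple BC objective: next-token prediction over observation + action.
    batch = tokenizer(obs + " " + action, return_tensors="pt",
                      truncation=True).to("cuda")
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

torch.save(model.state_dict(), "bc_llama2_webshop.pt")  # hypothetical filename
```

In practice you would also mask the loss on observation tokens so only the action tokens contribute, and then point the RL stage at the saved checkpoint in place of the provided GPT-2 one; the exact config changes depend on the repo's training scripts.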