THUDM / AgentTuning

AgentTuning: Enabling Generalized Agent Abilities for LLMs
https://thudm.github.io/AgentTuning/
1.36k stars 95 forks

Is it possible to conduct RLHF from environment feedback? #51

Open SHITIANYU-hue opened 9 months ago

SHITIANYU-hue commented 9 months ago

Thanks for open-sourcing the AgentTuning code. I am quite interested in training the model, but I see that the training framework is not open-sourced (https://github.com/THUDM/AgentTuning/issues/1).

The discussion there mentioned that it could support P-Tuning or LoRA. I am wondering whether it could also support RLHF.

Recently I read a paper (https://arxiv.org/abs/2312.14878), and I am curious how AgentLM would perform if we let it learn from interacting with environments (see "Finetune type II" in that paper).

Btlmd commented 9 months ago

We haven't integrated RLHF methods into AgentTuning yet, and we won't be releasing related experimental results in the near future. I believe that would be an awesome thing to try out.