Model Training Unstable（webshop，gpt2）

I encountered an issue while trying to reproduce the results by loading the gpt2_bc_webshop_history.pt model and running the run.py script. The training was initiated with the following parameters:

8*GPUS """ epochs=50 actor_epochs=3 batch_size=8 grad_accum_steps=4 capacity=10000 critic_lr=6e-5 lm_lr=3e-5 rollout_size=512 gamma=0.9 tau=0.1 agent_type="archer" webshop_lower: 2000 webshop_upper: 2100 """

However, I noticed that during training, the eval_rollout.mean value barely increases, and in many cases, the training either crashes (with rewards becoming zero) or the performance deteriorates. To mitigate the issue, I tried lowering the learning rate and reducing the number of actor updates, which seemed to prevent the model from crashing.

I would like to understand the potential reason for this behavior and whether my parameter settings are appropriate. Could you help clarify if I am missing something or suggest adjustments to make the training more stable?

YifeiZhou02 / ArCHer

Model Training Unstable（webshop，gpt2） #14