Open QinwenLuo opened 1 year ago
In the file train_finetune, the line schedule_fn = optax.cosine_decay_schedule(-actor_lr, max_steps) passes -actor_lr as the initial value, which seems to amount to using a positive learning rate. Why?