ikostrikov / implicit_q_learning


Why use a positive learning rate in finetuning? #10

Open QinwenLuo opened 1 year ago

QinwenLuo commented 1 year ago

In the file train_finetune.py, this code: `schedule_fn = optax.cosine_decay_schedule(-actor_lr, max_steps)` seems to use a negative learning rate. Why is the sign flipped instead of using a normal (positive) learning rate?
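
For context on why the sign might be negative: in optax, `optax.apply_updates` *adds* the updates to the parameters, so when the step size enters the chain through `optax.scale_by_schedule`, it must be negative for the update to move *against* the gradient. The toy re-implementation below illustrates that convention; it is a self-contained sketch, not the repository's actual code, and the helper names (`cosine_decay_schedule`, `sgd_step`) only mimic optax's semantics:

```python
import math

def cosine_decay_schedule(init_value, decay_steps):
    # Mimics optax.cosine_decay_schedule: decays from init_value toward 0.
    def schedule(step):
        frac = min(step / decay_steps, 1.0)
        return init_value * 0.5 * (1.0 + math.cos(math.pi * frac))
    return schedule

def sgd_step(param, grad, step, schedule):
    update = schedule(step) * grad   # like optax.scale_by_schedule
    return param + update            # like optax.apply_updates: additive!

actor_lr = 0.1
max_steps = 100
# Negating the learning rate, as in the repo's line of code,
# turns the additive update into gradient *descent*.
schedule_fn = cosine_decay_schedule(-actor_lr, max_steps)

# Minimize f(x) = x^2 (gradient is 2x); x should shrink toward 0.
x = 3.0
for t in range(max_steps):
    x = sgd_step(x, 2 * x, t, schedule_fn)
```

With a positive `init_value` the same loop would add `lr * grad` each step and diverge, which is why schedules passed into an additive update chain carry the minus sign.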