ikostrikov / implicit_q_learning


Why use a positive learning rate in finetuning? #10

Open QinwenLuo opened 1 year ago

QinwenLuo commented 1 year ago

In the file train_finetune.py, this code: `schedule_fn = optax.cosine_decay_schedule(-actor_lr, max_steps)` seems to use a negative learning rate. Why is the sign flipped instead of using a normal (positive) learning rate?
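
For context on why the sign might be negative: in optax, `optax.apply_updates` *adds* the updates to the parameters, so when the step size enters the chain through `optax.scale_by_schedule`, it must be negative for the update to move *against* the gradient. The toy re-implementation below illustrates that convention; it is a self-contained sketch, not the repository's actual code, and the helper names (`cosine_decay_schedule`, `sgd_step`) only mimic optax's semantics:

```python
import math

def cosine_decay_schedule(init_value, decay_steps):
    # Mimics optax.cosine_decay_schedule: decays from init_value toward 0.
    def schedule(step):
        frac = min(step / decay_steps, 1.0)
        return init_value * 0.5 * (1.0 + math.cos(math.pi * frac))
    return schedule

def sgd_step(param, grad, step, schedule):
    update = schedule(step) * grad   # like optax.scale_by_schedule
    return param + update            # like optax.apply_updates: additive!

actor_lr = 0.1
max_steps = 100
# Negating the learning rate, as in the repo's line of code,
# turns the additive update into gradient *descent*.
schedule_fn = cosine_decay_schedule(-actor_lr, max_steps)

# Minimize f(x) = x^2 (gradient is 2x); x should shrink toward 0.
x = 3.0
for t in range(max_steps):
    x = sgd_step(x, 2 * x, t, schedule_fn)
```

With a positive `init_value` the same loop would add `lr * grad` each step and diverge, which is why schedules passed into an additive update chain carry the minus sign.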