Closed jaried closed 2 years ago
Before executing offpolicy_trainer, I added the following code:
actor_optim.param_groups[0]['capturable'] = True
alpha_optim.param_groups[0]['capturable'] = True
critic1_optim.param_groups[0]['capturable'] = True
critic2_optim.param_groups[0]['capturable'] = True
With this change, training runs. What is the reason?
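As a side note, setting `param_groups[0]` only patches the first parameter group; an optimizer with multiple groups would need the flag set on each one. A minimal sketch of the workaround (using a hypothetical single `Adam` optimizer standing in for the SAC actor/critic/alpha optimizers):

```python
import torch

# Hypothetical minimal setup standing in for the SAC optimizers
# from the question: a tiny model with one Adam optimizer.
model = torch.nn.Linear(4, 2)
optim = torch.optim.Adam(model.parameters(), lr=1e-3)

# Simulate saving and reloading an optimizer checkpoint.
state = optim.state_dict()
optim.load_state_dict(state)

# The workaround: set `capturable` on every param group, not just
# group 0, so optimizers with multiple groups are also covered.
for group in optim.param_groups:
    group["capturable"] = True
```

This is the same fix as above, just applied to all param groups in a loop.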
This issue says the error is caused by pressing Ctrl+C to end training: https://github.com/babysor/MockingBird/issues/631
But I also tried a checkpoint file saved before the problem appeared, and the same error is reported. Why is that?
https://github.com/pytorch/pytorch/issues/80809
Someone said this:
Hi, I am also facing the same issue when I try to load a checkpoint and resume model training on the latest PyTorch (1.12). It seems to be related to a newly introduced parameter (`capturable`) for the Adam and AdamW optimizers. Currently two workarounds:
- Forcing `capturable = True` after loading the checkpoint (as suggested above): `optim.param_groups[0]['capturable'] = True`. This seems to slow down model training by approx. 10% (YMMV depending on the setup).
- Reverting PyTorch back to a previous version (I have been using 1.11.0).

I'm wondering whether enforcing `capturable = True` may incur unwanted side effects.
I'm also wondering whether forcing `capturable = True` would have unwanted side effects. For now I will also go back to torch 1.11.
The SAC algorithm worked fine before. Recently, after loading the saved policy weights and optimizer state, the first learning step reports the error as follows. How can I solve it?