nicklashansen / tdmpc2

Code for "TD-MPC2: Scalable, Robust World Models for Continuous Control"
https://www.tdmpc2.com
MIT License

About continue training #7

Closed VitaLemonTea1 closed 7 months ago

VitaLemonTea1 commented 7 months ago

Hi,

I have another question. After training the model for 1,000,000 steps, I stopped the training to do some other work. Today I planned to continue training from the checkpoint at 1,000,000 steps:

python train.py task=mt80 model_size=1 batch_size=1024 checkpoint="/Project/TD——MPC/tdmpc2/logs/mt80/1/default/models/1000000.pt"

After it had trained for half a day, I realized that it did not continue from the checkpoint, but instead started a new training run from scratch. So I would like to know how I can continue training from the 1,000,000-step checkpoint.

Thanks!

nicklashansen commented 7 months ago

train.py currently does not use the checkpoint argument; only evaluate.py does. This seems like a very reasonable request though, so I will push a commit with this functionality soon.

Copying this line https://github.com/nicklashansen/tdmpc2/blob/f3139291e2dc8e47480184a4a1bce05e8980caa3/tdmpc2/evaluate.py#L59 from evaluate.py to train.py (potentially with a few extra checks / warnings) should do the trick.
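For reference, a minimal sketch of what that could look like in train.py, assuming the agent exposes a `load()` method (the counterpart of the `agent.save()` mentioned below) and that `cfg.checkpoint` is the config field set on the command line:

```python
import os

# Hypothetical resume logic for train.py; the names cfg and agent are
# assumed to be in scope, mirroring how evaluate.py loads a checkpoint.
if cfg.checkpoint:  # e.g. checkpoint=".../models/1000000.pt" on the command line
    assert os.path.exists(cfg.checkpoint), f"Checkpoint not found: {cfg.checkpoint}"
    print(f"Resuming from checkpoint: {cfg.checkpoint}")
    agent.load(cfg.checkpoint)  # same call evaluate.py uses before rollout
```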

One thing to note though is that agent.save() currently only stores the model weights, not the optimizer state, which would be needed for seamless resuming of a training run. I can add that to the checkpoint files, but it will roughly double the file size.
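For what it's worth, a hedged sketch of how checkpointing could be extended to include optimizer state; the attribute names (`agent.model`, `agent.optim`) are assumptions and not necessarily the actual TD-MPC2 API:

```python
import torch

def save_checkpoint(agent, fp):
    """Hypothetical helper: store model weights together with optimizer state."""
    torch.save({
        "model": agent.model.state_dict(),
        "optim": agent.optim.state_dict(),  # this extra state is what roughly doubles the file size
    }, fp)

def load_checkpoint(agent, fp):
    """Hypothetical helper: restore weights and, if present, optimizer state."""
    ckpt = torch.load(fp, map_location="cpu")
    agent.model.load_state_dict(ckpt["model"])
    if "optim" in ckpt:  # older weight-only checkpoints remain loadable
        agent.optim.load_state_dict(ckpt["optim"])
```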