PacktPublishing / Deep-Reinforcement-Learning-Hands-On

Hands-on Deep Reinforcement Learning, published by Packt
MIT License

Chapter 8: How to load data for continuation of training #20

Closed by jeffCabrera0321 5 years ago

jeffCabrera0321 commented 5 years ago

Hello, I am trying to continue training from previously saved data after my computer went into a forced restart. I started the run from the command prompt with "python -r YNDX_150101_151231.csv".

The last checkpoint was "checkpoint- 8.data". The last mean_val was "mean_val-0.828.data".

How would I continue training from where it left off?

Shmuma commented 5 years ago

Hi!

All the examples in the book were written without the ability to restart in mind, because restarting training in the RL domain is generally quite a tricky thing to do. In my experience, most of the time it is simpler to start training from scratch than to deal with a lost replay buffer, overfitting at the beginning, and other issues.

In your case, you can extend the example to support restart. To do that, you need to load the network state from the checkpoint before the training loop starts. In PyTorch, you can load network weights from a file with a single line:

network.load_state_dict(torch.load(FILE_NAME_WITH_WEIGHT, map_location=lambda storage, loc: storage))

This is exactly what the Chapter 6 play tool does: https://github.com/PacktPublishing/Deep-Reinforcement-Learning-Hands-On/blob/master/Chapter06/03_dqn_play.py#L32
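For reference, here is a minimal sketch of how the restart step described above could look. It rests on a few assumptions that are not confirmed in this thread: that the checkpoint files are named "checkpoint-<N>.data" (possibly with spaces before the number, as in the file name quoted in the question), and that each file holds a state dict written with torch.save(net.state_dict(), ...). The helper names latest_checkpoint and restore_network are hypothetical and are not part of the book's code.

```python
import os
import re
import tempfile

import torch
import torch.nn as nn


def latest_checkpoint(save_dir):
    """Return the path of the highest-numbered 'checkpoint-<N>.data' file, or None.

    Assumption: the file name pattern matches the one quoted in the question;
    spaces between the dash and the number are tolerated.
    """
    pattern = re.compile(r"^checkpoint-\s*(\d+)\.data$")
    best_idx, best_path = -1, None
    for name in os.listdir(save_dir):
        match = pattern.match(name)
        if match and int(match.group(1)) > best_idx:
            best_idx = int(match.group(1))
            best_path = os.path.join(save_dir, name)
    return best_path


def restore_network(net, save_dir):
    """Load weights from the newest checkpoint in save_dir into net.

    Returns the path that was loaded, or None if no checkpoint was found
    (in which case net keeps its freshly initialised weights).
    Assumption: the checkpoint contains a state dict, not a whole model.
    """
    path = latest_checkpoint(save_dir)
    if path is not None:
        # map_location keeps tensors on the CPU even if they were saved from a GPU
        state = torch.load(path, map_location=lambda storage, loc: storage)
        net.load_state_dict(state)
    return path
```

A call such as restore_network(net, saves_path), placed after the network is created and before the training loop, would be enough to pick up the saved weights (saves_path here stands for whatever directory the training script writes to). Note that this restores only the weights: the replay buffer still starts empty, which is one of the reasons restarting can behave differently from an uninterrupted run.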