Kaixhin / Rainbow

Rainbow: Combining Improvements in Deep Reinforcement Learning

Add ability to resume training #27

Open Kaixhin opened 6 years ago

stringie commented 5 years ago

This is very much needed, as I don't have a machine powerful enough to run training in one go. There needs to be a saved state to come back to.

Kaixhin commented 5 years ago

I was thinking about closing this because it would actually require saving the replay memory, which is about 7GB. Clearly it would still be a useful feature to have, so I'll leave this open in case I or someone else comes up with a nice way of serialising everything.

guydav commented 5 years ago

I've implemented something to this effect by pickling the memory and loading a checkpoint. My code is a little coupled to where and how I store the saved files, but I can try to decouple it and share it, if that would be useful?
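In outline, the approach described above might look roughly like the sketch below. It assumes a hypothetical replay memory object `mem` and a PyTorch agent exposing `online_net`; the file names and helper functions are illustrative, not the exact code from this repository or the PR.

```python
import pickle
import torch

def save_checkpoint(agent, mem, results_dir):
    # Save the network weights needed to rebuild the agent.
    torch.save(agent.online_net.state_dict(), f"{results_dir}/checkpoint.pth")
    # Pickle the replay memory; for a full buffer this file can be several GB.
    with open(f"{results_dir}/memory.pkl", "wb") as f:
        pickle.dump(mem, f)

def load_checkpoint(agent, results_dir):
    # Restore the network weights and the pickled replay memory.
    agent.online_net.load_state_dict(torch.load(f"{results_dir}/checkpoint.pth"))
    with open(f"{results_dir}/memory.pkl", "rb") as f:
        mem = pickle.load(f)
    return mem
```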

Kaixhin commented 5 years ago

@guydav that does sound very useful! Perhaps a --checkpoint-interval flag which, if nonzero, saves a checkpoint in the results directory? Resuming is the trickier part.
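A minimal sketch of that flag, assuming an argparse-based CLI like the one in this repo; the flag name follows the suggestion above, while the training-loop variables (`T`, `dqn`, `mem`, `results_dir`) and the `save_checkpoint` helper are assumptions for illustration.

```python
import argparse

parser = argparse.ArgumentParser(description='Rainbow')
# 0 disables checkpointing; any positive value saves every that many steps.
parser.add_argument('--checkpoint-interval', type=int, default=0,
                    help='Steps between checkpoints of the agent and replay memory (0 to disable)')
args = parser.parse_args()

# Inside the training loop, with T the current environment step:
# if args.checkpoint_interval != 0 and T % args.checkpoint_interval == 0:
#     save_checkpoint(dqn, mem, results_dir)  # write into the results directory
```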

guydav commented 5 years ago

See https://github.com/Kaixhin/Rainbow/pull/58 for the implementation details. I've made checkpointing enabled by default, at the same interval as evaluation, but it doesn't have to be the default if you'd prefer otherwise.

Resuming isn't too hard either; I handled it through a few flags. Let me know what you think.
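The PR contains the actual implementation; as a rough illustration of resuming via flags, continuing the parser sketch above, something along these lines would work (the flag names and the `load_checkpoint` helper are assumptions, not necessarily what the PR does):

```python
parser.add_argument('--resume', action='store_true',
                    help='Resume training from the latest checkpoint in the results directory')
parser.add_argument('--memory-file', type=str, default='memory.pkl',
                    help='File name of the pickled replay memory inside the results directory')

# At start-up, before the training loop:
# if args.resume:
#     mem = load_checkpoint(dqn, results_dir)   # restores weights and replay memory
# else:
#     mem = ReplayMemory(args, args.memory_capacity)  # start from an empty buffer
```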