Eclectic-Sheep / sheeprl

Distributed Reinforcement Learning accelerated by Lightning Fabric
https://eclecticsheep.ai
Apache License 2.0

Save entire model on checkpoint #72

Closed: belerico closed this issue 11 months ago

belerico commented 1 year ago

Right now sheeprl saves the state_dict of every model on checkpoint. This can be problematic when, for example, the model definition changes between the time the checkpoint is saved and the time it is loaded. As specified in the Fabric docs, it is better to save the entire models in the checkpoint and restore them in full when needed.
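To make the failure mode concrete, here is a minimal sketch in plain PyTorch. It is not sheeprl's or Fabric's actual checkpointing code. The factories `make_v1` and `make_v2` are hypothetical and stand in for a model definition before and after a change. The sketch assumes a recent PyTorch release in which `torch.load` accepts the `weights_only` argument.

```python
import io

import torch
from torch import nn


def make_v1() -> nn.Module:
    # Hypothetical model definition at the time the checkpoint is saved.
    return nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))


def make_v2() -> nn.Module:
    # Hypothetical later revision: the hidden width changed from 8 to 16.
    return nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))


model = make_v1()

# Option A: checkpoint only the state_dict (parameter tensors keyed by name).
buf_a = io.BytesIO()
torch.save(model.state_dict(), buf_a)
buf_a.seek(0)
state = torch.load(buf_a)

# Loading into the changed definition fails: the tensor shapes no longer match.
try:
    make_v2().load_state_dict(state)
    state_dict_load_ok = True
except RuntimeError:
    state_dict_load_ok = False

# Option B: checkpoint the whole module object (pickled structure + weights).
buf_b = io.BytesIO()
torch.save(model, buf_b)
buf_b.seek(0)
# weights_only=False is needed to unpickle a full nn.Module; only use it
# on checkpoints from a trusted source.
restored = torch.load(buf_b, weights_only=False)

# The restored module reproduces the original outputs without rebuilding
# the architecture by hand.
x = torch.randn(3, 4)
same_output = torch.allclose(model(x), restored(x))

print(state_dict_load_ok, same_output)
```

One caveat with this approach: pickling a whole module stores a reference to its class rather than the class source, so the classes used by the model still need to be importable when the checkpoint is loaded.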

belerico commented 11 months ago

Closed by #95