I noticed that the train() method has a parameter checkpoint_to_load, which is currently unused in the code. I assume this would be part of a solution for saving and loading the current state of the model.
Would you be able to give any pointers on how to go about implementing something like this? I would like to be able to run this code on a machine that is not running 24/7, so being able to save and resume the training would be a great feature.
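To make the question concrete, here is roughly the shape I have in mind. This is only a minimal, framework-agnostic Python sketch, not the project's actual code: the `train()` below is a hypothetical stand-in, the `state` dictionary with its `step` and `weights` fields is a placeholder for the real model and optimizer state, and `checkpoint_path`, `save_every`, `save_checkpoint`, and `load_checkpoint` are names I made up for illustration. Only `checkpoint_to_load` comes from the existing signature.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write `state` (any picklable object) to `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file first, then rename it into place, so an
    # interruption mid-write does not leave a truncated checkpoint behind.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Read back a state object written by save_checkpoint."""
    with open(path, "rb") as f:
        return pickle.load(f)


def train(total_steps, checkpoint_path, checkpoint_to_load=None, save_every=100):
    # Hypothetical stand-in for the real train() method.
    if checkpoint_to_load is not None and os.path.exists(checkpoint_to_load):
        # Resume from the saved step counter and model state.
        state = load_checkpoint(checkpoint_to_load)
    else:
        state = {"step": 0, "weights": 0.0}

    while state["step"] < total_steps:
        state["weights"] += 0.5  # placeholder for one real training step
        state["step"] += 1
        # Save periodically, and once more when training finishes.
        if state["step"] % save_every == 0 or state["step"] == total_steps:
            save_checkpoint(checkpoint_path, state)
    return state
```

The idea would be to call `train(..., checkpoint_to_load=path)` after a restart so the loop continues from the saved step instead of from zero. In a real setup the saved state would presumably also need the optimizer state and any random seeds, which this sketch leaves out.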