Hi, do you know what the best way to resume training from a previous checkpoint would be? Let's assume I am training for 100k steps but I have a 24-hour time limit, and I just have the following checkpoints available:
ls checkpoints/pretrain
vanilla_11081_12.0%.pth vanilla_11081_25.0%.pth vanilla_11081_50.0%.pth
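To make the setup concrete, here is a rough sketch of how I would pick the checkpoint to resume from and work out how many steps are left. It assumes the percentage in the file name is the fraction of the 100k steps already completed, which is my guess and not something I have confirmed. The helper names (progress_percent, latest_checkpoint, remaining_steps) are made up for illustration. It only looks at file names and does not load any weights.

```python
import re

TOTAL_STEPS = 100_000  # total training steps for the run described above

# Assumption: file names end in "_<percent>%.pth", where <percent> is the
# share of TOTAL_STEPS already completed when the checkpoint was written.
CKPT_PATTERN = re.compile(r"_(\d+(?:\.\d+)?)%\.pth$")


def progress_percent(name: str) -> float:
    """Extract the percentage embedded in a checkpoint file name."""
    match = CKPT_PATTERN.search(name)
    if match is None:
        raise ValueError(f"unrecognised checkpoint name: {name}")
    return float(match.group(1))


def latest_checkpoint(names: list[str]) -> str:
    """Return the checkpoint with the highest progress percentage."""
    return max(names, key=progress_percent)


def remaining_steps(name: str, total_steps: int = TOTAL_STEPS) -> int:
    """Steps still to run if training resumes from this checkpoint."""
    done = round(total_steps * progress_percent(name) / 100)
    return total_steps - done


names = [
    "vanilla_11081_12.0%.pth",
    "vanilla_11081_25.0%.pth",
    "vanilla_11081_50.0%.pth",
]
latest = latest_checkpoint(names)
print(latest, remaining_steps(latest))
```

If the percentage means something else (for example, progress within an epoch), the remaining-step count would need to come from a step counter stored in the checkpoint instead.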
Given that the generator and discriminator are instantiated as separate models, do we point them both to the same .pth file? Also, I believe the .from_pretrained() method requires a single config.json, so how do we merge the two configs, if that is necessary? Thanks