sdatkinson / neural-amp-modeler

Neural network emulator for guitar amplifiers.

[FEATURE] Allow continuing training from a checkpoint #372

Closed by hnguyen0428 10 months ago

hnguyen0428 commented 10 months ago

I would love to be able to continue training from where the last run left off. I always find it frustrating when I train for a certain number of epochs, look at the logs, and realize that the model has not yet converged. It would be great to be able to continue training a model by loading a checkpoint file or a NAM model and picking up from the last best epoch instead of having to restart training from epoch 0.

sdatkinson commented 10 months ago

The CLI trainer supports this in two different ways (minimal examples follow the list):

  1. Provide the checkpoint path under the "checkpoint_path" key at the top level of the model config JSON. This restarts training using the weights contained in the checkpoint; under the hood it calls the LightningModule.load_from_checkpoint() class method to initialize the model.
  2. Provide the checkpoint path under the "ckpt_path" key at the top level of the learning config JSON. This resumes training from the full state in the checkpoint (model weights and optimizer state variables) using the Lightning Trainer's checkpointing functionality.
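
For concreteness, here is a minimal sketch of each option. The checkpoint path is a placeholder; merge the key into your existing config alongside its other top-level keys. Only the "checkpoint_path"/"ckpt_path" key names come from the trainer itself.

Option 1, added at the top level of the model config JSON:

```json
{
    "checkpoint_path": "path/to/checkpoints/last.ckpt"
}
```

Option 2, added at the top level of the learning config JSON:

```json
{
    "ckpt_path": "path/to/checkpoints/last.ckpt"
}
```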

These are essentially thin wrappers around Lightning's checkpointing functionality, so have a look there for more info.
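
To make the difference concrete, here is a minimal PyTorch Lightning sketch of the two code paths these options wrap. The model class, dataloader, and paths are placeholders for illustration, not NAM's actual internals:

```python
import torch
import pytorch_lightning as pl
from torch.utils.data import DataLoader, TensorDataset

# Placeholder LightningModule standing in for NAM's model class.
class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Linear(1, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.mse_loss(self.net(x), y)

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters())

loader = DataLoader(TensorDataset(torch.randn(64, 1), torch.randn(64, 1)))

# Option 1 ("checkpoint_path"): initialize the model from the checkpoint's
# weights, then train fresh -- the epoch counter and optimizer state start over.
model = LitModel.load_from_checkpoint("path/to/last.ckpt")
pl.Trainer(max_epochs=100).fit(model, loader)

# Option 2 ("ckpt_path"): hand the checkpoint to the Trainer, which restores
# the full training state (weights, optimizer, epoch counter) and resumes.
model = LitModel()
pl.Trainer(max_epochs=100).fit(model, loader, ckpt_path="path/to/last.ckpt")
```

Note the practical difference: option 1 is useful when you want to reuse learned weights but start a new training schedule, while option 2 picks up mid-run as if training had never stopped.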

I'm going to close this since the solution is already implemented; otherwise, you may want to check out #210.