MIC-DKFZ / nnDetection

nnDetection is a self-configuring framework for 3D (volumetric) medical object detection which can be applied to new data sets without manual intervention. It includes guides for 12 data sets that were used to develop and evaluate the performance of the proposed method.
Apache License 2.0
530 stars 89 forks

[Question] Extending the number of training epochs #247

Open Rajesh-ParaxialTech opened 1 month ago

Rajesh-ParaxialTech commented 1 month ago

Hello

Suppose I have started training an nnDetection model with the number of epochs fixed to 100. If I later want to continue training beyond 100 epochs, can I update the code and resume the training with the option mode=resume, without losing the weights the model learned during the first 100 epochs?

Thanking you

Rajesh

mibaumgartner commented 1 month ago

Hey @Rajesh-ParaxialTech ,

Yes, it is possible to resume the training by switching the mode and specifying the new number of epochs. There are a few caveats, though: (1) The learning rate schedule depends on the total number of epochs, so overwriting the number of epochs changes the schedule. The learning rate may have been quite low towards the end of the first run, but the resumed run will start with a higher learning rate, which then decreases towards the end again. (2) The training ends with SWA (stochastic weight averaging), which periodically increases and decreases the learning rate before averaging the model weights. If you restart after SWA, the resulting model will also differ from one obtained by a single long training run.
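To illustrate caveat (1), here is a minimal sketch of how a schedule tied to the total epoch count jumps when that total is raised. It assumes a plain polynomial decay; the function `poly_lr`, the initial learning rate of 0.01 and the exponent of 0.9 are illustrative placeholders, not values taken from nnDetection, whose actual schedule (including any warm-up) may differ.

```python
def poly_lr(epoch, max_epochs, initial_lr=0.01, exponent=0.9):
    """Hypothetical polynomial decay: initial_lr at epoch 0, approaching 0 at max_epochs."""
    progress = min(epoch, max_epochs) / max_epochs
    return initial_lr * (1.0 - progress) ** exponent


# Learning rate in the last epoch of the original 100-epoch run.
lr_end_first_run = poly_lr(99, max_epochs=100)

# Learning rate in the first epoch after resuming with the total raised to 200.
lr_start_resumed = poly_lr(100, max_epochs=200)

print(f"end of first run:     {lr_end_first_run:.6f}")
print(f"start of resumed run: {lr_start_resumed:.6f}")
```

Under these assumptions the resumed run starts at a much higher learning rate than the one the first run ended on, so the weights are kept but the optimisation trajectory is not the same as in a single 200-epoch run.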

Best, Michael

github-actions[bot] commented 1 week ago

This issue is stale because it has been open for 30 days with no activity.