MIC-DKFZ / nnUNet


Initial learning rate stuck in v2 #2232

Closed · thegooby closed this issue 5 months ago

thegooby commented 5 months ago

Hello, I've recently encountered a problem. When I run something such as `nnUNetv2_train 403 3d_fullres all --npz`, the initial learning rate printed at the start of training no longer matches what I have set in `.../nnUNet/nnunetv2/training/nnUNetTrainer/nnUNetTrainer.py`. Just a few days ago I had no issues modifying `self.initial_lr`, but now it is stuck at 1e-2 no matter what value I set. I've tried deleting the `__pycache__` folder in the folder that holds the trainer, closing and reopening the terminal after changing `self.initial_lr`, and restarting the computer; I have not updated any packages. None of these fixed the problem, and this is the first time I've run into this issue with v2. The problem persists when using datasets numbered above 500.

I'm not sure if this is a recent issue, but I get this warning when executing that command:

```
.../anaconda3/lib/python3.11/site-packages/torch/optim/lr_scheduler.py:28: UserWarning: The verbose parameter is deprecated. Please use get_last_lr() to access the learning rate.
  warnings.warn("The verbose parameter is deprecated. Please use get_last_lr() "
```
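
This warning comes from PyTorch's `torch.optim.lr_scheduler` module rather than from nnU-Net, and it only concerns the deprecated `verbose` flag; it does not by itself change the learning rate. As a minimal plain-PyTorch sketch (not nnU-Net code), the rate a scheduler currently applies can be read with `get_last_lr()`:

```python
import torch

# A toy model/optimizer/scheduler purely to demonstrate get_last_lr();
# nnU-Net uses its own trainer and scheduler classes.
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)

print(scheduler.get_last_lr())  # -> [0.01]
```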
FabianIsensee commented 5 months ago

Hey, I just verified with `nnUNetTrainerVanillaAdam3en4` (https://github.com/MIC-DKFZ/nnUNet/blob/d12a0c1c02c18b23918fe3d0cf94f56086ef3e85/nnunetv2/training/nnUNetTrainer/variants/optimizer/nnUNetTrainerAdam.py#L38) that changing the initial learning rate still works. There is no reason why it shouldn't. Please make sure to set your initial lr AFTER calling `super().__init__(...)`!

Best,
Fabian
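
A sketch of the pattern Fabian describes, modelled on the linked `nnUNetTrainerVanillaAdam3en4` variant (the trainer name `nnUNetTrainerCustomLR` here is made up, and the exact `__init__` signature may differ between nnU-Net versions): the parent constructor sets `self.initial_lr`, so any custom value assigned before `super().__init__(...)` gets overwritten.

```python
import torch

from nnunetv2.training.nnUNetTrainer.nnUNetTrainer import nnUNetTrainer


class nnUNetTrainerCustomLR(nnUNetTrainer):  # hypothetical example trainer
    def __init__(self, plans: dict, configuration: str, fold: int, dataset_json: dict,
                 unpack_dataset: bool = True, device: torch.device = torch.device('cuda')):
        # The parent constructor initializes self.initial_lr (1e-2 by default) ...
        super().__init__(plans, configuration, fold, dataset_json, unpack_dataset, device)
        # ... so the custom value must be assigned AFTER it has run.
        self.initial_lr = 3e-4
```

Such a variant can then be selected on the command line with `-tr nnUNetTrainerCustomLR`.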

thegooby commented 5 months ago

Thank you for looking into this. In April, I added `export PATH=~/anaconda3/bin:$PATH` to the end of my `.bashrc` for another project. I'm not sure why it took so long for the change to take effect, but everything works properly now with that line commented out.
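
For anyone hitting the same symptom: prepending `~/anaconda3/bin` to `PATH` plausibly made the shell resolve `python` and `nnUNetv2_train` from a different environment with its own installed copy of nnunetv2, so edits to the cloned repository were silently ignored. One way to check which installation the active interpreter actually imports:

```python
# Prints the file that `import nnunetv2` resolves to; if this points into
# site-packages instead of your cloned repository, edits to the clone
# cannot affect training.
import nnunetv2
print(nnunetv2.__file__)
```

If it points into `site-packages`, reinstalling the clone as an editable install (`pip install -e .` from the repository root) makes the repository files the ones that actually run.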