Hi!
I am looking to recreate some of the results of the paper and am wondering what optimizer and learning rate were used when training LightM-UNet on the LiTS dataset.
In the paper it says:
"SGD was employed as the optimizer, initialized with a learning rate of 1e-4. The PolyLRScheduler was used as the scheduler, and a total of 100 epochs were trained."
This appears to be somewhat close to the defaults used in nnU-Net here.
Could you please clarify which optimizer and hyperparameters were used?
Also, if anyone has a good intuition for why you would choose one over the other when deciding between Adam and SGD, I would love to hear it!
One note: when looking through the nnUNetTrainerLightMUNet.py file, it instead appears that Adam was used, rather than the SGD described in the paper.
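For anyone else comparing the two configurations: the paper's described schedule (lr 1e-4, PolyLRScheduler, 100 epochs) corresponds to a simple polynomial decay. Here is a minimal, dependency-free sketch of that decay; note the exponent 0.9 is nnU-Net's default for its PolyLRScheduler, and whether LightM-UNet kept that value is my assumption, not something confirmed by the paper.

```python
def poly_lr(initial_lr: float, epoch: int, max_epochs: int, exponent: float = 0.9) -> float:
    """Polynomial LR decay of the form used by nnU-Net's PolyLRScheduler:
    lr(epoch) = initial_lr * (1 - epoch / max_epochs) ** exponent.

    exponent=0.9 is nnU-Net's default; it is an assumption here that
    LightM-UNet did not change it.
    """
    return initial_lr * (1 - epoch / max_epochs) ** exponent

# Settings as described in the paper's text: lr 1e-4, 100 epochs.
initial_lr, max_epochs = 1e-4, 100

print(poly_lr(initial_lr, 0, max_epochs))   # epoch 0: full lr, 1e-4
print(poly_lr(initial_lr, 50, max_epochs))  # halfway: ~5.36e-5
print(poly_lr(initial_lr, 99, max_epochs))  # final epoch: ~1.58e-6
```

Whichever optimizer was actually used (SGD per the paper, or Adam per the trainer file), this schedule would apply the same multiplicative decay on top of the initial learning rate.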