MrBlankness / LightM-UNet

PyTorch implementation of "LightM-UNet: Mamba Assists in Lightweight UNet for Medical Image Segmentation"
https://arxiv.org/abs/2403.05246
Apache License 2.0

What optimizer is used? #19

Open tureture opened 6 months ago

tureture commented 6 months ago

Hi! I am looking to recreate some of the results from the paper and am wondering what optimizer and learning rate were used when training LightM-UNet on the LiTS dataset.

In the paper it says:

SGD was employed as the optimizer, initialized with a learning rate of 1e-4. The PolyLRScheduler was used as the scheduler, and a total of 100 epochs were trained.

This appears to be somewhat close to the defaults used in nnU-Net (see the sketch below).
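For concreteness, here is a minimal plain-PyTorch sketch of how I read that description. The momentum/nesterov settings and the poly exponent are my assumptions (nnU-Net-style defaults); the paper only states SGD, lr 1e-4, PolyLRScheduler, and 100 epochs:

```python
import torch
from torch.optim.lr_scheduler import PolynomialLR

model = torch.nn.Conv3d(1, 2, kernel_size=3)  # stand-in for the LightM-UNet network

# SGD with the stated initial lr of 1e-4; momentum/nesterov values are
# assumed nnU-Net-style defaults, not something the paper specifies.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.99, nesterov=True)

# Poly decay over the stated 100 epochs; power=0.9 is the exponent
# nnU-Net's PolyLRScheduler conventionally uses, assumed here.
scheduler = PolynomialLR(optimizer, total_iters=100, power=0.9)

for epoch in range(100):
    # ... run one training epoch here ...
    scheduler.step()
```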

However, when looking through the nnUNetTrainerLightMUNet.py file, it instead appears that Adam was used.
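That is, the trainer file seems to set up something closer to the following. This is just my sketch with placeholder arguments, not code copied from the repository:

```python
import torch

model = torch.nn.Conv3d(1, 2, kernel_size=3)  # stand-in for the LightM-UNet network

# Adam at the same initial lr; any further arguments (betas, weight decay)
# are left at PyTorch defaults here, not read from nnUNetTrainerLightMUNet.py.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```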

Could you please clarify what optimizer and hyperparameters were used? Also, if anyone has a good intuition for why you would choose one over the other between Adam and SGD, I would love to hear it!