uni-medical / SAM-Med3D

SAM-Med3D: An Efficient General-purpose Promptable Segmentation Model for 3D Volumetric Medical Image
Apache License 2.0

Loss value does not decrease #28

Closed · nekonekoni7 closed this issue 4 months ago

nekonekoni7 commented 7 months ago

When I am training, the loss value fluctuates within a range and does not decrease even after many epochs. The following warning was also printed during training:

> UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`. Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
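
(For reference, the call order that the warning asks for looks roughly like the sketch below; the model, optimizer, and scheduler here are placeholders, not the actual SAM-Med3D training loop.)

```python
import torch

# Placeholder model/optimizer/scheduler, only to illustrate the call order.
model = torch.nn.Linear(4, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)

for epoch in range(100):
    for x, y in [(torch.randn(2, 4), torch.randn(2, 1))]:  # dummy batch
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()
        optimizer.step()   # update the weights first ...
    scheduler.step()       # ... then advance the LR schedule (once per epoch here)
```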

blueyo0 commented 7 months ago

hi, nekonekoni7

This warning will not cause any side effects, so you can ignore it for now. As for the fluctuating loss, we suggest checking your data first. Since the learning rate and the other hyper-parameters have been validated, inappropriate data processing can prevent the model from learning anything useful.

Here is the link to my preprocessing scripts. I hope this helps.
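
(As a quick sanity check before training, something along the lines of the sketch below can surface obvious preprocessing problems; the file names are hypothetical and SimpleITK is only an assumed choice of I/O library.)

```python
import SimpleITK as sitk
import numpy as np

# Hypothetical paths; replace with one of your preprocessed image/label pairs.
image = sitk.ReadImage("case_0001_image.nii.gz")
label = sitk.ReadImage("case_0001_label.nii.gz")

print("image spacing:", image.GetSpacing())   # should match the target spacing after resampling
print("label spacing:", label.GetSpacing())
print("image size   :", image.GetSize())
print("label values :", np.unique(sitk.GetArrayFromImage(label)))  # a binary mask should give [0 1]
```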

harpergith commented 6 months ago

My training loss also did not decrease, even though I followed the preprocessing steps in the provided preprocessing scripts. As far as I can tell, the preprocessing in those scripts is just resampling the spacing to 1.5 × 1.5 × 1.5 mm. Am I correct? If so, what could be causing the training-loss issue?
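
(For context, resampling to 1.5 mm isotropic spacing can be sketched roughly as below with SimpleITK; this only illustrates the idea and is not copied from the repository's preprocessing scripts.)

```python
import SimpleITK as sitk

def resample_to_spacing(img, new_spacing=(1.5, 1.5, 1.5), is_label=False):
    """Resample an image or label to the given spacing (nearest-neighbour for labels)."""
    old_spacing = img.GetSpacing()
    old_size = img.GetSize()
    new_size = [int(round(osz * osp / nsp))
                for osz, osp, nsp in zip(old_size, old_spacing, new_spacing)]
    return sitk.Resample(
        img,
        new_size,
        sitk.Transform(),                                   # identity transform
        sitk.sitkNearestNeighbor if is_label else sitk.sitkLinear,
        img.GetOrigin(),
        new_spacing,
        img.GetDirection(),
        0,                                                  # default (padding) value
        img.GetPixelID(),
    )
```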

blueyo0 commented 6 months ago

I think the issue may lie in the mask. Could you provide more details about your task and the mask you used?

harpergith commented 6 months ago

I'm using the MSD07_pancreas dataset. The dataset contains two foreground classes, i.e., pancreas and tumor. For the mask, I extract only the tumor class to get a {0, 1} mask. I have tried different batch sizes: with the default batch size of 10, the loss got stuck around 17, but if I set the batch size to 2, the loss decreases from 15 to 2.
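
(For reference, extracting a tumor-only binary mask can look like the sketch below; it assumes the usual MSD label convention of 0 = background, 1 = pancreas, 2 = tumor, and the file names are hypothetical.)

```python
import SimpleITK as sitk

label = sitk.ReadImage("pancreas_001_label.nii.gz")      # hypothetical MSD07 label file
arr = sitk.GetArrayFromImage(label)

tumor = (arr == 2).astype("uint8")                       # keep only the tumor class -> {0, 1}
tumor_img = sitk.GetImageFromArray(tumor)
tumor_img.CopyInformation(label)                         # preserve spacing/origin/direction
sitk.WriteImage(tumor_img, "pancreas_001_tumor.nii.gz")
```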

ramdhan1989 commented 6 months ago

I also ran into this problem with my own dataset. Does anyone have a trick to improve the training?