Closed chenzhang9476 closed 1 month ago
Hey, let me tag @TaWald and @saikat-roy here since they have the most experience with this kind of stuff. My 2 cents:
Best, Fabian
Hey @chenzhang9476. Just following up on @FabianIsensee here. In our experience, when we trained SwinUNet using nnUNet as the training framework, we had to reduce the learning rate to 1e-4. We used AdamW as the optimizer instead of SGD, but my guess is that you would probably need to reduce the learning rate with SGD as well.
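As a rough illustration of what the lower starting rate means over a full run, here is a minimal sketch of a polynomial learning-rate decay. It assumes a schedule of the form `initial_lr * (1 - step / max_steps) ** 0.9`, a 1000-epoch run, and an SGD starting rate of 1e-2; these values are assumptions for the example and should be checked against your own trainer. The function name `poly_lr` is hypothetical.

```python
def poly_lr(initial_lr: float, step: int, max_steps: int, exponent: float = 0.9) -> float:
    """Polynomial decay: initial_lr * (1 - step / max_steps) ** exponent."""
    return initial_lr * (1 - step / max_steps) ** exponent


# Assumed values for illustration only: 1e-2 as the SGD starting rate,
# 1e-4 as the reduced rate suggested in this thread, 1000 epochs in total.
MAX_EPOCHS = 1000
for name, initial_lr in (("sgd_start", 1e-2), ("reduced", 1e-4)):
    schedule = [poly_lr(initial_lr, epoch, MAX_EPOCHS) for epoch in (0, 250, 500, 750, 999)]
    print(name, ["%.2e" % lr for lr in schedule])
```

Under this schedule, lowering the starting rate scales the whole curve by the same factor, so the reduced run stays 100 times smaller at every epoch.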
Thank you.
But I’m confused about the deep supervision.
Hey @chenzhang9476. Can you clarify what you are confused about? Are you trying to switch off deep supervision, or are you trying to use it without success?
Is deep supervision compatible with other, newer architectures like Swin-Unet?
Hey @chenzhang9476. It is compatible in principle, as long as you configure the underlying architecture/model to provide deep-supervision-like outputs to the trainer.
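To make "deep-supervision-like outputs" more concrete, here is a minimal sketch of the output contract. It assumes the trainer expects a list of predictions ordered from full resolution to coarser resolutions, one per deep supervision scale, and weights their losses by halving factors. All function names (`expected_ds_shapes`, `ds_loss_weights`, `check_ds_outputs`) and the example patch size and scales are hypothetical and not part of nnUNet.

```python
from typing import List, Sequence, Tuple


def expected_ds_shapes(patch_size: Sequence[int],
                       scales: Sequence[Sequence[float]]) -> List[Tuple[int, ...]]:
    """Spatial shape expected for each output, one per deep supervision scale."""
    return [tuple(int(round(p * s)) for p, s in zip(patch_size, scale)) for scale in scales]


def ds_loss_weights(num_outputs: int) -> List[float]:
    """Halving weights (1, 1/2, 1/4, ...) normalised to sum to 1 (assumed scheme)."""
    weights = [1 / (2 ** i) for i in range(num_outputs)]
    total = sum(weights)
    return [w / total for w in weights]


def check_ds_outputs(output_shapes: Sequence[Tuple[int, ...]],
                     patch_size: Sequence[int],
                     scales: Sequence[Sequence[float]]) -> None:
    """Raise ValueError if the outputs do not match the expected list of shapes."""
    expected = expected_ds_shapes(patch_size, scales)
    if len(output_shapes) != len(expected):
        raise ValueError(f"expected {len(expected)} outputs, got {len(output_shapes)}")
    for i, (got, want) in enumerate(zip(output_shapes, expected)):
        if tuple(got) != want:
            raise ValueError(f"output {i}: expected spatial shape {want}, got {tuple(got)}")


# Hypothetical 2D example: a 224x224 patch with three supervision scales.
patch = (224, 224)
scales = [(1.0, 1.0), (0.5, 0.5), (0.25, 0.25)]
print(expected_ds_shapes(patch, scales))
print([round(w, 4) for w in ds_loss_weights(len(scales))])
check_ds_outputs([(224, 224), (112, 112), (56, 56)], patch, scales)
```

A network that returns only a single full-resolution tensor would fail this kind of check, so for such a model deep supervision would need to be switched off in the trainer or extra output heads added to the decoder.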
Are you trying to do this for SwinUnet? Can you tell us where you are stuck?
Hi, all.
I know this isn't your obligation, but I just want to post this and see whether any of you have tried something similar before. I'm trying to use the nnUNet framework with Swin-Unet, which is a transformer-based network. This is what I encountered. As you can see, all the losses become NaN and the pseudo Dice is NaN. I tried several times and could not change this. I simply put Swin-Unet into the build_network_architecture function (it is only used for training; data conversion still uses the U-Net framework, otherwise it does not succeed).
Thanks for any advice.