Open anirbala98 opened 11 months ago
Keeping all other parameters unchanged, if I switch from the AdamW optimizer to SGD for a SegFormer model (modified to use a Swin backbone), the results are very poor. With AdamW I get decent results. Does SegFormer only work with AdamW?