isaaccorley / torchseg

Segmentation models with pretrained backbones. PyTorch.
MIT License
104 stars 8 forks

Integrate schedule free AdamW #18

Open JulienMaille opened 7 months ago

JulienMaille commented 7 months ago

Have you seen this optimizer? Has anyone given it a try? It seems a bit less straightforward to integrate into torchseg, since most of our decoders use BatchNorm. https://github.com/facebookresearch/schedule_free?tab=readme-ov-file
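For context on the BatchNorm concern: the schedule_free README notes that models with BatchNorm need their running statistics recomputed at the optimizer's evaluation weights before validation. Below is a minimal sketch of that step, not torchseg code. It assumes the optimizer exposes `.eval()` (as the schedule-free optimizers are documented to do) and that `batches` yields input tensors; `recalibrate_batchnorm` and `num_batches` are hypothetical names chosen for illustration.

```python
import itertools

import torch
from torch import nn


def recalibrate_batchnorm(model, optimizer, batches, num_batches=50):
    """Refresh BatchNorm running stats at the optimizer's eval weights.

    Hypothetical helper. Assumes `optimizer.eval()` switches the parameters
    to the evaluation point and `batches` is an iterable of input tensors.
    """
    model.train()     # BatchNorm only updates running stats in train mode
    optimizer.eval()  # assumed: swaps in the weights used for evaluation
    with torch.no_grad():  # forward passes only, no gradients needed
        for x in itertools.islice(batches, num_batches):
            model(x)
    model.eval()      # ready for validation with refreshed statistics
```

This would be called right before each validation pass, with `model.train()` and `optimizer.train()` called again before training resumes.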

notprime commented 7 months ago

Hi @JulienMaille,

I'll look into this as soon as I get back home, or maybe @isaaccorley can take a look in the meantime. Btw, you're right. Since the encoders use timm backbones, you can now select a specific norm layer for them, but the decoders still only support BatchNorm.

We already had the idea of supporting different norm layers for the decoders as well. We just have to think about the best way to do that, because different norm layers obviously take different parameters (as briefly outlined here).
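To illustrate the parameter mismatch, here is a rough sketch of one possible approach, a small factory that maps a name to a norm layer and forwards layer-specific keyword arguments. This is not the torchseg API: `make_norm`, `conv_block`, and the string names are hypothetical, and only standard `torch.nn` layers are used.

```python
import torch
from torch import nn


def make_norm(name, num_channels, **kwargs):
    """Build a 2D norm layer by name (hypothetical helper, not torchseg API)."""
    if name == "batchnorm":
        return nn.BatchNorm2d(num_channels, **kwargs)
    if name == "groupnorm":
        # GroupNorm also needs a group count that divides num_channels
        num_groups = kwargs.pop("num_groups", 32)
        return nn.GroupNorm(num_groups, num_channels, **kwargs)
    if name == "instancenorm":
        return nn.InstanceNorm2d(num_channels, **kwargs)
    if name == "identity":
        return nn.Identity()
    raise ValueError(f"Unknown norm layer: {name}")


def conv_block(in_ch, out_ch, norm="batchnorm", **norm_kwargs):
    """Conv -> norm -> ReLU block with a configurable norm layer."""
    return nn.Sequential(
        # the conv bias is redundant when a norm layer follows it
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1,
                  bias=(norm == "identity")),
        make_norm(norm, out_ch, **norm_kwargs),
        nn.ReLU(inplace=True),
    )


# Example: a decoder-style block that avoids BatchNorm entirely
block = conv_block(3, 16, norm="groupnorm", num_groups=4)
out = block(torch.randn(2, 3, 8, 8))
```

Passing the extra arguments through `**norm_kwargs` keeps the block signature the same for every norm type, at the cost of validating those arguments only when the layer is constructed.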

Once we implement this functionality, integrating schedule-free AdamW should be easy.