google-research / big_vision

Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
Apache License 2.0

implement_gsam_jax #4

Closed juntang-zhuang closed 2 years ago

juntang-zhuang commented 2 years ago

Implement the GSAM algorithm proposed in "Surrogate gap minimization improves sharpness-aware training" (ICLR 2022), which is an improvement over SAM (Sharpness-Aware Minimization).

When `config.rho_max == config.rho_min` and `config.alpha == 0.0`, the GSAM algorithm reduces to SAM.
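To make the reduction concrete, here is a minimal, hypothetical sketch of one GSAM gradient step in JAX (not the code in this pull request; `loss_fn`, `gsam_grad`, and the argument names are illustrative). GSAM perturbs the weights along the normalized clean gradient, then descends the perturbed loss while ascending the surrogate gap via the component of the clean gradient orthogonal to the perturbed one; in the paper `rho` is additionally scheduled between `rho_min` and `rho_max`. With `alpha = 0` and a fixed `rho`, the update is exactly the perturbed gradient, i.e. plain SAM.

```python
import jax
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree

def gsam_grad(loss_fn, params, batch, rho, alpha, eps=1e-12):
    """One GSAM update direction; alpha == 0 with fixed rho recovers SAM."""
    p_flat, unravel = ravel_pytree(params)
    g_flat, _ = ravel_pytree(jax.grad(loss_fn)(params, batch))  # clean gradient
    # Ascend to the (approximate) worst-case point inside the rho-ball.
    params_adv = unravel(p_flat + rho * g_flat / (jnp.linalg.norm(g_flat) + eps))
    ga_flat, _ = ravel_pytree(jax.grad(loss_fn)(params_adv, batch))  # perturbed gradient
    # Decompose the clean gradient into parallel/orthogonal parts w.r.t. g_adv.
    ga_unit = ga_flat / (jnp.linalg.norm(ga_flat) + eps)
    g_perp = g_flat - jnp.dot(g_flat, ga_unit) * ga_unit
    # Descend the perturbed loss, ascend the surrogate gap (scaled by alpha).
    return unravel(ga_flat - alpha * g_perp)
```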

akolesnikoff commented 2 years ago

Hi,

Thank you for the contribution. As stated in the readme, we normally do not accept external contributions, but we are happy to make an exception for open-source implementations of published projects developed in big_vision.

However, according to the codebase principles, project-specific code should not add complexity to the core library parts, such as the main train loop. Thus, standalone projects are expected to fork the main train loop into big_vision/trainers/proj/<project name>/... and apply the necessary modifications there. We plan to submit an example of how this works soon (~2 weeks from now). Maybe you could wait for the example and then update this pull request accordingly?

juntang-zhuang commented 2 years ago

Thanks a lot for the clarification! I will reformat and resubmit later according to the examples.

lucasb-eyer commented 2 years ago

hey, we now have an example of a project-specific trainer here: https://github.com/google-research/big_vision/tree/main/big_vision/trainers/proj/distill

If you are still interested in submitting gsam (we would like it!), could you sync to head and instead of modifying the core train.py, fork it into trainers/proj/gsam/train.py and do the modifications there?

Sorry for the delay on our side!

juntang-zhuang commented 2 years ago

Thanks a lot for the example! I have moved all changes to big_vision/trainers/proj/gsam, please let me know if it looks good.

lucasb-eyer commented 2 years ago

Also, once you're done, could you squash all the commits into just a single one?