srijandas07 opened this issue 3 years ago (status: Open)
Yes, I think so. MoCo v3 requires a relatively large batch size, so LARS is the better choice. For specific implementations, please refer to https://github.com/facebookresearch/simsiam/blob/main/main_lincls.py#L232 or https://github.com/facebookresearch/barlowtwins/blob/main/main.py#L224.
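To make the idea concrete, here is a minimal sketch of one common formulation of the LARS update for a single layer, in plain Python. It is not copied from the linked repositories. The function name `lars_step`, its argument names, and the default values are assumptions chosen for illustration only.

```python
import math


def lars_step(weights, grads, velocity, lr=0.1, weight_decay=0.0,
              momentum=0.9, eta=0.001):
    """One LARS update for a single layer (parameters as flat lists of floats).

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    # Add L2 weight decay to the raw gradient.
    update = [g + weight_decay * w for w, g in zip(weights, grads)]

    # Layer-wise norms of the parameters and of the update.
    w_norm = math.sqrt(sum(w * w for w in weights))
    u_norm = math.sqrt(sum(u * u for u in update))

    # Trust ratio: rescales the update relative to the size of the weights.
    # Fall back to 1.0 when either norm is zero.
    trust = eta * w_norm / u_norm if w_norm > 0 and u_norm > 0 else 1.0

    # Momentum buffer accumulates the rescaled update.
    new_velocity = [momentum * v + trust * u for v, u in zip(velocity, update)]

    # Apply the global learning rate.
    new_weights = [w - lr * v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity
```

Because the update is divided by its own norm, multiplying the gradient of a layer by a constant leaves the step unchanged in this sketch. That per-layer normalization is the usual argument for why LARS tolerates the large learning rates that come with large batch sizes.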
I guess the optimizer should be either LARS or LAMB, depending on the encoder (convolutional or transformer). Is that right?