Closed yutaro-s closed 2 years ago
Hi,
--scale_loss
was a parameter that Barlow Twins had in its original version. It is not mentioned in the paper, but you can check the commit that removed it from their official repo some time ago: https://github.com/facebookresearch/barlowtwins/commit/046eec3c7f5c098b42cdf43e04df332957637d6a.
It basically produces the same results, but they just wanted to remove this extra hyperparameter. We opted to keep it to be consistent with the checkpoints we have for the other datasets.
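To make the role of the scaling factor concrete, here is a minimal NumPy sketch of a Barlow Twins-style loss with a `scale_loss` multiplier. The function name, the default values, and the `1e-8` epsilon are illustrative assumptions, not the actual code from either repo; the point is only that `scale_loss` rescales the whole loss (and hence the gradients) by a constant.

```python
import numpy as np

def barlow_twins_loss(z1, z2, lambd=5e-3, scale_loss=0.025):
    """Sketch of a Barlow Twins-style loss (illustrative, not the repo's code)."""
    # standardize each embedding dimension across the batch
    z1 = (z1 - z1.mean(0)) / (z1.std(0) + 1e-8)
    z2 = (z2 - z2.mean(0)) / (z2.std(0) + 1e-8)
    n, d = z1.shape
    c = z1.T @ z2 / n  # cross-correlation matrix (d x d)
    on_diag = ((1.0 - np.diag(c)) ** 2).sum()
    off_diag = (c ** 2).sum() - (np.diag(c) ** 2).sum()
    # scale_loss multiplies the whole objective by a constant,
    # which is why removing it barely changes the results
    return scale_loss * (on_diag + lambd * off_diag)
```

Since the factor is a plain constant in front of the objective, doubling `scale_loss` exactly doubles the loss value; with an adaptive optimizer like LARS the effect on training is largely absorbed, which matches the maintainers' point that results stay basically the same.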
--exclude_bias_n_norm
excludes those parameters (biases and normalization-layer weights) from the LARS adaptation, as first described in the BYOL paper.
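A minimal sketch of what that exclusion means in practice, assuming the common convention that biases and normalization parameters are 1-D tensors. The function names and the `eta` default are hypothetical, not the actual optimizer code: LARS rescales each layer's update by a trust ratio `eta * ||w|| / ||g||`, and excluded parameters simply skip that rescaling (trust ratio of 1).

```python
import numpy as np

def should_exclude(param):
    # common heuristic: biases and norm-layer weights are 1-D tensors
    return param.ndim <= 1

def lars_trust_ratio(param, grad, eta=0.001, exclude=False):
    """Per-layer LARS trust ratio (illustrative sketch)."""
    if exclude:
        # excluded params fall back to plain SGD-style scaling
        return 1.0
    w_norm = np.linalg.norm(param)
    g_norm = np.linalg.norm(grad)
    if w_norm > 0 and g_norm > 0:
        return eta * w_norm / g_norm
    return 1.0
```

In a training loop one would compute `exclude=should_exclude(p)` per parameter, so small 1-D parameters are not forced into tiny adaptive updates by the layer-wise scaling.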
Thanks for your quick reply.
I understand the --scale_loss
and --exclude_bias_n_norm
options.
Let me confirm the following point: this repo keeps the
--scale_loss
option as in the previous official implementation.

Exactly, our implementation should match the original implementation from Barlow Twins.
I got it! Thank you.
Hi, could you explain the effect of
--scale_loss
? Does the option normalize the gradients of the batch-normalization and bias parameters when used together with --exclude_bias_n_norm
, as explained in the original Barlow Twins repository? Thank you.