Loss function different than in original paper

jramapuram / BYOL

Bootstrap Your Own Latent (BYOL) pytorch implementation using DistributedDataParallel.

MIT License

28 stars, 2 forks

Issue #5, closed by zlenyk 4 years ago

zlenyk commented 4 years ago

Hi! Could you explain why your loss function is `-2 * torch.sum(x * y, dim=-1) / (norm_x * norm_y)`, while the original paper uses "2 - 2(...)"? Is this difference intentional?

Thank you
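
For context on the question: the two forms differ only by the additive constant 2, so they should give the same gradients and differ only in the reported loss value. Whether the repository dropped the constant on purpose is not stated here. Below is a minimal standard-library sketch of that relationship; the function names (`loss_repo_form`, `loss_paper_form`, `numeric_grad`) and the sample vectors are hypothetical and are not taken from the repository, which uses PyTorch tensors.

```python
import math


def cosine(x, y):
    """Cosine similarity between two equal-length lists of floats."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def loss_repo_form(x, y):
    # Hypothetical stand-in for the form quoted in the issue: -2 * <x, y> / (|x| |y|)
    return -2 * cosine(x, y)


def loss_paper_form(x, y):
    # Form quoted from the paper: 2 - 2 * <x, y> / (|x| |y|)
    return 2 - 2 * cosine(x, y)


def numeric_grad(loss_fn, x, y, eps=1e-6):
    """Central finite-difference gradient of loss_fn with respect to x."""
    grads = []
    for i in range(len(x)):
        hi = x[:i] + [x[i] + eps] + x[i + 1:]
        lo = x[:i] + [x[i] - eps] + x[i + 1:]
        grads.append((loss_fn(hi, y) - loss_fn(lo, y)) / (2 * eps))
    return grads


# Arbitrary sample vectors, chosen only for illustration.
x = [0.3, -1.2, 0.8]
y = [1.0, 0.5, -0.4]

offset = loss_paper_form(x, y) - loss_repo_form(x, y)
grad_repo = numeric_grad(loss_repo_form, x, y)
grad_paper = numeric_grad(loss_paper_form, x, y)

print("offset between the two forms:", offset)
print("gradient (repo form): ", grad_repo)
print("gradient (paper form):", grad_paper)
```

Under these assumptions the offset is 2 up to floating-point rounding and the two gradients agree, which suggests the choice affects only the scale of logged loss values, not the optimization itself.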