Loss divide by N instead of N^2?

With a very quick look it seems you are correct, seems it should have been reduce_mean( reduce_mean (...) ), right?

Not sure if this is a newly introduced bug during the reimplementation or not. The good thing is that the batch size (N) is constant for all experiments so qualitatively the results should not change. Changing the reduce_sum to reduce_mean is equal to just altering the weight on CCLP to get the same behaviour essentially.

I will leave this issue open and try to make time for making the change and running experiments to see that everything is consistent as expected.

Thanks a lot for spotting and raising. I may take a while to address this because I need to submit thesis in a week, but I will try to schedule change and experiments asap.

Kamnitsask / ssl_compact_clustering

Loss divide by N instead of N^2? #1