Open jonkoi opened 5 years ago
With a very quick look it seems you are correct, seems it should have been reduce_mean( reduce_mean (...) ), right?
Not sure if this is a newly introduced bug during the reimplementation or not. The good thing is that the batch size (N) is constant for all experiments so qualitatively the results should not change. Changing the reduce_sum to reduce_mean is equal to just altering the weight on CCLP to get the same behaviour essentially.
I will leave this issue open and try to make time for making the change and running experiments to see that everything is consistent as expected.
Thanks a lot for spotting and raising. I may take a while to address this because I need to submit thesis in a week, but I will try to schedule change and experiments asap.
From what I see in this line, the existence of tf.reduce_sum causes the tf.reduce_mean to divide by N instead of N^2 as stated in equation (9) in the paper.