thuml / Xlearn

Transfer Learning Library

Random sampling used in MMD #12

Closed yl-1993 closed 6 years ago

yl-1993 commented 6 years ago

In the MMD layer, random sampling is used in both the forward and backward computations.

https://github.com/thuml/Xlearn/blob/master/caffe/src/caffe/layers/mmd_layer.cu#L85-91
https://github.com/thuml/Xlearn/blob/master/caffe/src/caffe/layers/mmd_layer.cu#L144-150

There may be a problem if the loss and the gradient are computed with different samples. It might be better to use a mask to store the selected samples, as the dropout layer does.
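To make the suggestion concrete, here is a minimal Python sketch (hypothetical, not the library's CUDA code) of a linear-time MMD estimate that caches the randomly selected pair indices in forward and reuses them in backward, the way a dropout layer caches its mask. The class name, RBF kernel, and pairing scheme are all assumptions made for illustration:

```python
import numpy as np

class MMDWithMask:
    """Illustrative sketch: cache the random pairing so that the
    backward gradient matches the forward loss exactly."""

    def __init__(self, gamma=1.0, seed=0):
        self.gamma = gamma              # RBF kernel bandwidth (assumed)
        self.rng = np.random.default_rng(seed)
        self.idx = None                 # cached indices, the "mask"

    def _k(self, a, b):
        # RBF kernel k(a, b) = exp(-gamma * ||a - b||^2)
        return np.exp(-self.gamma * np.sum((a - b) ** 2))

    def _loss(self, xs, xt, ps, pt):
        # Linear-time MMD^2 over the given pairing (ps, pt).
        n = len(ps)
        total = 0.0
        for i in range(0, n - 1, 2):
            s1, s2 = xs[ps[i]], xs[ps[i + 1]]
            t1, t2 = xt[pt[i]], xt[pt[i + 1]]
            total += self._k(s1, s2) + self._k(t1, t2) \
                   - self._k(s1, t2) - self._k(s2, t1)
        return total / max(n // 2, 1)

    def forward(self, xs, xt):
        n = min(len(xs), len(xt))
        # Draw the random pairing once and cache it.
        self.idx = (self.rng.permutation(n), self.rng.permutation(n))
        return self._loss(xs, xt, *self.idx)

    def backward(self, xs, xt):
        # Analytic gradient w.r.t. xs using the SAME cached pairing,
        # so the gradient corresponds to the reported forward loss.
        ps, pt = self.idx
        n = len(ps)
        g = np.zeros_like(xs)

        def dk(a, b):
            # d/da exp(-gamma ||a - b||^2) = -2*gamma*(a - b)*k(a, b)
            return -2.0 * self.gamma * (a - b) * self._k(a, b)

        for i in range(0, n - 1, 2):
            s1, s2 = xs[ps[i]], xs[ps[i + 1]]
            t1, t2 = xt[pt[i]], xt[pt[i + 1]]
            g[ps[i]]     += dk(s1, s2) - dk(s1, t2)
            g[ps[i + 1]] += dk(s2, s1) - dk(s2, t1)
        return g / max(n // 2, 1)
```

Because the pairing is cached, a finite-difference check of `backward` against `forward` passes; with fresh sampling in backward, the gradient would belong to a different random estimate than the loss that was reported.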

Besides, it is a little confusing that the loss computed in the forward pass is not used in the backward computation.

yl-1993 commented 6 years ago

I see the point now. The code is correct: the loss computed in the forward pass is only a reference value and is not used in backward, so it is fine to compute gradients with other, freshly sampled pairs in the backward pass.
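The reason resampling is harmless is that the randomly paired linear-time estimate is unbiased: the forward loss and a freshly resampled backward gradient both estimate the same quantity in expectation. A small numerical check of this (again an illustrative Python sketch with an assumed RBF kernel, not the library's code) compares the average of many randomly paired estimates against the full unbiased U-statistic:

```python
import numpy as np

def rbf(a, b, gamma=0.5):
    # RBF kernel on vectors: exp(-gamma * ||a - b||^2)
    return np.exp(-gamma * np.sum((a - b) ** 2))

def linear_mmd2(xs, xt, rng, gamma=0.5):
    # One randomly paired linear-time MMD^2 estimate.
    n = min(len(xs), len(xt))
    ps, pt = rng.permutation(n), rng.permutation(n)
    total, pairs = 0.0, n // 2
    for i in range(0, 2 * pairs, 2):
        s1, s2 = xs[ps[i]], xs[ps[i + 1]]
        t1, t2 = xt[pt[i]], xt[pt[i + 1]]
        total += rbf(s1, s2, gamma) + rbf(t1, t2, gamma) \
               - rbf(s1, t2, gamma) - rbf(s2, t1, gamma)
    return total / pairs

def unbiased_mmd2(xs, xt, gamma=0.5):
    # Full unbiased MMD^2: within-source terms over distinct pairs,
    # cross term over all pairs.
    n, m = len(xs), len(xt)
    kss = np.array([[rbf(a, b, gamma) for b in xs] for a in xs])
    ktt = np.array([[rbf(a, b, gamma) for b in xt] for a in xt])
    kst = np.array([[rbf(a, b, gamma) for b in xt] for a in xs])
    s = (kss.sum() - np.trace(kss)) / (n * (n - 1))
    t = (ktt.sum() - np.trace(ktt)) / (m * (m - 1))
    c = kst.sum() / (n * m)
    return s + t - 2 * c
```

Averaging `linear_mmd2` over many independent pairings converges to `unbiased_mmd2`, so drawing one pairing for the (reference) loss and another for the gradient still optimizes the same objective on average.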