thuml / Xlearn

Transfer Learning Library

Random sampling used in MMD #12

Closed yl-1993 closed 6 years ago

yl-1993 commented 6 years ago

In the MMD layer, random sampling is used in both the forward and backward computations.

https://github.com/thuml/Xlearn/blob/master/caffe/src/caffe/layers/mmd_layer.cu#L85-91
https://github.com/thuml/Xlearn/blob/master/caffe/src/caffe/layers/mmd_layer.cu#L144-150

There may be a problem if the loss and the gradient are computed with different samples. It might be better to use a mask to store the selected samples, as the dropout layer does.
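To make the suggestion concrete, here is a minimal Python sketch (hypothetical, not the library's CUDA code) of a linear-time MMD estimate that caches the randomly selected pair indices in forward and reuses them in backward, the way a dropout layer caches its mask. The class name, RBF kernel, and pairing scheme are all assumptions made for illustration:

```python
import numpy as np

class MMDWithMask:
    """Illustrative sketch: cache the random pairing so that the
    backward gradient matches the forward loss exactly."""

    def __init__(self, gamma=1.0, seed=0):
        self.gamma = gamma              # RBF kernel bandwidth (assumed)
        self.rng = np.random.default_rng(seed)
        self.idx = None                 # cached indices, the "mask"

    def _k(self, a, b):
        # RBF kernel k(a, b) = exp(-gamma * ||a - b||^2)
        return np.exp(-self.gamma * np.sum((a - b) ** 2))

    def _loss(self, xs, xt, ps, pt):
        # Linear-time MMD^2 over the given pairing (ps, pt).
        n = len(ps)
        total = 0.0
        for i in range(0, n - 1, 2):
            s1, s2 = xs[ps[i]], xs[ps[i + 1]]
            t1, t2 = xt[pt[i]], xt[pt[i + 1]]
            total += self._k(s1, s2) + self._k(t1, t2) \
                   - self._k(s1, t2) - self._k(s2, t1)
        return total / max(n // 2, 1)

    def forward(self, xs, xt):
        n = min(len(xs), len(xt))
        # Draw the random pairing once and cache it.
        self.idx = (self.rng.permutation(n), self.rng.permutation(n))
        return self._loss(xs, xt, *self.idx)

    def backward(self, xs, xt):
        # Analytic gradient w.r.t. xs using the SAME cached pairing,
        # so the gradient corresponds to the reported forward loss.
        ps, pt = self.idx
        n = len(ps)
        g = np.zeros_like(xs)

        def dk(a, b):
            # d/da exp(-gamma ||a - b||^2) = -2*gamma*(a - b)*k(a, b)
            return -2.0 * self.gamma * (a - b) * self._k(a, b)

        for i in range(0, n - 1, 2):
            s1, s2 = xs[ps[i]], xs[ps[i + 1]]
            t1, t2 = xt[pt[i]], xt[pt[i + 1]]
            g[ps[i]]     += dk(s1, s2) - dk(s1, t2)
            g[ps[i + 1]] += dk(s2, s1) - dk(s2, t1)
        return g / max(n // 2, 1)
```

Because the pairing is cached, a finite-difference check of `backward` against `forward` passes; with fresh sampling in backward, the gradient would belong to a different random estimate than the loss that was reported.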

Besides, it is a little confusing that the loss computed in the forward pass is not used in the backward computation.

yl-1993 commented 6 years ago

I see the point now. The code is correct: the loss computed in the forward pass is only a reference value and is not used in backward, so it is fine to compute gradients with other, freshly sampled pairs in the backward pass.
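The reason resampling is harmless is that the randomly paired linear-time estimate is unbiased: the forward loss and a freshly resampled backward gradient both estimate the same quantity in expectation. A small numerical check of this (again an illustrative Python sketch with an assumed RBF kernel, not the library's code) compares the average of many randomly paired estimates against the full unbiased U-statistic:

```python
import numpy as np

def rbf(a, b, gamma=0.5):
    # RBF kernel on vectors: exp(-gamma * ||a - b||^2)
    return np.exp(-gamma * np.sum((a - b) ** 2))

def linear_mmd2(xs, xt, rng, gamma=0.5):
    # One randomly paired linear-time MMD^2 estimate.
    n = min(len(xs), len(xt))
    ps, pt = rng.permutation(n), rng.permutation(n)
    total, pairs = 0.0, n // 2
    for i in range(0, 2 * pairs, 2):
        s1, s2 = xs[ps[i]], xs[ps[i + 1]]
        t1, t2 = xt[pt[i]], xt[pt[i + 1]]
        total += rbf(s1, s2, gamma) + rbf(t1, t2, gamma) \
               - rbf(s1, t2, gamma) - rbf(s2, t1, gamma)
    return total / pairs

def unbiased_mmd2(xs, xt, gamma=0.5):
    # Full unbiased MMD^2: within-source terms over distinct pairs,
    # cross term over all pairs.
    n, m = len(xs), len(xt)
    kss = np.array([[rbf(a, b, gamma) for b in xs] for a in xs])
    ktt = np.array([[rbf(a, b, gamma) for b in xt] for a in xt])
    kst = np.array([[rbf(a, b, gamma) for b in xt] for a in xs])
    s = (kss.sum() - np.trace(kss)) / (n * (n - 1))
    t = (ktt.sum() - np.trace(ktt)) / (m * (m - 1))
    c = kst.sum() / (n * m)
    return s + t - 2 * c
```

Averaging `linear_mmd2` over many independent pairings converges to `unbiased_mmd2`, so drawing one pairing for the (reference) loss and another for the gradient still optimizes the same objective on average.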