Open · wzm2256 opened this issue 4 years ago
Hi, I'm using this code for density estimation and generative modelling. I think the Sinkhorn divergence is perfect for such tasks because of its unbiased nature. However, in practice, I find that it is slightly biased.

Basically, I have several datapoints and a Gaussian distribution. I want to match them so that the datapoints can be regarded as samples from the Gaussian distribution. This is a basic setting in generative modelling. Ideally, this should work well because the Sinkhorn divergence removes the entropic bias of the regularized W-distance by adding two self-correlation terms. However, the results are still biased.

Let me explain why this is important. In generative modelling, we need to adjust the data and the Gaussian simultaneously so that they match each other. Being biased, even just a little, will lead to zero variance, because the small shrinkage compounds as both sides keep adjusting towards each other.
I present my simple test code here.
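The original script is not reproduced above, but the following is a minimal sketch of the same experiment, assuming the GeomLoss `SamplesLoss` API (the `eps` and line numbers mentioned below refer to the original script, not to this sketch):

```python
# Minimal sketch (not the original script): fit the std of a Gaussian to fixed
# data by minimizing the Sinkhorn divergence, using geomloss.SamplesLoss.
# In GeomLoss, the entropic regularization is eps = blur ** p.
import torch
from geomloss import SamplesLoss

torch.manual_seed(0)
N, eps, p = 500, 0.1, 2

# Fixed "data": samples from a standard Gaussian.
x = torch.randn(N, 1)

# Learnable Gaussian: parameterize its std and draw fresh samples each step.
log_std = torch.zeros(1, requires_grad=True)
loss_fn = SamplesLoss(loss="sinkhorn", p=p, blur=eps ** (1.0 / p), debias=True)
# loss_fn = SamplesLoss(loss="gaussian", blur=0.5)   # MMD baseline

opt = torch.optim.Adam([log_std], lr=1e-2)
for step in range(2000):
    y = torch.randn(N, 1) * log_std.exp()   # reparameterized Gaussian samples
    loss = loss_fn(x, y)
    opt.zero_grad()
    loss.backward()
    opt.step()

print("data std:", x.std().item(), "learned std:", log_std.exp().item())
```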
I present several experimental facts about this code:
1. The Sinkhorn divergence suffers from a bias, and this bias decreases as ε increases; in the limit, MMD works perfectly. When I fix the datapoints and learn the Gaussian distribution, the std of the data is 0.97487277. With eps = 0.01, 0.1, 0.5, 1, the learned stds of the Gaussian are 0.96189475, 0.9575431, 0.96259254, 0.97405905, which are all smaller than the data std. With MMD (uncomment line 38), the learned std is close to the data std; it can be larger or smaller than the data std in different runs.
2. This is also true if I learn the data while fixing the Gaussian (uncomment lines 92-96). The std of the Gaussian distribution is 1. With eps = 0.01, 0.1, 0.5, 1, the learned data stds are 0.98536015, 0.9891638, 0.9949019, 0.9966444. With MMD, the learned data std is 1.0002198.
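For the second experiment, a minimal sketch (again assuming the GeomLoss API, not the original lines 92-96) would make the data points themselves the learnable parameters and pull them towards a fixed standard Gaussian:

```python
# Minimal sketch: learn the data points while keeping the Gaussian fixed at N(0, 1).
import torch
from geomloss import SamplesLoss

torch.manual_seed(0)
N, eps, p = 500, 0.1, 2

# Learnable "data" points, initialized at random.
x = torch.randn(N, 1).requires_grad_(True)

loss_fn = SamplesLoss(loss="sinkhorn", p=p, blur=eps ** (1.0 / p), debias=True)
opt = torch.optim.Adam([x], lr=1e-2)

for step in range(2000):
    y = torch.randn(N, 1)          # fresh samples from the fixed N(0, 1) target
    loss = loss_fn(x, y)
    opt.zero_grad()
    loss.backward()
    opt.step()

print("learned data std:", x.std().item())
```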