Closed: antoninschrab closed this issue 3 years ago.
Thanks for reporting this. I will try to have a look soon and get back.
On Sat., Nov. 28, 2020 at 17:46, antoninschrab <notifications@github.com> wrote:
I am running into issues when trying to compute the unbiased (full) estimate of MMD^2 for a given kernel matrix in Python. The answer provided by shogun does not match the answer I expect to obtain. This might come from the following observation: changing the diagonal entries of the kernel matrix changes the value of the unbiased (full) estimate of MMD^2 returned by shogun, which should not be the case. Here are two reproducible examples:
```python
import shogun as sg
import numpy as np

def my_mmd_unbiased(m, K):
    """
    m: number of samples from P
    K: kernel matrix: np.array [XX, XY
                                YX, YY]
    """
    n = K.shape[0] - m  # number of samples from Q
    XX = K[:m, :m]
    XY = K[:m, m:]
    YY = K[m:, m:]
    np.fill_diagonal(XX, 0)
    np.fill_diagonal(YY, 0)
    return np.sum(XX)/(m*(m-1)) + np.sum(YY)/(n*(n-1)) - 2*np.sum(XY)/(n*m)

def shogun_mmd_unbiased(m, n, K):
    """
    m: number of samples from P
    n: number of samples from Q
    K: kernel matrix: np.array of shape (m+n, m+n)
    """
    kernel = sg.CustomKernel(K.astype(np.float64))
    mmd = sg.QuadraticTimeMMD()
    mmd.set_p(sg.DummyFeatures(m))
    mmd.set_q(sg.DummyFeatures(n))
    mmd.set_kernel(kernel)
    mmd.set_statistic_type(sg.ST_UNBIASED_FULL)
    # compute_statistic() returns MMD^2 scaled by n*m/(m+n)
    return mmd.compute_statistic()*(n+m)/m/n
```
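For reference, the quantity my_mmd_unbiased above is meant to compute is (as far as I understand) the standard unbiased (full) estimator, in which the diagonal entries k(x_i, x_i) and k(y_j, y_j) never appear:

$$
\widehat{\mathrm{MMD}}_u^2 \;=\; \frac{1}{m(m-1)} \sum_{i \neq j} k(x_i, x_j) \;+\; \frac{1}{n(n-1)} \sum_{i \neq j} k(y_i, y_j) \;-\; \frac{2}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} k(x_i, y_j)
$$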
```python
np.random.seed(0)
m = n = 1000
K = np.random.rand(m+n, n+m)
K = K + K.T

print(my_mmd_unbiased(m, K))
print(shogun_mmd_unbiased(m, n, K))
K[0,0] = 7
print(shogun_mmd_unbiased(m, n, K))
K[0,0] = 2e7
print(shogun_mmd_unbiased(m, n, K))
K[0,0] = 2e8
print(shogun_mmd_unbiased(m, n, K))
```
```
0.0009910327956028642    (my mmd)
0.0009515195270068944    (shogun mmd with K[0,0] = 1.0976270078546495 (unchanged))
0.0009513943805359304    (shogun mmd with K[0,0] = 7)
0.0031087391544133425    (shogun mmd with K[0,0] = 2e7)
-0.9994037747383118      (shogun mmd with K[0,0] = 2e8)
```
```python
K = np.array([[0, 2, 1, 2],
              [2, 0, 3, 4],
              [1, 3, 0, 4],
              [2, 4, 4, 0]])
m, n = 2, 2
print(my_mmd_unbiased(m, K))
print(shogun_mmd_unbiased(m, n, K))
K[0,0] = 2e8
print(shogun_mmd_unbiased(m, n, K))
```
```
1.0     (my mmd)
1.0     (shogun mmd with K[0,0] = 0 (unchanged))
-1.0    (shogun mmd with K[0,0] = 2e8)
```
So changing the diagonal entries of the kernel matrix changes the ST_UNBIASED_FULL estimate. The same observation also applies to the ST_UNBIASED_INCOMPLETE estimate: changing the diagonal entries or the k(x_i, y_i) entries changes the estimate, even though it should not.
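For context, here is a minimal NumPy sketch of the incomplete unbiased estimator I have in mind. This is my own reference implementation (the function name my_mmd_unbiased_incomplete is mine, not Shogun's), and it assumes the incomplete estimator is the usual U-statistic over pairs i != j with h(z_i, z_j) = k(x_i, x_j) + k(y_i, y_j) - k(x_i, y_j) - k(x_j, y_i), which requires m = n:

```python
# Reference sketch (mine, not Shogun's code): incomplete unbiased MMD^2,
# assuming the U-statistic over pairs i != j with
#   h(z_i, z_j) = k(x_i, x_j) + k(y_i, y_j) - k(x_i, y_j) - k(x_j, y_i),
# for m = n samples from each distribution. Note that k(x_i, x_i),
# k(y_i, y_i) and k(x_i, y_i) never appear, so the estimate should not
# depend on those entries of the kernel matrix.
import numpy as np

def my_mmd_unbiased_incomplete(n, K):
    """
    n: number of samples from P (assumed equal to the number of samples from Q)
    K: kernel matrix: np.array of shape (2n, 2n), samples ordered [X; Y]
    """
    XX = K[:n, :n]
    YY = K[n:, n:]
    XY = K[:n, n:]
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                total += XX[i, j] + YY[i, j] - XY[i, j] - XY[j, i]
    return total / (n * (n - 1))
```

Perturbing K[i, i] or K[i, n+i] leaves this quantity unchanged, which is the behaviour I would expect ST_UNBIASED_INCOMPLETE to mirror.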
Any idea what is happening?
Thanks!
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue is now being closed due to a lack of activity. Feel free to reopen it.