brian-lau / highdim

Statistics for high-dimensional data (homogeneity, sphericity, independence, spherical uniformity)
GNU General Public License v3.0

Add fast HSIC #6

Open brian-lau opened 7 years ago

brian-lau commented 7 years ago

https://github.com/devinjacobson/prediction/tree/master/correlations/samplecode/NIPS2011-code/code-nips2011

wittawatj commented 7 years ago

I have not read the paper you posted. Since the title says "fast HSIC", I just want to point out this paper:

Large-Scale Kernel Methods for Independence Testing. Qinyi Zhang, Sarah Filippi, Arthur Gretton, Dino Sejdinovic. https://arxiv.org/abs/1606.07892

This paper discusses many variants of HSIC.

The first two are linear-time with respect to sample size, with a quadratic dependence on the number of features (or on the number of inducing points for Nyström). The last one does not have good power because of its high variance (this is mentioned in the paper).
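To make "linear-time in sample size, quadratic in the number of features" concrete, here is a minimal sketch of a random-Fourier-feature approximation to the biased HSIC statistic. It is not taken from the paper's code or from the linked fastHSIC. The function names (`rff_features`, `center`, `rff_hsic`), the Gaussian kernel, the default bandwidth, and the restriction to scalar samples are all assumptions made for illustration.

```python
import math
import random


def rff_features(xs, freqs, phases):
    """Map scalar samples to random Fourier features for a Gaussian kernel."""
    scale = math.sqrt(2.0 / len(freqs))
    return [[scale * math.cos(w * x + b) for w, b in zip(freqs, phases)]
            for x in xs]


def center(feats):
    """Subtract the per-feature mean (plays the role of the centering matrix H)."""
    n, d = len(feats), len(feats[0])
    means = [sum(row[j] for row in feats) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in feats]


def rff_hsic(xs, ys, num_features=20, bandwidth=1.0, seed=0):
    """Approximate biased HSIC as the squared Frobenius norm of the
    empirical cross-covariance between the two feature maps.

    Cost is O(n * D^2) for n samples and D features per variable.
    All defaults here are illustrative assumptions, not tuned values.
    """
    rng = random.Random(seed)
    n = len(xs)

    def draw():
        # Frequencies ~ N(0, 1/bandwidth^2) correspond to the kernel
        # exp(-(x - x')^2 / (2 * bandwidth^2)); phases ~ U[0, 2*pi].
        freqs = [rng.gauss(0.0, 1.0 / bandwidth) for _ in range(num_features)]
        phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
        return freqs, phases

    zx = center(rff_features(xs, *draw()))
    zy = center(rff_features(ys, *draw()))

    stat = 0.0
    for i in range(num_features):
        for j in range(num_features):
            # (i, j) entry of the D x D cross-covariance matrix.
            c = sum(zx[k][i] * zy[k][j] for k in range(n)) / n
            stat += c * c
    return stat
```

The statistic should come out clearly larger for a dependent pair than for an independent one. Turning it into a test still requires a null distribution (for example via permutations), which this sketch leaves out.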

brian-lau commented 7 years ago

Thanks for the reference! I haven't looked at the paper either, but I'm pretty sure the fastHSIC in the link is not linear-time, so I will definitely have a look at the Zhang et al. paper.