shogun-toolbox / shogun

Shōgun
http://shogun-toolbox.org
BSD 3-Clause "New" or "Revised" License

Implement copula-based kernel dependence measure #1986

Open sejdino opened 10 years ago

sejdino commented 10 years ago

This is an entrance task (and part of the project) of http://www.shogun-toolbox.org/page/Events/gsoc2014_ideas#variable_interactions

See Copula-based Kernel Dependency Measures

Implement a copula-based kernel dependence measure class (under CKernelIndependenceTest). This gives an alternative to HSIC (implemented in cHSIC) for measuring nonlinear dependence between the random variables X and Y based on samples. It has the advantage of being invariant to strictly increasing transformations of X and Y, but is only applicable to one-dimensional X and Y.

The copula approach first performs the empirical copula transformation (see Section 4 of the paper) of X and of Y, which for the samples {X(1),X(2),...,X(m)} and {Y(1),Y(2),...,Y(m)} is given by

ZX(i) = (1/m) * rank( X(i), {X(1),X(2),...,X(m)} ),
ZY(i) = (1/m) * rank( Y(i), {Y(1),Y(2),...,Y(m)} ),

where rank( x, A ) is the number of elements of A less than or equal to x. It then estimates the MMD (implemented under CKernelTwoSampleTest) between the joint copula (ZX, ZY) and the uniform distribution on the square [0,1]^2.
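To make the two steps concrete, here is a rough pure-Python sketch of the transformation and the MMD estimate. It is not Shogun code. The function names, the Gaussian kernel with width 0.2, the biased quadratic-time MMD^2 estimate, and the idea of representing the uniform distribution on [0,1]^2 by an equally sized random sample are all assumptions made for illustration.

```python
import math
import random


def empirical_copula_transform(samples):
    """Map each sample x to (1/m) * rank(x, samples), where rank counts
    the elements of `samples` that are less than or equal to x."""
    m = len(samples)
    return [sum(1 for a in samples if a <= x) / m for x in samples]


def gaussian_kernel(p, q, width=0.2):
    """Gaussian kernel on points given as tuples (width is an arbitrary choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(p, q))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def biased_mmd2(ps, qs, kernel=gaussian_kernel):
    """Biased quadratic-time estimate of MMD^2 between two samples."""
    def mean_kernel(a_points, b_points):
        total = sum(kernel(a, b) for a in a_points for b in b_points)
        return total / (len(a_points) * len(b_points))

    return mean_kernel(ps, ps) + mean_kernel(qs, qs) - 2.0 * mean_kernel(ps, qs)


def copula_dependence(xs, ys, seed=0):
    """MMD^2 between the joint copula sample (ZX, ZY) and a uniform sample.

    Assumption: the uniform distribution on [0,1]^2 is approximated by
    drawing as many uniform points as there are data points.
    """
    joint = list(zip(empirical_copula_transform(xs),
                     empirical_copula_transform(ys)))
    rng = random.Random(seed)
    uniform = [(rng.random(), rng.random()) for _ in joint]
    return biased_mmd2(joint, uniform)
```

Because only ranks enter the statistic, replacing xs by, say, [math.exp(x) for x in xs] leaves copula_dependence(xs, ys) unchanged, which is the invariance property mentioned above.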

Contact @karlnapf or @sejdino with questions

Irisxiaoxue commented 10 years ago

I have one question about this task. How should I use the MMD implemented under CKernelTwoSampleTest in the copula-based kernel dependence measure class? Should I create a CKernelTwoSampleTest object inside the new class, or should I copy the relevant code into the new class? @sejdino

karlnapf commented 10 years ago

Hey Iris, the first way: have the new class hold an object of the MMD class you need. Copying the code would make it harder to maintain.
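A language-neutral sketch of that design, in Python rather than Shogun's C++: the dependence measure holds an estimator object and delegates to it. Both class names and the compute_statistic signature are hypothetical, and the linear-kernel estimator is only a placeholder to show the delegation (it would not detect dependence on copula samples); a real implementation would hold one of the existing MMD classes instead.

```python
class MeanEmbeddingMMD:
    """Hypothetical stand-in for an existing MMD estimator (linear kernel)."""

    def compute_statistic(self, ps, qs):
        dims = range(len(ps[0]))
        mean_p = [sum(p[d] for p in ps) / len(ps) for d in dims]
        mean_q = [sum(q[d] for q in qs) / len(qs) for d in dims]
        # Squared distance between the two sample means
        return sum((a - b) ** 2 for a, b in zip(mean_p, mean_q))


class CopulaDependence:
    """Hypothetical dependence measure that reuses an MMD estimator object."""

    def __init__(self, mmd_estimator):
        # The estimator is held as a member; none of its code is copied here
        self.mmd = mmd_estimator

    @staticmethod
    def _copula(samples):
        m = len(samples)
        return [sum(1 for a in samples if a <= x) / m for x in samples]

    def compute_statistic(self, xs, ys, reference):
        # `reference` is a sample standing in for the uniform law on [0,1]^2
        joint = list(zip(self._copula(xs), self._copula(ys)))
        return self.mmd.compute_statistic(joint, reference)
```

With this layout, a fix or speed-up in the MMD estimator reaches the dependence measure automatically, and a different estimator can be swapped in without touching the copula code.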