neurodata / hyppo

Python package for multivariate hypothesis testing
https://hyppo.neurodata.io/

Fast DCORR results - different from regular DCORR results? [Question] #384

Closed by AvrahamBinn 1 year ago

AvrahamBinn commented 1 year ago

Hey,

First, thanks a lot for your cool and helpful library!

I was comparing your fast DCORR implementation with this version, which is correct as far as I can tell.

I used the example from their documentation:

    a = [1,2,3,4,5]
    b = np.array([1,2,9,4,4])
    distcorr(a, b)  # 0.762676242417

Using your version, I got ~ 0.3. This mismatch is consistent in other tests I did.

Are you familiar with this issue? Is this expected for some reason?

Thanks!
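For reference, the value quoted from the other implementation can be reproduced with a few lines of NumPy. This is a minimal sketch, not hyppo's code and not the linked implementation: it assumes the standard biased sample distance correlation built from double-centered pairwise distance matrices, and the helper names `double_center` and `biased_dcorr_sq` are made up for illustration.

```python
import numpy as np


def double_center(d):
    """Subtract row and column means, then add back the grand mean."""
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def biased_dcorr_sq(x, y):
    """Squared biased sample distance correlation for 1-D samples (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise absolute differences, double-centered
    a = double_center(np.abs(x[:, None] - x[None, :]))
    b = double_center(np.abs(y[:, None] - y[None, :]))
    dcov2_xy = (a * b).mean()  # squared distance covariance
    dcov2_xx = (a * a).mean()  # squared distance variance of x
    dcov2_yy = (b * b).mean()  # squared distance variance of y
    return dcov2_xy / np.sqrt(dcov2_xx * dcov2_yy)


x = [1, 2, 3, 4, 5]
y = [1, 2, 9, 4, 4]
stat_sq = biased_dcorr_sq(x, y)
print(stat_sq)           # squared statistic, approximately 0.5817
print(np.sqrt(stat_sq))  # distance correlation, approximately 0.7627
```

The second printed value agrees with the 0.762676242417 from the documentation example; the first is its square.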

sampan501 commented 1 year ago

Hi, can you send an example with results from both methods? Also, are you using the latest version of hyppo?

sampan501 commented 1 year ago

Also, looking at it more closely, it seems they might be computing the biased statistic in that example. Try using bias=True and see if the results are the same.

AvrahamBinn commented 1 year ago

Hey @sampan501, thanks for your response. I have just updated hyppo to the latest version.

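I am running this code:

    import numpy as np
    from hyppo.independence import Dcorr

    x = np.array([1,2,3,4,5])
    y = np.array([1,2,9,4,4])

    d = Dcorr()

    # Set the attributes directly on the test object
    d.is_fast = True
    d.bias = True

    d.test(x, y)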

For d.bias = False I am getting IndependenceTestOutput(stat=0.5816750507471099, pvalue=0.3916083916083916) and for d.bias = True I am getting IndependenceTestOutput(stat=0.8164965809277251, pvalue=0.0899100899100899).

Both are different from the 0.7626762424168665 in the other implementation.

sampan501 commented 1 year ago

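Found the bug: the statistic method should have returned the square root of the value it currently returns. Fortunately, this change won't affect the p-value of the permutation test, since the square root is monotonic and so preserves the ordering of the permuted statistics relative to the observed one. I'm about to open a PR with the changes.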
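The claim that the p-value is unchanged can be checked with a small permutation test. The sketch below is illustrative only and is not hyppo's implementation: `biased_dcorr_sq` and `perm_pvalue` are hypothetical helpers, the data are synthetic, and the p-value formula `(1 + #{null >= observed}) / (1 + reps)` is an assumption about how a permutation p-value is typically computed.

```python
import numpy as np


def biased_dcorr_sq(x, y):
    """Squared biased sample distance correlation for 1-D samples (illustrative helper)."""
    def center(d):
        return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()

    a = center(np.abs(x[:, None] - x[None, :]))
    b = center(np.abs(y[:, None] - y[None, :]))
    return (a * b).mean() / np.sqrt((a * a).mean() * (b * b).mean())


def perm_pvalue(x, y, stat_fn, reps=200, seed=0):
    """Permutation p-value: share of permuted statistics at least as large as the observed one."""
    rng = np.random.default_rng(seed)  # fixed seed so both calls see the same permutations
    observed = stat_fn(x, y)
    null = np.array([stat_fn(x, rng.permutation(y)) for _ in range(reps)])
    return (1 + np.sum(null >= observed)) / (1 + reps)


# Synthetic, continuous data (an assumption made for this sketch)
data_rng = np.random.default_rng(42)
x = data_rng.normal(size=30)
y = x + data_rng.normal(size=30)

p_squared = perm_pvalue(x, y, biased_dcorr_sq)
p_root = perm_pvalue(x, y, lambda u, v: np.sqrt(biased_dcorr_sq(u, v)))
print(p_squared, p_root)
```

Both calls use the same seed and therefore the same permutations, so the only difference is whether the square root is applied. Because the square root preserves ordering among non-negative values, the count of permuted statistics exceeding the observed one is the same in both cases.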

AvrahamBinn commented 1 year ago

@sampan501 Thanks a lot for the explanation and the quick fix! :)