Open sampan501 opened 4 years ago
I've added the distance_correlation function (and its helpers) to my fork of SciPy in the ben-master branch.
The distance_correlation function is modeled on the multiscale_graphcorr function, which is already merged into SciPy, and the helpers come from the hyppo repo (primarily the dcorr function).
I've also added a simple test (more to see that the code runs and returns an expected result rather than for accuracy). The function passes the test. This link goes directly to the distance_correlation function which relies on multiple other functions in the _stats.pyx file.
See attached screenshot of the test passed. Tests are in the same branch here. @sampan501
Suggested future steps include:
None of the functions that you added have been Cythonized. All of the added functions use def, which runs as pure Python.
Then I don’t really understand why cythonizing is necessary if pure python works. @sampan501
Pure Python is slow. With large datasets, it will take a long time to run.
Oh I see. I wasn't aware of that being an essential part for any reason beyond making the code run-able in SciPy.
Updates:
_dcorr has been cythonized.
_center_distance_matrix has been changed to work for biased and unbiased data and will now work for mgc (multi-scale graph correlation) and dcorr (distance correlation).
The distance_correlation function now uses _dcorr in the p-value computation function _perm_test to compute the test statistic rather than _mgc_stat.
distance_correlation function call tree
@sampan501
Documentation for the distance_correlation function still needs to be changed (it is currently just copied from multiscale_graphcorr).
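For anyone following along, here is a rough, dependency-free sketch of the pieces mentioned in the updates above: centering a distance matrix in a biased or unbiased way, computing a dcorr-style statistic from the centered matrices, and using that statistic inside a permutation test. This is not the code in the fork. The names center, dcorr_stat and perm_test are made up for illustration, and the unbiased branch assumes the usual U-centering formula (which needs more than 3 samples). It is written in plain Python for readability, so it will be slow on large inputs, which is the reason the real helpers are being Cythonized.

```python
import math
import random


def distance_matrix(points):
    """Pairwise Euclidean distances for a list of equal-length tuples."""
    return [[math.dist(p, q) for q in points] for p in points]


def center(dist, unbiased=False):
    """Double-center a symmetric distance matrix (hypothetical helper).

    unbiased=False: subtract row and column means, add the grand mean.
    unbiased=True:  U-centering with a zero diagonal (assumes n > 3).
    """
    n = len(dist)
    row = [sum(r) for r in dist]  # equals the column sums, by symmetry
    total = sum(row)
    if unbiased:
        return [[0.0 if i == j else
                 dist[i][j] - row[i] / (n - 2) - row[j] / (n - 2)
                 + total / ((n - 1) * (n - 2))
                 for j in range(n)] for i in range(n)]
    return [[dist[i][j] - row[i] / n - row[j] / n + total / n ** 2
             for j in range(n)] for i in range(n)]


def inner(a, b):
    """Sum of elementwise products of two equally sized matrices."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def dcorr_stat(x, y, unbiased=False):
    """Ratio of the cross inner product to the geometric mean of the variances."""
    a = center(distance_matrix(x), unbiased)
    b = center(distance_matrix(y), unbiased)
    denom = math.sqrt(inner(a, a) * inner(b, b))
    return inner(a, b) / denom if denom > 0 else 0.0


def perm_test(x, y, reps=200, seed=0, unbiased=False):
    """Permutation p-value: shuffle y and count statistics >= the observed one."""
    rng = random.Random(seed)
    observed = dcorr_stat(x, y, unbiased)
    y_perm = list(y)
    hits = 0
    for _ in range(reps):
        rng.shuffle(y_perm)
        if dcorr_stat(x, y_perm, unbiased) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (1 + hits) / (1 + reps)
```

Calling perm_test(x, y) with two lists of tuples returns the observed statistic and a p-value. Recomputing the distance matrix for every permutation keeps the sketch short; a faster version would compute it once and permute its rows and columns instead.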
Can I be assigned this issue?
Can I be assigned to this issue?