pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

[Feature Request] Add `nancorr_biweightmidcorr` to `pandas._libs.algos` #28657

Open jolespin opened 4 years ago

jolespin commented 4 years ago

Code Sample

The fastest implementation I am aware of at the moment is `bicor` in the R package WGCNA: https://rdrr.io/cran/WGCNA/man/bicor.html

Astropy also provides `biweight_midcorrelation`, but it is considerably slower and not parallelized: https://docs.astropy.org/en/stable/api/astropy.stats.biweight_midcorrelation.html
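
For reference, here is a minimal NumPy sketch of what such a function might compute. It assumes the median/MAD-based definition of biweight midcorrelation with tuning constant 9 (the form documented for WGCNA's `bicor`) and pairwise NaN deletion. The name `bicor`, the parameter `c`, and the helper `_normalized` are hypothetical choices for illustration, not existing pandas API, and the result may differ from Astropy's estimator, which normalizes differently.

```python
import numpy as np


def bicor(x, y, c=9.0):
    """Sketch of biweight midcorrelation for two 1-D arrays (NaN-aware).

    Hypothetical helper for illustration; not part of pandas.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Pairwise deletion: keep only positions where both values are present.
    keep = ~(np.isnan(x) | np.isnan(y))
    x, y = x[keep], y[keep]
    if x.size == 0:
        return float("nan")

    def _normalized(v):
        med = np.median(v)
        mad = np.median(np.abs(v - med))
        if mad == 0:
            return None  # scale is degenerate, correlation undefined here
        d = v - med
        u = d / (c * mad)
        # Biweight: smooth down-weighting, zero weight once |u| >= 1.
        w = (1.0 - u**2) ** 2 * (np.abs(u) < 1.0)
        a = d * w
        norm = np.sqrt(np.sum(a**2))
        if norm == 0:
            return None
        return a / norm

    xt = _normalized(x)
    yt = _normalized(y)
    if xt is None or yt is None:
        return float("nan")
    return float(np.dot(xt, yt))
```

Until something like this exists in `pandas._libs.algos`, a callable with this shape can be passed to `DataFrame.corr(method=bicor)`, since `corr` accepts a callable that takes two 1-D arrays and returns a float. That route runs in pure Python per column pair, so it will not match the speed of a Cython implementation.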

Problem description

Nothing came up in the open issues for either of the following searches:

biweight: https://github.com/pandas-dev/pandas/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+biweight+

bicor: https://github.com/pandas-dev/pandas/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+bicor+

Expected Output

A fast implementation of biweight midcorrelation, an extremely robust correlation metric.

jolespin commented 4 years ago

Just curious what the general thoughts are on this. Do you think it would be possible in a future version?

jreback commented 4 years ago

this PR https://github.com/pandas-dev/pandas/pull/9826

was actually pretty close but went stale a while back

would not object to adding this

jreback commented 4 years ago

a community PR would be the best way for this to progress