MastodonC / kixi.stats

A library of statistical distribution sampling and transducing functions
https://cljdoc.xyz/d/kixi/stats
Eclipse Public License 1.0

Calculating a "partial correlation"? #48

Open kovasap opened 3 months ago

kovasap commented 3 months ago

I'm interested in calculating correlation coefficients between two variables in a dataset, but controlling/adjusting for a third variable. I think this answer describes how to do what I want in R: https://stats.stackexchange.com/a/171497. And https://stats.stackexchange.com/questions/76815/multiple-regression-or-partial-correlation-coefficient-and-relations-between-th has some more background. I couldn't find similar functionality in this library, but I wanted to ask here in case I'm missing something, or in case this would be easy to implement on top of what already exists here.
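For the single-control-variable case described above, the first-order partial correlation can be written purely in terms of the three pairwise Pearson correlations: r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)). A minimal Python sketch of that formula follows. The function names `pearson` and `partial_correlation` are hypothetical and are not part of kixi.stats; the sketch assumes equal-length numeric sequences with non-zero variance.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(xs, ys, zs):
    """Correlation of xs and ys after adjusting for a single control zs.

    Hypothetical helper: uses the first-order formula built from the
    three pairwise correlations, so it handles one control variable only.
    """
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Usage: x and y share the trend in z but their residuals move in opposite directions.
z = [1, 2, 3, 4]
x = [2, 1, 2, 5]
y = [0, 3, 4, 3]
print(pearson(x, y))                 # weakly positive raw correlation
print(partial_correlation(x, y, z))  # strongly negative once z is controlled for
```

This is equivalent to correlating the residuals of x regressed on z with the residuals of y regressed on z, which is the approach the linked answers describe. Controlling for more than one variable is a different matter and generally needs a matrix inverse or a regression fit.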

henrygarner commented 3 months ago

@kovasap thanks for the excellent question. This is not something that kixi.stats currently supports. In my experience, implementations normally rely on matrix decomposition or numerical optimisation strategies, neither of which is currently incorporated. This partly reflects a desire to keep the library faithful to its original goals (a lightweight collection of useful transducer-compatible reducing functions) and partly a lack of apparent need.

If you have that need, and if it could be met without relying on sizeable dependencies (I’m thinking of Apache Commons Math), then partial correlation and its cousin multiple regression would make great additions. PRs in this area are welcome!
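To illustrate how a dependency-free version might fit the reducing-function style mentioned above, here is a rough Python sketch of a single-pass accumulator with separate init, step, and complete stages. It is an assumption about one possible shape, not kixi.stats code: the names `pc_init`, `pc_step`, and `pc_complete` are hypothetical, and it covers only the case of one control variable, where no matrix decomposition is required. Each input row is assumed to be an `(x, y, z)` triple.

```python
import math
from functools import reduce


def pc_init():
    """Empty accumulator: count, running means, and 3x3 co-moment matrix."""
    return {"n": 0, "mean": [0.0] * 3, "c": [[0.0] * 3 for _ in range(3)]}


def pc_step(acc, row):
    """Fold one (x, y, z) row into the accumulator (Welford-style update)."""
    n = acc["n"] + 1
    delta = [v - m for v, m in zip(row, acc["mean"])]        # deviation from old means
    mean = [m + d / n for m, d in zip(acc["mean"], delta)]   # updated means
    c = [[acc["c"][i][j] + delta[i] * (row[j] - mean[j]) for j in range(3)]
         for i in range(3)]
    return {"n": n, "mean": mean, "c": c}


def pc_complete(acc):
    """Turn the accumulated co-moments into the partial correlation of x and y given z."""
    c = acc["c"]

    def r(i, j):
        return c[i][j] / math.sqrt(c[i][i] * c[j][j])

    r_xy, r_xz, r_yz = r(0, 1), r(0, 2), r(1, 2)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Usage: one pass over the rows, then a final completion step.
rows = [(2, 0, 1), (1, 3, 2), (2, 4, 3), (5, 3, 4)]
print(pc_complete(reduce(pc_step, rows, pc_init())))
```

Keeping the state to a count, three means, and a 3x3 co-moment matrix means the accumulator stays small and needs only the standard library. With several control variables the completion step would need to invert a covariance matrix, which is where the decomposition concern raised above comes in.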