neurodata / hyppo

Python package for multivariate hypothesis testing
https://hyppo.neurodata.io/

More Independence Tests Used for Causal Inference #312

Open adam2392 opened 2 years ago

adam2392 commented 2 years ago

Is your feature request related to a problem? Please describe.

In causal discovery, a subproblem of causal inference, there are two main approaches: (i) constraint-based and (ii) score-based.

Constraint-based algorithms rely on conditional independence (CI) testing, which is poorly supported in Python; most implementations are in R. That is why I use hyppo.

There are a number of simple CI tests that would be beneficial (imo) to add to hyppo. They might be less powerful than the tests currently in hyppo, but they could be significantly faster and would also be useful as benchmarks.

Describe the solution you'd like

For example, the R implementation of the PC algorithm uses the G-squared test for binary data.

https://github.com/keiichishima/gsq/
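To make the request concrete, here is a minimal sketch of a G-squared CI test for integer-coded discrete data, written with NumPy and SciPy. The function name `g_square_discrete` and its signature are hypothetical; this is not the API of hyppo or of the linked gsq package. It also assumes a naive degrees-of-freedom count based on the observed levels and observed conditioning strata, which may differ from what other implementations use.

```python
import numpy as np
from scipy.stats import chi2


def g_square_discrete(x, y, z=None):
    """G-squared test of X independent of Y given Z (hypothetical helper).

    x, y: 1-D arrays of discrete values, length n.
    z: optional (n, k) array of discrete conditioning variables.
    Returns (statistic, degrees of freedom, p-value).
    """
    x = np.asarray(x).ravel()
    y = np.asarray(y).ravel()
    n = x.shape[0]

    if z is None:
        # No conditioning set: every sample falls in one stratum.
        strata = np.zeros(n, dtype=int)
    else:
        z = np.asarray(z).reshape(n, -1)
        # Each distinct row of z defines one stratum.
        _, strata = np.unique(z, axis=0, return_inverse=True)
        strata = strata.ravel()

    x_levels, x_idx = np.unique(x, return_inverse=True)
    y_levels, y_idx = np.unique(y, return_inverse=True)
    x_idx, y_idx = x_idx.ravel(), y_idx.ravel()
    n_strata = int(strata.max()) + 1

    # counts[s, i, j] = number of samples in stratum s with x level i, y level j.
    counts = np.zeros((n_strata, len(x_levels), len(y_levels)))
    np.add.at(counts, (strata, x_idx, y_idx), 1)

    # Expected counts under conditional independence within each stratum.
    row = counts.sum(axis=2, keepdims=True)
    col = counts.sum(axis=1, keepdims=True)
    total = counts.sum(axis=(1, 2), keepdims=True)
    expected = row * col / total

    # Empty cells contribute zero to the statistic.
    mask = counts > 0
    stat = 2.0 * np.sum(counts[mask] * np.log(counts[mask] / expected[mask]))

    # Assumption: naive dof from observed levels and observed strata.
    dof = (len(x_levels) - 1) * (len(y_levels) - 1) * n_strata
    pvalue = float(chi2.sf(stat, dof)) if dof > 0 else 1.0
    return float(stat), dof, pvalue
```

For instance, if `x` and `y` are both copies of a binary `z`, then `g_square_discrete(x, y)` gives a large statistic, while `g_square_discrete(x, y, z)` gives a statistic of zero because `x` and `y` are constant within each stratum.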

In addition, other simple conditional independence tests would be useful, if only to be able to run them on simple simulations.
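As one illustration of such a simple test (my choice of example, not one named above), here is a sketch of the Fisher-z partial correlation test, which assumes the variables are continuous and jointly Gaussian. The function name `fisher_z_test` and its signature are hypothetical and not part of hyppo.

```python
import numpy as np
from scipy.stats import norm


def fisher_z_test(x, y, z=None):
    """Fisher-z test of X independent of Y given Z (hypothetical helper).

    Assumes continuous, jointly Gaussian data.
    x, y: 1-D arrays of length n; z: optional (n, k) conditioning array.
    Returns (z statistic, two-sided p-value).
    """
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    n = x.shape[0]

    if z is None:
        data = np.hstack([x, y])
    else:
        data = np.hstack([x, y, np.asarray(z, dtype=float).reshape(n, -1)])
    k = data.shape[1] - 2  # size of the conditioning set

    # Partial correlation of x and y from the inverse correlation matrix.
    precision = np.linalg.pinv(np.corrcoef(data, rowvar=False))
    r = -precision[0, 1] / np.sqrt(precision[0, 0] * precision[1, 1])
    r = np.clip(r, -0.999999, 0.999999)  # keep arctanh finite

    # Fisher z-transform; approximately standard normal under the null.
    stat = np.sqrt(n - k - 3) * np.arctanh(r)
    pvalue = 2.0 * norm.sf(abs(stat))
    return float(stat), float(pvalue)
```

On a simple simulation where `z` is a common cause of `x` and `y`, the marginal test `fisher_z_test(x, y)` should reject independence, while the statistic from `fisher_z_test(x, y, z)` should shrink toward zero.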

Additional context (e.g. screenshots)

The tests are obviously limited in scope. For example, the G-squared implementation in the linked repo applies only to binary or other discrete data. This limitation can simply be stated explicitly in the docs.

rflperry commented 2 years ago

@adam2392 fyi this is a relatively recent package that may be of interest https://github.com/cmu-phil/causal-learn

rflperry commented 2 years ago

See also this package for more implementations, with better documentation https://conditional-independence.readthedocs.io/en/latest/ci_tests/index.html