Framework for independence testing and conditional independence testing, with multiprocessing support. The test statistics are currently mutual information (MI) and conditional mutual information (CMI), estimated with k-NN methods, and the tests report permutation-based p-values. Also includes a routine for Markov blanket feature selection.
pip install pycit
MI estimators:
ksg_mi: k-NN estimator for continuous data
bi_ksg_mi: "bias-improved" k-NN estimator for continuous data
mixed_mi: k-NN estimator for discrete-continuous mixtures
CMI estimators:
ksg_cmi: k-NN estimator for continuous data
bi_ksg_cmi: "bias-improved" k-NN estimator for continuous data
mixed_cmi: k-NN estimator for discrete-continuous mixtures
Note: Also includes a differential entropy estimator: kl_entropy.
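For intuition about what a k-NN MI estimator computes, here is a minimal pure-Python sketch of the standard KSG estimator (algorithm 1 of Kraskov, Stögbauer and Grassberger) for scalar samples. It is a brute-force illustration under simplifying assumptions, not pycit's implementation; the names `ksg_mi_sketch` and `digamma_int` are made up for this example.

```python
import random

EULER_GAMMA = 0.5772156649015329

def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{m=1}^{n-1} 1/m."""
    return -EULER_GAMMA + sum(1.0 / m for m in range(1, n))

def ksg_mi_sketch(x, y, k=3):
    """Brute-force KSG (algorithm 1) estimate of I(X;Y) in nats, scalar x and y."""
    n = len(x)
    total = 0.0
    for i in range(n):
        # max-norm distances from point i to all other points in the joint (x, y) space
        dists = sorted(
            max(abs(x[i] - x[j]), abs(y[i] - y[j])) for j in range(n) if j != i
        )
        eps = dists[k - 1]  # distance to the k-th nearest neighbor
        # number of other points strictly closer than eps in each marginal space
        nx = sum(1 for j in range(n) if j != i and abs(x[i] - x[j]) < eps)
        ny = sum(1 for j in range(n) if j != i and abs(y[i] - y[j]) < eps)
        total += digamma_int(nx + 1) + digamma_int(ny + 1)
    return digamma_int(k) + digamma_int(n) - total / n

rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(300)]
# y_dep has squared correlation 0.8 with x, so the true MI is -0.5*ln(0.2), about 0.80 nats
y_dep = [a + 0.5 * rng.gauss(0, 1) for a in x]
y_ind = [rng.gauss(0, 1) for _ in range(300)]  # independent of x, true MI is 0

mi_dep = ksg_mi_sketch(x, y_dep)
mi_ind = ksg_mi_sketch(x, y_ind)
```

The estimate for the dependent pair should be clearly positive, while the estimate for the independent pair should be near zero (it can be slightly negative). A practical implementation would use a spatial index for the neighbor queries instead of the quadratic loop.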
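from pycit import itest
# Test whether or not x and y are independent.
# x and y hold paired samples; confidence_level is chosen by the user, e.g. 0.95
pval = itest(x, y, test_args={'statistic': 'ksg_mi', 'n_jobs': 2})
# Independence is not rejected when the p-value is at least 1 - confidence_level
is_independent = (pval >= 1. - confidence_level)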
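The p-value above is permutation-based. As a rough illustration of the idea, the following sketch computes a permutation p-value for any dependence statistic by shuffling y to simulate the null hypothesis. It is a generic textbook construction, not pycit's code: `permutation_pvalue` and `abs_correlation` are hypothetical names, and absolute correlation stands in for an MI estimate to keep the example dependency-free.

```python
import random

def permutation_pvalue(x, y, statistic, n_perm=200, seed=0):
    """Permutation p-value for H0: x and y are independent.

    `statistic(x, y)` is any function that grows with dependence
    (a stand-in for an MI estimator).
    """
    rng = random.Random(seed)
    observed = statistic(x, y)
    y_perm = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # shuffling y breaks any pairing with x
        if statistic(x, y_perm) >= observed:
            exceed += 1
    # the +1 terms count the observed sample itself, so the p-value is never 0
    return (exceed + 1) / (n_perm + 1)

def abs_correlation(x, y):
    """Toy dependence statistic: absolute Pearson correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return abs(sxy) / (sxx * syy) ** 0.5

rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(200)]
y_dep = [a + 0.5 * rng.gauss(0, 1) for a in x]  # strongly dependent on x
y_ind = [rng.gauss(0, 1) for _ in range(200)]    # independent of x

pval_dep = permutation_pvalue(x, y_dep, abs_correlation)
pval_ind = permutation_pvalue(x, y_ind, abs_correlation)
```

With strongly dependent data, essentially no shuffled statistic reaches the observed one, so `pval_dep` lands at the smallest attainable value, 1 / (n_perm + 1). More permutations give a finer p-value resolution at proportionally higher cost, which is where parallel workers help.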
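from pycit import citest
# Test whether or not x and y are conditionally independent given z.
# A conditional test needs a CMI statistic, so 'ksg_cmi' is used here
pval = citest(x, y, z, test_args={'statistic': 'ksg_cmi', 'n_jobs': 2})
# Conditional independence is not rejected when the p-value is at least 1 - confidence_level
is_conditionally_independent = (pval >= 1. - confidence_level)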
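A conditional test cannot shuffle y freely, because that would also destroy the dependence that runs through z. The sketch below shows the simplest way around this, assuming z is discrete: shuffle y only among samples that share the same z value. This is an illustrative construction, not a description of how pycit handles continuous z; the function names and the toy covariance statistic are assumptions made for the example.

```python
import random
from collections import defaultdict

def abs_covariance(x, y):
    """Toy dependence statistic: absolute sample covariance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return abs(sum((a - mx) * (b - my) for a, b in zip(x, y))) / n

def stratified_permutation_pvalue(x, y, z, statistic, n_perm=200, seed=0):
    """Permutation p-value for H0: x is independent of y given a discrete z."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for idx, label in enumerate(z):
        strata[label].append(idx)

    def total(y_values):
        # sum the per-stratum statistic so only within-stratum dependence counts
        return sum(
            statistic([x[i] for i in idxs], [y_values[i] for i in idxs])
            for idxs in strata.values()
        )

    observed = total(y)
    y_perm = list(y)
    exceed = 0
    for _ in range(n_perm):
        for idxs in strata.values():
            values = [y_perm[i] for i in idxs]
            rng.shuffle(values)  # shuffle y only among samples sharing the same z
            for i, v in zip(idxs, values):
                y_perm[i] = v
        if total(y_perm) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)

rng = random.Random(2)
z = [i % 2 for i in range(200)]
x = [2 * c + rng.gauss(0, 1) for c in z]
y_ci = [2 * c + rng.gauss(0, 1) for c in z]     # related to x only through z
y_dep = [a + 0.5 * rng.gauss(0, 1) for a in x]  # depends on x even given z

pval_ci = stratified_permutation_pvalue(x, y_ci, z, abs_covariance)
pval_dep = stratified_permutation_pvalue(x, y_dep, z, abs_covariance)
```

Here `y_ci` is marginally correlated with x but conditionally independent given z, so its p-value behaves like a draw from the null distribution, while `pval_dep` should be very small.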
from pycit.markov_blanket import MarkovBlanket
# specify CI test configuration
cit_funcs = {
    'it_args': {
        'test_args': {
            'statistic': 'ksg_mi',
            'n_jobs': 2
        }
    },
    'cit_args': {
        'test_args': {
            'statistic': 'ksg_cmi',
            'n_jobs': 2
        }
    }
}
# find Markov blanket of Y. x_data contains data from predictor variables, X_1,...,X_m
mb = MarkovBlanket(x_data, y_data, cit_funcs)
markov_blanket = mb.find_markov_blanket()
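To show how conditional independence tests can drive Markov blanket selection, here is a minimal sketch of the classic grow-shrink procedure. Whether pycit's `MarkovBlanket` follows this exact scheme is not stated here, so treat it as a generic illustration; `grow_shrink_markov_blanket` and the `is_cond_independent` callback are hypothetical names, and the callback is replaced by a hand-written oracle so the example runs without data.

```python
def grow_shrink_markov_blanket(candidates, is_cond_independent):
    """Grow-shrink search for the Markov blanket of a target variable.

    `is_cond_independent(i, cond)` should return True when feature i is judged
    independent of the target given the features listed in `cond`.
    """
    blanket = []
    # Grow: keep adding features that are still dependent given the current blanket
    changed = True
    while changed:
        changed = False
        for i in candidates:
            if i not in blanket and not is_cond_independent(i, tuple(blanket)):
                blanket.append(i)
                changed = True
    # Shrink: drop features that the rest of the blanket makes redundant
    for i in list(blanket):
        rest = tuple(j for j in blanket if j != i)
        if is_cond_independent(i, rest):
            blanket.remove(i)
    return sorted(blanket)

def oracle(i, cond):
    """Hand-written ground truth for a toy problem with four features."""
    if i in (0, 2):
        return False      # features 0 and 2 always carry information about the target
    if i == 1:
        return 0 in cond  # feature 1 matters only until feature 0 is known
    return True           # feature 3 is irrelevant

result = grow_shrink_markov_blanket([1, 0, 2, 3], oracle)
```

In this toy run, feature 1 enters during the grow phase because it is tested before feature 0, and is removed again in the shrink phase, leaving features 0 and 2. With real data, the callback would wrap a conditional independence test and compare its p-value against the chosen threshold.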
Requirements: numpy, scipy, scikit-learn