FenTechSolutions / CausalDiscoveryToolbox

Package for causal inference in graphs and in the pairwise setting. Tools for graph structure recovery and dependency analysis are included.
https://fentechsolutions.github.io/CausalDiscoveryToolbox/html/index.html
MIT License

Error when using bnlearn packages with integer or character values for variables #120

Open OJ4227 opened 2 years ago

OJ4227 commented 2 years ago

I am working on Windows with Python 3.9.1, cdt 0.5.23, and PyTorch 1.10.0.

The variables in the data I'm using to predict a DAG are discrete and can take a value of -1, 0, 1, 2, or 3. When I use this data with any of the bnlearn algorithms I get the error:

RuntimeError: RProcessError R Process Error Output

Error in data.type(x) : variable X0 is not supported in bnlearn (type: integer). Calls: gs -> bnlearn -> check.data -> data.type Execution halted

However, when I change the values to -1.1, 0.1, 1.1, 2.1 and 3.1, there are no errors and it can predict a graph. Is it possible to get this to work with discrete integer or character data? And does this mean the algorithm is applying the wrong independence tests, since the CDT documentation says it can apply either discrete or continuous tests? I couldn't figure out whether I could pass an argument to GS() or GS().predict() to tell the R package what type of data it is and which test should be applied.

Any help or advice would be greatly appreciated!
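For reference, a minimal sketch of the float workaround described above (the column names and values are made up for illustration). It avoids the error only because the shifted columns become continuous, so a continuous test is applied on the R side rather than a discrete one:

import pandas as pd
from cdt.causality.graph import GS

data = pd.DataFrame({
    "X0": [-1, 0, 1, 2, 3, 1, 0, 2],
    "X1": [0, 1, 1, 3, 2, -1, 0, 1],
})

# model = GS(); model.predict(data)       # fails: "variable X0 is not supported in bnlearn (type: integer)"
data_shifted = data.astype(float) + 0.1   # non-integer floats, treated as continuous by bnlearn
model = GS()
graph = model.predict(data_shifted)       # returns a networkx graph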

diviyank commented 2 years ago

Hello! Sorry for the delayed answer. Did you try changing the independence test to one suited for discrete values?

Here is the doc on the available tests: https://fentechsolutions.github.io/CausalDiscoveryToolbox/html/causality.html#bnlearn-based-models

discrete case (categorical variables)

– mutual information: an information-theoretic distance measure.

It’s proportional to the log-likelihood ratio (they differ by a 2n factor) and is related to the deviance of the tested models. The asymptotic χ2 test (mi and mi-adf, with adjusted degrees of freedom), the Monte Carlo permutation test (mc-mi), the sequential Monte Carlo permutation test (smc-mi), and the semiparametric test (sp-mi) are implemented.

– shrinkage estimator for the mutual information (mi-sh)

An improved asymptotic χ2 test based on the James-Stein estimator for the mutual information.

– Pearson’s X2: the classical Pearson’s X2 test for contingency tables.

The asymptotic χ2 test (x2 and x2-adf, with adjusted degrees of freedom), the Monte Carlo permutation test (mc-x2), the sequential Monte Carlo permutation test (smc-x2) and the semiparametric test (sp-x2) are implemented.

discrete case (ordered factors)

– Jonckheere-Terpstra: a trend test for ordinal variables.

The asymptotic normal test (jt), the Monte Carlo permutation test (mc-jt) and the sequential Monte Carlo permutation test (smc-jt) are implemented.

Just pass the desired test name to the score parameter of your algorithm, for example:
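(A rough sketch of the intended call, assuming the score keyword is accepted by the bnlearn-based constructors as documented; the data file name is hypothetical. As the later comments show, this keyword was broken in the released version at the time.)

import pandas as pd
from cdt.causality.graph import GS

data = pd.read_csv("my_discrete_data.csv")   # hypothetical discrete/categorical dataset

model = GS(score="mi")        # mutual information test for categorical variables
# model = GS(score="x2")      # Pearson's X2 test
# model = GS(score="jt")      # Jonckheere-Terpstra test, for ordered factors
graph = model.predict(data)   # returns a networkx graph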

OJ4227 commented 2 years ago

Thanks for getting back to me

I tried to specify a score previously, but when I did I received the error:

algorithms.append(cdt.causality.graph.bnlearn.GS(score='mi'))  # Error! - bnlearn integer issue
TypeError: __init__() got an unexpected keyword argument 'score'

I also wasn't sure from the documentation which specific strings you're meant to pass, but from the list above it looks like it's mi, mc-mi, x2, etc.?

Angela446-lgtm commented 2 years ago

Hello, did you solve that issue? I am trying to run the bnlearn causal discovery algorithms, but I have to change the conditional independence test. I run into errors if I do GS(score='mi-cg') -- TypeError: __init__() got an unexpected keyword argument 'score'. Any suggestions?

diviyank commented 2 years ago

Hello, sorry, this is an error in the implementation! I just checked; I'll fix this ASAP.

Best, Diviyan

Angela446-lgtm commented 2 years ago

Thank you!!

diviyank commented 2 years ago

Hi, the fix is pushed to the GitHub repo, but I need to migrate the CI to CircleCI to be able to push to PyPI/Docker Hub. Best, Diviyan
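(Until the PyPI release catches up, installing straight from the repository with pip, e.g. pip install --upgrade git+https://github.com/FenTechSolutions/CausalDiscoveryToolbox.git, should pick up the pushed fix.)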

XMAHA commented 2 years ago

I use "m = GS(score='cor')", still got error: TypeError: init() got an unexpected keyword argument 'score'

diviyank commented 2 years ago

Yes the package isn't updated on pypi, I'll do it asap!


XMAHA commented 2 years ago

Thank you. I also wonder whether there is any way to extract the test score values from the GS algorithm, so we can know the dependency values. If we use the default mutual information score ('mi'), can any of the methods in cdt.independence.stats be used?


diviyank commented 1 year ago

I think that it should be okay, but I should check the bnlearn doc. There might be some compatibility issues depending on the method. Sorry, I was really busy; I'll update this soon.
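(A rough sketch of scoring pairwise dependencies outside of GS, since the GS wrapper returns just the recovered graph rather than the statistics computed inside bnlearn. It assumes cdt.independence.stats.AdjMI, an adjusted mutual information scorer, is available in this release with the usual predict / predict_undirected_graph interface; the file name is hypothetical.)

import pandas as pd
from cdt.independence.stats import AdjMI   # adjusted mutual information (assumed available)

data = pd.read_csv("my_discrete_data.csv")   # hypothetical discrete dataset

scorer = AdjMI()
pair_score = scorer.predict(data["X0"], data["X1"])   # dependency value for one pair of columns
skeleton = scorer.predict_undirected_graph(data)      # weighted undirected graph over all columns
print(pair_score)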

AMabona commented 1 year ago

bnlearn expects categorical variables to be factors. Just add dataset[sapply(dataset, is.character)] <- lapply(dataset[sapply(dataset, is.character)], as.factor) after https://github.com/FenTechSolutions/CausalDiscoveryToolbox/blob/d0bc352534dcbfac19a84a1bb05f33fe311378d2/cdt/causality/graph/R_templates/bnlearn.R#L25. (I'm not sure why bnlearn doesn't do this internally, to be honest.)
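(A small sketch for applying that one-line patch to a local install, assuming the R templates ship inside the installed cdt package; the path is inferred from the repository layout linked above.)

import os
import cdt.causality.graph as graph_module

# Locate the installed copy of bnlearn.R, then add the as.factor line by hand
# at the spot AMabona points to (line 25 in the linked revision).
template = os.path.join(os.path.dirname(graph_module.__file__), "R_templates", "bnlearn.R")
print(template)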