mckinsey / causalnex

A Python library that helps data scientists to infer causation rather than observing correlation.
http://causalnex.readthedocs.io/

do_intervention never ends running despite simple query #45

Closed ironcrypto closed 3 years ago

ironcrypto commented 4 years ago

Hi QB,

Description

I am running a do-calculus query on a small dataset (116x32) whose variables are discretized into 2 to 4 buckets. The BN fits the CPDs in 2 seconds, so fitting performance is reasonably good.

However, a simple do-intervention never finishes running. I waited several hours and then interrupted the kernel.

Steps to Reproduce

    from causalnex.inference import InferenceEngine
    ie = InferenceEngine(bn)
    ie.do_intervention("cD_TropCycl", {1: 0.2, 2: 0.8})
    print("distribution after do", ie.query()["cD_TropCycl"])
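For readers unfamiliar with the operation, here is a minimal sketch of what a do-intervention with a target distribution such as {1: 0.2, 2: 0.8} is meant to compute. It uses plain Python on a hypothetical three-node chain Z -> X -> Y with made-up probabilities. It does not use CausalNex and says nothing about how CausalNex implements inference internally. The names `states`, `p_z`, `p_x_given_z`, `p_y_given_x`, and `marginals` are assumptions introduced only for this illustration.

```python
from itertools import product

# Hypothetical toy chain Z -> X -> Y; all numbers are made up for illustration.
states = {"Z": [0, 1], "X": [1, 2], "Y": [0, 1]}
p_z = {0: 0.6, 1: 0.4}
p_x_given_z = {0: {1: 0.9, 2: 0.1}, 1: {1: 0.3, 2: 0.7}}
p_y_given_x = {1: {0: 0.8, 1: 0.2}, 2: {0: 0.25, 1: 0.75}}


def marginals(x_override=None):
    """Brute-force marginals of Z, X, and Y by enumerating the joint.

    If x_override is given, X ignores its parent Z and follows x_override
    instead, which is what do(X := x_override) means for this chain.
    """
    result = {name: {s: 0.0 for s in vals} for name, vals in states.items()}
    for z, x, y in product(states["Z"], states["X"], states["Y"]):
        # Under the intervention, the edge Z -> X is cut.
        p_x = x_override[x] if x_override is not None else p_x_given_z[z][x]
        joint = p_z[z] * p_x * p_y_given_x[x][y]
        result["Z"][z] += joint
        result["X"][x] += joint
        result["Y"][y] += joint
    return result


before = marginals()
after = marginals(x_override={1: 0.2, 2: 0.8})
print("Y before do:", before["Y"])
print("Y after do: ", after["Y"])
```

In this sketch the intervened node X takes on the supplied distribution, its parent Z is unaffected, and only the downstream node Y changes. Enumerating the full joint like this is feasible only for tiny networks, since the number of joint states grows exponentially with the number of nodes.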

Expected Result

Shouldn't it take just a few seconds given the low number of buckets? How long does it normally take?

Actual Result

No result is returned after hours of running a simple query.

Your Environment

Include as many relevant details about the environment in which you experienced the bug:

Thank you very much!!

ironcrypto commented 4 years ago

Hello @qbphilip, any chance I could get an answer or help on this? Thank you

mingli2607 commented 4 years ago

I have the same problem when I do an intervention on a slightly larger dataset. Some nodes can be queried easily while others cannot.

qbphilip commented 4 years ago

@ironcrypto Could you replicate this on a minimal example, e.g. on a small synthetic dataset?
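One way to build such a small synthetic dataset is sketched below. It generates a table with the same shape as the one in the report (116x32, 2 to 4 buckets per column) using only the standard library. The column names `c0`..`c31`, the function `make_synthetic_rows`, and the dependence rule between neighbouring columns are all hypothetical choices made for illustration. Nothing here is taken from the reporter's actual data.

```python
import random


def make_synthetic_rows(n_rows=116, n_cols=32, seed=0):
    """Return a list of dict rows holding integer bucket labels.

    Each column gets 2 to 4 buckets. Each column after the first follows
    the previous column (modulo its own bucket count) about 70% of the
    time, so the data has some structure to learn. Both the naming and
    the dependence rule are assumptions for this sketch.
    """
    rng = random.Random(seed)  # seeded so the dataset is reproducible
    n_states = [rng.randint(2, 4) for _ in range(n_cols)]
    rows = []
    for _ in range(n_rows):
        row = {}
        prev = None
        for j, k in enumerate(n_states):
            if prev is None or rng.random() < 0.3:
                value = rng.randrange(k)  # independent draw
            else:
                value = prev % k  # depends on the previous column
            row[f"c{j}"] = value
            prev = value
        rows.append(row)
    return rows


rows = make_synthetic_rows()
print(len(rows), "rows x", len(rows[0]), "columns")
```

The resulting list of dicts can be loaded with `pandas.DataFrame(rows)` and then passed through the same structure, fitting, and intervention steps as the real data to check whether the slowdown reproduces.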

ironcrypto commented 4 years ago

@qbphilip Yes, I think so. Do you have such a dataset or should I create it?

SteveLerQB commented 4 years ago

Hi @ironcrypto, it would be great if you could provide us with this dataset so we can debug this. Thanks :)

ironcrypto commented 4 years ago

@SteveLerQB Sure. Attached are the notebook and the dataset to reproduce the problem. You will see that I interrupted the last two do-calculus instructions because of the very high latency. I would be curious to see whether you get the same problem or whether the error is on my side. Maybe a bad discretization policy? I doubt it, because it is pretty simple.

github_qb_bug_45.zip

ironcrypto commented 4 years ago

Hi @SteveLerQB @qbphilip, have you been able to reproduce the bug, or do you observe the same latency? Thank you

FrancescaSogaroQB commented 3 years ago

Thank you @ironcrypto for reporting this. This is also linked to issue #100. The above has been fixed with this commit, which will be available in the next CausalNex release.