p-value NaN in some pathological cases with non-bootstrap method

Padarn commented 1 year ago

Describe the bug

In some pathological cases it's possible for the p-value of a refuter to be NaN, in particular if all of the simulations return the same value.

Identified while looking at https://github.com/py-why/dowhy/issues/804

Steps to reproduce the behavior

import random
random.seed(1)

from dowhy import CausalModel
import dowhy.datasets

data = dowhy.datasets.linear_dataset(
    beta=10,
    num_common_causes=3,
    num_instruments=2,
    num_samples=10000,
    treatment_is_binary=True)

model = CausalModel(
    data=data["df"],
    treatment=data["treatment_name"],
    outcome=data["outcome_name"],
    graph=data["gml_graph"])
print("identify")
identified_estimand = model.identify_effect()
print("estimate")
estimate = model.estimate_effect(identified_estimand,
                                 method_name="backdoor.propensity_score_matching")
print("refute")
refute_results = model.refute_estimate(identified_estimand,
                                       estimate,
                                       method_name="random_common_cause",
                                       num_simulations=20,
                                       show_progress_bar=True)
print(refute_results)

Produces

Refute: Add a random common cause
Estimated effect:10.720735355706834
New effect:10.720735355706834
p value:nan

Expected behavior

This is unclear, which is why I am opening an issue rather than submitting a fix.

Version information:

DoWhy version installed from main at commit 97e6bdc3db137280fdb8812dfba34de14a248c72

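The root cause is in the refuter's p-value calculation, which assumes that the standard deviation of the simulated estimates is well defined and nonzero. When every simulation returns the same value, the standard deviation is zero and the test statistic is undefined.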

This would be easy to fix by setting the p-value to 1 in this scenario. WDYT?
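
To make the proposal concrete, here is a minimal sketch of a normal-test p-value with a guard for zero spread. It is not DoWhy's actual code: `normal_test_p_value` is a hypothetical helper, it uses only the standard library, and it assumes a one-sided tail probability under a normal approximation. Returning 0 when the estimate differs from the constant simulated value is an extra assumption of this sketch, not something proposed above. In pure Python the unguarded division would raise `ZeroDivisionError`; with NumPy floats, 0/0 evaluates to NaN instead, which is consistent with the `nan` in the output above.

```python
import math
import statistics


def normal_test_p_value(estimate, simulations):
    """One-sided normal-test p-value with a zero-spread guard (hypothetical helper)."""
    mean = statistics.mean(simulations)
    std = statistics.pstdev(simulations)  # population standard deviation

    if std == 0:
        # Degenerate case: every simulation returned the same value, so the
        # z-score is undefined. Proposed behaviour: p = 1 when the estimate
        # matches the simulations. Returning 0 otherwise is an assumption.
        return 1.0 if math.isclose(estimate, mean) else 0.0

    z = (estimate - mean) / std
    # Tail probability of a standard normal beyond |z|.
    return 0.5 * math.erfc(abs(z) / math.sqrt(2))


# All 20 simulations equal the original estimate, as in the report above.
print(normal_test_p_value(10.720735355706834, [10.720735355706834] * 20))  # 1.0
```

With this guard the refuter would report `p value:1.0` for the reproduction above instead of `nan`.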

amit-sharma commented 1 year ago

hey @padarn, did your PR #806 fix this issue too? Or do you still see this error?

Padarn commented 1 year ago

Sorry for the slow reply. No, it doesn't fix the issue. I wasn't sure whether the behaviour I suggested above would be the desired one. If so, I'll submit a new PR.

p-value NaN in some pathological cases with non-bootstrap method #807