pymc-labs / CausalPy

A Python package for causal inference in quasi-experimental settings
https://causalpy.readthedocs.io
Apache License 2.0
875 stars, 63 forks

counterfactual analysis ANCOVA #230

Closed priamai closed 1 year ago

priamai commented 1 year ago

Hi there, I am quite new to all of this and I really liked the GeoLift example. Would it also be possible to estimate the effect of the treatment (refurbishment) on the outcomes of the other countries? If so, how can I do that?

I was thinking of maybe swapping the variables, but then the model wouldn't have learned the treatment effect. Any help is welcome!

import causalpy as cp

# df, treatment_time, and seed are assumed to be defined as in the GeoLift example
result = cp.pymc_experiments.SyntheticControl(
    df,
    treatment_time,
    formula="Croatia ~ 0 + Austria + Belgium + Bulgaria + Denmark + Cyprus + Czech_Republic",
    model=cp.pymc_models.WeightedSumFitter(
        sample_kwargs={"target_accept": 0.95, "random_seed": seed}
    ),
)

drbenvincent commented 1 year ago

Hi @priamai. Thanks for the question. I believe that a core part of the synthetic control approach is that the treated unit receives the treatment and the control units are entirely unaffected by it.

I am not deep into GeoLift methods at this point, but I believe it is common to deal with multiple treated groups. In that case I believe the approach is to combine the treated groups into a single new aggregate treated column. The untreated groups are still assumed to be unaffected by the treatment.
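As a rough illustration of the aggregation idea above, here is a minimal pandas sketch. It is not taken from CausalPy or from the GeoLift example: the toy numbers, the choice of Croatia and Cyprus as the treated units, the column name `treated_agg`, and the use of a mean rather than a sum are all assumptions made purely for illustration.

```python
import pandas as pd

# Hypothetical toy data: one column per country, one row per time period.
df = pd.DataFrame(
    {
        "Croatia": [10.0, 11.0, 12.0, 15.0],
        "Cyprus": [20.0, 21.0, 22.0, 27.0],
        "Austria": [30.0, 31.0, 32.0, 33.0],
        "Belgium": [40.0, 41.0, 42.0, 43.0],
    }
)

# Assumed treated units (illustrative only); everything else is a control.
treated = ["Croatia", "Cyprus"]
controls = [c for c in df.columns if c not in treated]

# Collapse the treated units into one aggregate column. The mean is one
# possible choice; a sum or a weighted average are alternatives.
df_agg = df[controls].copy()
df_agg["treated_agg"] = df[treated].mean(axis=1)

# Build a formula in the same style as the original post (no intercept).
formula = "treated_agg ~ 0 + " + " + ".join(controls)
print(formula)
```

The resulting `df_agg` and `formula` could then be passed to a synthetic control model in place of the original `df` and formula. The controls are still assumed to be unaffected by the treatment, so this does not estimate spillover effects on them.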

I'm not saying that it's impossible, but I believe vanilla synthetic control will not be able to estimate the effect of the treatment on the other (control) countries.

So I'm going to close this issue. But if there's a published approach for doing this, feel free to open a new issue that gives all the details of the requested model, along with relevant papers or datasets (simulated or real).

priamai commented 1 year ago

Thanks for the suggestion, I will see what I can find!