pymc-labs / CausalPy

A Python package for causal inference in quasi-experimental settings
https://causalpy.readthedocs.io
Apache License 2.0

Multivariate synthetic control #252

Open kloskas opened 9 months ago

kloskas commented 9 months ago

It seems that synthetic control can't handle multiple features per unit (i.e. it can't use any variable other than the target outcome for each unit).

Is this functionality considered for future improvements?

drbenvincent commented 9 months ago

Hi. It's not clear to me if you are talking about multivariate outcome or additional predictors. Can you point me to a paper, or package, or just give a more complete description of what you mean?

kloskas commented 9 months ago

Yes, I was thinking about multiple variables per unit. This package, for example, https://github.com/OscarEngelbrektson/SyntheticControlMethods/tree/master handles units with multiple features besides the target (in a different data format: time + units are in rows, all features are in columns).
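To illustrate the data layout mentioned above, here is a minimal sketch in pandas. The column names (`year`, `unit`, `gdp`, `unemployment`) and the numbers are made up for illustration, and the "wide" outcome-only layout is only an assumption about what a single-outcome synthetic control typically consumes, not a statement about either package's API.

```python
import pandas as pd

# Hypothetical long-format data: one row per (time, unit), one column per feature.
long_df = pd.DataFrame(
    {
        "year": [2000, 2000, 2001, 2001],
        "unit": ["A", "B", "A", "B"],
        "gdp": [1.0, 2.0, 1.1, 2.1],
        "unemployment": [5.0, 6.0, 5.2, 5.9],
    }
)

# Outcome-only wide layout: time in rows, one column per unit.
# All features other than the outcome are dropped here.
wide_gdp = long_df.pivot(index="year", columns="unit", values="gdp")

# Multi-feature layout: one row per feature, one column per unit
# (here each feature is averaged over the available years).
features_by_unit = long_df.groupby("unit")[["gdp", "unemployment"]].mean().T
```

The second table is the kind of features-by-units matrix that a multivariate fit would work from, whereas the first keeps only the target outcome.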

Another example is the MSCMT package in R, https://cran.r-project.org/web/packages/MSCMT/MSCMT.pdf (one of several in R, where this is much more common). It uses W weights per unit plus V weights per variable (shared between units).

I think this is a fantastic package, but my impression is that we can only consider the outcome variable for each unit (e.g. GDP in the famous Germany example). That misses the opportunity to enrich units with multiple variables for a more granular weighted fit (e.g. GDP, industrialization, wage level, unemployment level... each one with its own weight for each country).
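To make the idea concrete, here is a minimal sketch of the weighted fit described above, using only NumPy. It is not CausalPy's implementation or MSCMT's algorithm: the function names are hypothetical, the data are random, and the feature weights V are held fixed rather than optimised (the R packages typically search over V as well). Given fixed V, the unit weights W are chosen to minimise the V-weighted squared distance between the treated unit's features and the weighted control features, subject to W being non-negative and summing to one, via projected gradient descent.

```python
import numpy as np


def project_to_simplex(w):
    """Euclidean projection of w onto {w >= 0, sum(w) == 1}."""
    u = np.sort(w)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(w) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(w - theta, 0.0)


def weighted_loss(X0, x1, v, w):
    """V-weighted squared distance between treated and synthetic features."""
    return float(np.sum(v * (x1 - X0 @ w) ** 2))


def fit_unit_weights(X0, x1, v, n_iter=5000):
    """Hypothetical helper: fit unit weights W for fixed feature weights V.

    X0: (n_features, n_controls) features of the control units
    x1: (n_features,) features of the treated unit
    v:  (n_features,) non-negative feature weights, shared across units
    """
    n_controls = X0.shape[1]
    w = np.full(n_controls, 1.0 / n_controls)  # start from uniform weights
    # Fold V into the problem: minimise 0.5 * ||A w - b||^2 over the simplex.
    A = X0 * np.sqrt(v)[:, None]
    b = x1 * np.sqrt(v)
    # Step size 1/L, where L is the Lipschitz constant of the gradient.
    step = 1.0 / (np.linalg.norm(A, 2) ** 2 + 1e-12)
    for _ in range(n_iter):
        grad = A.T @ (A @ w - b)
        w = project_to_simplex(w - step * grad)
    return w


# Random illustrative data: 4 features (e.g. GDP, wages, ...) for 5 control units.
rng = np.random.default_rng(0)
X0 = rng.normal(size=(4, 5))
x1 = X0 @ np.array([0.5, 0.3, 0.2, 0.0, 0.0])  # treated unit lies in the convex hull
v = np.array([0.4, 0.3, 0.2, 0.1])  # assumed, fixed feature weights

w = fit_unit_weights(X0, x1, v)
uniform = np.full(5, 0.2)
```

With an outcome-only fit, `X0` would contain just the pre-treatment outcome series; the multivariate version stacks additional features as extra rows and lets `v` control how much each one matters.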

If this is already possible, I'd appreciate an example.

drbenvincent commented 9 months ago

Thanks, this clears things up a lot. I've only taken a brief look so far, but I don't see any major reason why this couldn't be done. Very initial thoughts, but I think we'd need:

Adding this new functionality would be cool. Can't promise anything about the timeline at this point. We'd certainly be open to pull requests. Otherwise I'm very happy to have this as an open feature request. If this is for a commercial application then it could be expedited by engaging PyMC Labs in some capacity - if that is relevant, feel free to get in touch.