jamalsenouci / causalimpact

Python port of CausalImpact R library
Apache License 2.0

Only allows with a predictor time series #3

Closed (dom-devel closed this issue 6 years ago)

dom-devel commented 7 years ago

Hello! Been really enjoying the library as it's saved me having to either re-learn R or get R and pandas talking.

One notable question:

In analysis.py you currently block any attempt to use the model without a predictor time series. In R it's still possible to use it with a single Series, and it remains useful (although obviously, without a strong control, actually measuring uplift, for example, is very hard).

Was just wondering about the decision behind this: is it a functional thing or just a spare time thing (with this presumably being a fun side project)?

I can bypass it by looking for the first nonzero value in the first column rather than the second, or by just forcing it to take the max from the pre-period, but this only gives me a flat line prediction, unlike in R, and at this point I'm reaching the end of my skills.

if data.shape[1] == 1:  # no exogenous values provided
    raise ValueError("data contains no exogenous variables")
non_null = pd.isnull(data.iloc[:, 1]).nonzero()
first_non_null = non_null[0]
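
For anyone trying the same bypass, here is a minimal standalone sketch of the guard plus locating the first non-missing row, which the variable name `first_non_null` suggests is the intent. This is not the library's code: `first_non_null_row` and its `column` argument are hypothetical names made up for illustration, and it uses `notna()` with `np.flatnonzero` because `Series.nonzero()` was removed in pandas 1.0.

```python
import numpy as np
import pandas as pd


def first_non_null_row(data: pd.DataFrame, column: int = 1) -> int:
    """Hypothetical helper: position of the first non-missing row in `column`.

    column=1 mirrors the quoted check (first covariate column);
    column=0 mirrors the workaround of inspecting the response column.
    """
    if data.shape[1] <= column:
        # Same message as the quoted guard when there is no covariate column.
        raise ValueError("data contains no exogenous variables")
    not_null = data.iloc[:, column].notna().to_numpy()
    positions = np.flatnonzero(not_null)  # integer positions of non-missing rows
    if positions.size == 0:
        raise ValueError("column contains only missing values")
    return int(positions[0])


df = pd.DataFrame({"y": [np.nan, 1.0, 2.0], "x": [np.nan, np.nan, 3.0]})
first_in_covariate = first_non_null_row(df, column=1)
first_in_response = first_non_null_row(df, column=0)
```

Calling it on a single-column frame with the default `column=1` raises the same `ValueError` as the quoted guard.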

jamalsenouci commented 6 years ago

Hello! It was sort of intentional, the reasoning was as follows.

If you are not using exogenous variables as your control, then you should be providing a custom model rather than the default one. So it is possible: see section 8 in https://github.com/jamalsenouci/causalimpact/blob/master/GettingStarted.ipynb (if you build a custom model without an exog parameter, it should do what you want).

It's slightly prescriptive, but I was hoping it would make the model being used a bit more explicit.