matteocourthoud / Blog-Posts

Code and notebooks for my Medium blog posts

Question on Conclusion of this ROI Notebook #4

Open BrianMiner opened 1 year ago

BrianMiner commented 1 year ago

This is a great notebook, enjoyed reading it. I do have one question that is really bugging me.

It is established that creating an auxiliary variable of revenue divided by cost:

df["rho"] = df["revenue"] / df["cost"]
smf.ols("rho ~ new_machine", df).fit().summary().tables[1]

does not represent $\frac{\Delta R}{\Delta C}$

But if this is the case, how does the regression at the end, which is run on an auxiliary variable that is essentially df["revenue"] - df["cost"] and a couple of constants, adequately represent $\Delta R - \Delta C$? Isn't this conceptually the same thing as above?

matteocourthoud commented 1 year ago

Hi Brian, thanks for the question!

The intuition behind the result is indeed not trivial. However, remember that the purpose of the auxiliary variable is not to "represent" $\frac{\Delta R}{ \Delta C}$, but to generate a variable whose difference-in-means has the same variance as the ROI estimator.
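One way to see why such a variable can exist is a first-order (delta-method) sketch, not a formal proof. Write $\widehat{\Delta R}$ and $\widehat{\Delta C}$ for the estimated differences in means between treatment and control (the hats and the unit-level subscripts $R_i$, $C_i$ are notation introduced here, not taken from the notebook), and use the fact that $\Delta R - \rho \Delta C = 0$ at the true ROI:

$$
\hat{\rho} - \rho
= \frac{\widehat{\Delta R} - \rho \, \widehat{\Delta C}}{\widehat{\Delta C}}
\approx \frac{\widehat{\Delta R} - \rho \, \widehat{\Delta C}}{\Delta C}
\quad \Longrightarrow \quad
Var[\hat{\rho}] \approx \frac{Var[\widehat{\Delta R} - \rho \, \widehat{\Delta C}]}{\Delta C^2}
$$

The numerator $\widehat{\Delta R} - \rho \, \widehat{\Delta C}$ is itself a difference in means of the unit-level variable $R_i - \rho C_i$. To first order, then, the sampling noise of the ratio is the noise of a difference in means, rescaled by a constant.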

As for intuition, here is a different way to build the estimator that may be easier to follow. You can also write the estimator as an extremum estimator, i.e. as the solution of $\hat{\rho} : \ \mathbb{E} [ \Delta R - \rho \Delta C]=0$

or equivalently $\hat{\rho} = \arg\min_\rho \left( \mathbb{E} [ \Delta R - \rho \Delta C] \right)^2$

I.e. you want to estimate the ROI as the value that implies the same "baseline revenue" across the treatment and control groups, on average. With this approach you get the same variance, but with a different interpretation: it is the variance of the objective function, $Var[ \Delta R - \rho \Delta C]$, multiplied by the inverse of its squared derivative with respect to $\rho$, $\frac{1}{\mathbb{E}[\Delta C]^2}$. This may be more intuitive because it makes clear where the "baseline revenue" comes from: its expected value is the objective function.
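The variance formula above can be checked numerically. The sketch below is only an illustration under stated assumptions: the data-generating process in `simulate`, its coefficients, and the helper `roi_and_se` are hypothetical and are not taken from the notebook. It uses only the Python standard library, treats $\Delta R$ and $\Delta C$ as treatment-minus-control differences in means, and compares the formula-based standard error with the spread of $\hat{\rho}$ across repeated simulated experiments.

```python
import math
import random
import statistics


def simulate(rng, n=1000):
    """Hypothetical experiment: treatment raises cost by 5 and revenue by 10 (true ROI = 2)."""
    rows = []
    for _ in range(n):
        t = 1 if rng.random() < 0.5 else 0
        cost_noise = rng.gauss(0, 2)
        cost = 10 + 5 * t + cost_noise
        # Revenue shares part of the cost noise, so the two outcomes are correlated.
        revenue = 20 + 10 * t + 0.5 * cost_noise + rng.gauss(0, 4)
        rows.append((t, revenue, cost))
    return rows


def roi_and_se(rows):
    """Return the ratio estimate and the standard error implied by Var[dR - rho*dC] / dC^2."""
    treated = [(r, c) for t, r, c in rows if t == 1]
    control = [(r, c) for t, r, c in rows if t == 0]

    delta_r = statistics.fmean([r for r, _ in treated]) - statistics.fmean([r for r, _ in control])
    delta_c = statistics.fmean([c for _, c in treated]) - statistics.fmean([c for _, c in control])
    rho = delta_r / delta_c

    # Unit-level "baseline revenue" R_i - rho * C_i in each group.
    g_treated = [r - rho * c for r, c in treated]
    g_control = [r - rho * c for r, c in control]

    # Variance of the difference in means of the baseline revenue.
    var_diff = (statistics.variance(g_treated) / len(g_treated)
                + statistics.variance(g_control) / len(g_control))

    # Rescale by the inverse squared derivative with respect to rho.
    return rho, math.sqrt(var_diff / delta_c ** 2)


rng = random.Random(0)
draws = [roi_and_se(simulate(rng)) for _ in range(300)]

mc_sd = statistics.stdev([rho for rho, _ in draws])   # spread of rho_hat across experiments
avg_se = statistics.fmean([se for _, se in draws])    # average formula-based standard error

print(f"Monte Carlo sd of rho_hat: {mc_sd:.4f}")
print(f"Average formula-based se:  {avg_se:.4f}")
```

If the approximation holds, the two printed numbers should be close. In this sketch $\Delta C$ is far from zero; with a small or noisy $\Delta C$ the first-order approximation can be much less accurate.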

I might add this to the article, since it may be helpful to others as well.