matheusfacure / python-causality-handbook

Causal Inference for the Brave and True. A light-hearted yet rigorous approach to learning about impact estimation and causality.
https://matheusfacure.github.io/python-causality-handbook/landing-page.html
MIT License

Chapter 22 - Non-Parametric Double/Debiased ML #125

Open tiantiancoder opened 2 years ago

tiantiancoder commented 2 years ago

Thank you for your tutorials! But I am confused about the Non-Parametric Double/Debiased ML method.

[image: screenshot of the chapter's Non-Parametric Double/Debiased ML loss function]

From the loss function, we can see that its CATE is still a fixed constant for each unit X. So how does it learn a non-linear CATE? Looking forward to your reply!

matheusfacure commented 2 years ago

It doesn't! It learns a local linear CATE. I try to explain that in the following section. Have you read it? If it's not clear, let me know.
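For reference, here is a sketch of the rewrite behind that answer (notation assumed from the chapter: $\tilde{Y}_i$ and $\tilde{T}_i$ are the outcome and treatment residuals, $\tau(X_i)$ the CATE):

$$
\hat{L}_n(\tau) = \frac{1}{n}\sum_{i=1}^{n}\big(\tilde{Y}_i - \tau(X_i)\,\tilde{T}_i\big)^2
= \frac{1}{n}\sum_{i=1}^{n}\tilde{T}_i^{2}\left(\frac{\tilde{Y}_i}{\tilde{T}_i} - \tau(X_i)\right)^{2}
$$

Since $\tau(X_i)$ enters the loss multiplying $\tilde{T}_i$, the effect is linear in the treatment around each unit, but its slope can vary flexibly with $X$. This is also why the code further down regresses $y^\star = \tilde{Y}/\tilde{T}$ with weights $w = \tilde{T}^2$.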

tiantiancoder commented 2 years ago

Thank you very much for clearing up my confusion. I have read the following section. When applying Non-Parametric Double/Debiased ML to data where the discount affects sales non-linearly, I want to know why X is the discount residual and not the discount itself in the final non_param model. As I understand it, the X fitted into the final-stage model in DML is the features X, not the residual.

import numpy as np
from lightgbm import LGBMRegressor
from sklearn.model_selection import cross_val_predict

debias_m = LGBMRegressor(max_depth=3)
denoise_m = LGBMRegressor(max_depth=3)

# orthogonalising step
discount_res = discount.ravel() - cross_val_predict(debias_m, np.ones(discount.shape), discount.ravel(), cv=5)
sales_res = sales.ravel() - cross_val_predict(denoise_m, np.ones(sales.shape), sales.ravel(), cv=5)

# final, non parametric causal model
non_param = LGBMRegressor(max_depth=3)
w = discount_res ** 2 
y_star = sales_res / discount_res

# here: why is X discount_res and not discount?
non_param.fit(X=discount_res.reshape(-1,1), y=y_star.ravel(), sample_weight=w.ravel());

matheusfacure commented 2 years ago

You are correct. X is what goes to the final model as the features. I can't find that piece of code. Can you point me to it? Here is what I found in the book:

model_final = LGBMRegressor(max_depth=3)

# create the weights
w = train_pred["price_res"] ** 2 

# create the transformed target
y_star = (train_pred["sales_res"] / train_pred["price_res"])

# use a weighted regression ML model to predict the target with the weights.
model_final.fit(X=train[X], y=y_star, sample_weight=w);
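For completeness, a minimal usage sketch of this final model (test and the feature list X are assumed from the chapter's earlier cells):

# hypothetical usage: CATE predictions for held-out units come from the
# same feature matrix the model was fit on
cate_pred = model_final.predict(test[X])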
tiantiancoder commented 2 years ago

I found it in the section named "What is Non-Parametric About?" of Chapter 22. Here is the link: https://matheusfacure.github.io/python-causality-handbook/22-Debiased-Orthogonal-Machine-Learning.html#what-is-non-parametric-about. It is in the third code block.

matheusfacure commented 2 years ago

Oh, I see. That's a bug :) Since there are no features in this case, X should only be a constant hehe. I'll fix it.

matheusfacure commented 2 years ago

It should be:

debias_m = LGBMRegressor(max_depth=3)
denoise_m = LGBMRegressor(max_depth=3)

# orthogonalising step
discount_res = discount.ravel() - cross_val_predict(debias_m, np.ones(discount.shape), discount.ravel(), cv=5)
sales_res = sales.ravel() - cross_val_predict(denoise_m, np.ones(sales.shape), sales.ravel(), cv=5)

# final, non parametric causal model
non_param = LGBMRegressor(max_depth=3)
w = discount_res ** 2 
y_star = sales_res / discount_res

# with no features, X is just a constant (a column of ones)
non_param.fit(X=np.ones((len(discount_res), 1)), y=y_star.ravel(), sample_weight=w.ravel());
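A quick way to see what this constant-feature fit learns (a sketch reusing the arrays above): predicting on a column of ones returns the same value for every row, so the final model effectively recovers a single weighted average effect rather than a function of the discount.

# every prediction is identical: with only a constant feature, the model
# has learned one number (an average treatment effect)
ate_hat = non_param.predict(np.ones((len(discount_res), 1)))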
andreadisimone commented 2 years ago

Hi. Sorry to come back to this old issue, but I am having the same difficulty understanding the code. Fitting non_param with np.ones does not make much sense, does it? How am I going to run predictions on this? In the DGP, the elasticity depends on the discount, so I would pass the discount as a feature when predicting, which means I should also use the discount as X when fitting. Am I missing anything?

Thanks for the great book,

Andrea.
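What Andrea describes would look something like the sketch below (hypothetical, not the book's code; it reuses discount, y_star, and w from the snippets above and passes the discount itself as the feature of the final model):

# hypothetical: let the estimated elasticity vary with the discount level
non_param_t = LGBMRegressor(max_depth=3)
non_param_t.fit(X=discount.reshape(-1, 1), y=y_star.ravel(), sample_weight=w.ravel())

# predictions now depend on the discount passed in
elasticity_hat = non_param_t.predict(discount.reshape(-1, 1))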