Thanks for creating a great resource! The suggested pseudo outcome for the continuous treatment case is
So it is the same as the R-learner but without weights. I would like to understand why this last simplification step is not made in the book; instead you use `(y - y_pred) * (t - t_pred)`, which only captures the sign of the effect.

I am also wondering about the following: weighted linear regression can be done by multiplying both `X` and `y` by `ws = sqrt(w)`, where `w` are the weights, i.e. `Xw = X * ws` and `yw = y * ws`, so that `beta = inv(Xw'Xw) * Xw'yw` gives the weighted OLS coefficients. Thus, with a linear final stage, the R-learner simply uses `y - y_pred` as the outcome and `Xw` as the predictor matrix.
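To make sure I am stating the weighted-regression point correctly, here is a small NumPy/scikit-learn sketch of what I mean. Everything in it is simulated for illustration: `y_pred` and `t_pred` just stand in for cross-fitted first-stage predictions, and none of it is taken from the book's code.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n, p = 1000, 3

# Simulated data: X confounders, continuous treatment t, outcome y.
X = rng.normal(size=(n, p))
t = X @ np.array([0.5, -0.2, 0.1]) + rng.normal(size=n)
tau = 1.0 + 2.0 * X[:, 0]                       # (linear) treatment effect
y = 0.3 * X[:, 1] + tau * t + rng.normal(size=n)

# Stand-ins for the first-stage predictions of t and y given X.
t_pred = X @ np.linalg.lstsq(X, t, rcond=None)[0]
y_pred = X @ np.linalg.lstsq(X, y, rcond=None)[0]
t_res, y_res = t - t_pred, y - y_pred

X1 = np.column_stack([np.ones(n), X])           # add an intercept column

# R-learner final stage as a weighted regression:
# pseudo outcome (y_res / t_res) on X, with weights w = t_res**2.
w = t_res ** 2
wls = LinearRegression(fit_intercept=False).fit(
    X1, y_res / t_res, sample_weight=w)

# Same coefficients via the sqrt-weight trick: scale X and the pseudo
# outcome by ws = sqrt(w) = |t_res| and run plain OLS.
ws = np.abs(t_res)
ols_scaled = np.linalg.lstsq(X1 * ws[:, None], (y_res / t_res) * ws, rcond=None)[0]

# And since the objectives are algebraically identical, scaling by the
# *signed* residual works too: regress y_res directly on X1 * t_res.
ols_signed = np.linalg.lstsq(X1 * t_res[:, None], y_res, rcond=None)[0]

print(np.allclose(wls.coef_, ols_scaled))       # True
print(np.allclose(wls.coef_, ols_signed))       # True
```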
I am wondering whether this would not also be a better approach for non-linear final stages that don't support sample weights, i.e. using `Xw` to predict `y - y_pred`?
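This is roughly what I have in mind for the non-linear case, continuing from the snippet above. The `GradientBoostingRegressor` is just a stand-in (it actually does accept `sample_weight`, which is why it can also serve as the weighted baseline); whether the scaled-feature variant is a sensible substitute when the final stage has no `sample_weight` argument is exactly what I am asking.

```python
from sklearn.ensemble import GradientBoostingRegressor

# Proposed variant: scale the features by the treatment residual and fit
# the final stage on the raw outcome residual, mirroring the linear case.
Xw = X1 * t_res[:, None]
scaled_final = GradientBoostingRegressor(random_state=0).fit(Xw, y_res)

# Usual weighted final stage: pseudo outcome with sample weights.
weighted_final = GradientBoostingRegressor(random_state=0).fit(
    X1, y_res / t_res, sample_weight=t_res ** 2)
```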