Closed ShaileeImrith closed 2 years ago
I don't think that's a problem.
Because if D is binary, the residuals will be "D (1 or 0) - estimated probability", which is a continuous variable, since self._model.predict_proba is used.
https://github.com/microsoft/EconML/blob/master/econml/dml/dml.py
Hi, thank you for your comment. I am still a bit confused, though. A classifier would predict binary values (0 or 1; for example, sklearn's RandomForestClassifier), and therefore residuals_d will also take discrete values, not continuous ones. So my question is: is there anything conceptually wrong with residuals_d being discrete (values can be 0, 1 or -1)? The final ols step then is a linear regression of the continuous residuals_y on the discrete residuals_d?
Hi!
is there anything conceptually wrong with residuals_d being discrete (values can be 0, 1 or -1)? The final ols step then is a linear regression of the continuous residuals_y on the discrete residuals_d?
I think that the output of the treatment model (the classifier) should be a continuous variable (a predicted probability between 0 and 1) rather than a discrete one (1 or 0), since it should essentially work as a propensity score. If we leave the output as binary, we will lose a lot of information about the assignment tendency.
Example, for a unit with observed T_i = 1:
[Output is continuous] When P[T_i=1|X_i] = 0.6, residual: 1 - 0.6 = 0.4. When P[T_i=1|X_i] = 0.9, residual: 1 - 0.9 = 0.1.
[Output is binary] When P[T_i=1|X_i] = 0.6, residual: 1 - 1 = 0. When P[T_i=1|X_i] = 0.9, residual: 1 - 1 = 0.
Also, if we make the treatment residuals discrete, there is a risk that their variance will be smaller than it needs to be. If the variance of the treatment residuals is small, the variance of the final estimator will be large.
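The arithmetic in the example above can be reproduced with a few lines of plain Python. This is only an illustrative sketch: the 0.5 threshold used to turn a probability into a hard label is an assumption (it matches the usual default for a binary classifier's predict), and the variable names are made up for this example.

```python
# Observed treatment for the unit in the example.
observed_t = 1

results = []
for p in (0.6, 0.9):
    # Assumed decision rule: predict class 1 when the probability is >= 0.5.
    hard_label = 1 if p >= 0.5 else 0
    soft_residual = round(observed_t - p, 2)   # uses the predicted probability
    hard_residual = observed_t - hard_label    # uses the predicted class
    results.append((p, soft_residual, hard_residual))

for p, soft_residual, hard_residual in results:
    print(f"p={p}: probability residual={soft_residual}, label residual={hard_residual}")
```

Both units get a label residual of 0, so the difference between a 0.6 and a 0.9 propensity disappears once the prediction is thresholded.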
I wrote some sample code (some of it is in Japanese, sorry). I got the data from here (https://rdrr.io/cran/DoubleML/src/R/datasets.R).
e401k : Binary variable (treatment)
net_tfa : Outcome
t_res : Residuals of e401k
y_res : Residuals of net_tfa
t_res = e401k (observed, 1 or 0) - first_t_model.predict_proba (estimated assignment probability)
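The sample code itself is not reproduced in this thread. As a rough sketch of how the two kinds of treatment residuals could be computed, the snippet below uses synthetic data in place of the e401k dataset and scikit-learn's LogisticRegression as a stand-in for first_t_model; both choices, and all variable names other than t_res, are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

# Synthetic stand-in for the real data (assumption: not the e401k dataset).
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))
true_propensity = 1.0 / (1.0 + np.exp(-X[:, 0]))
t = rng.binomial(1, true_propensity)  # binary treatment, 0 or 1

clf = LogisticRegression()

# Out-of-fold predicted probability of treatment (column 1 is P[T=1 | X]).
p_hat = cross_val_predict(clf, X, t, cv=5, method="predict_proba")[:, 1]
# Out-of-fold predicted class label (0 or 1).
t_hat = cross_val_predict(clf, X, t, cv=5, method="predict")

t_res_proba = t - p_hat   # continuous residuals, strictly between -1 and 1
t_res_label = t - t_hat   # discrete residuals, only -1, 0 or 1

print("distinct label residuals:", np.unique(t_res_label))
print("probability residual range:", t_res_proba.min(), t_res_proba.max())
```

The label-based residuals are zero for every correctly classified unit, while the probability-based residuals keep a separate value for each unit.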
@ShaileeImrith If your treatment is discrete, then you should pass discrete_treatment=True to the DML initializer. Then we will expect the model_t to be a classification model, and as @MasaAsami says, we will use the predicted class probabilities (predict_proba output of the classifier) rather than the actual class prediction (predict of the classifier) when computing the residuals.
But if your question is merely out of curiosity, it's not clear that it would be "wrong" to use the discrete residuals of treating a classifier as if it were a regressor, as long as the assumptions of the DoubleML model are satisfied. In general, though, you'd expect the residuals to have higher variance than if you were using the predicted probabilities, which would probably lead to a noisier estimate.
Okay, thank you for your reply. Much appreciated:)
Thank you for writing this package; it's been extremely useful. I was wondering if it makes any difference if a classifier is used to predict D (model_d) instead of a regression model - my treatment variable is binary (0,1)? This would then mean that residuals_d will take only three values (0,1,-1). Is there any reason why one should still use a regression model for D?