DoubleML / doubleml-for-r

DoubleML - Double Machine Learning in R
https://docs.doubleml.org

[Feature Request]: Classification Learner for `ml_l` #185

Open simonschoe opened 1 year ago

simonschoe commented 1 year ago

Describe the feature you want to propose or implement

Currently, ml_m allows for a classification learner while ml_l does not. What is the rationale behind this choice? It could easily be the case that both the treatment D and the outcome Y are binary variables, in which case it is desirable to use classification learners in both stages.

Propose a possible solution or implementation

No response

Did you consider alternatives to the proposed solution. If yes, please describe

No response

Comments, context or references

No response

PhilippBach commented 9 months ago

Hello @simonschoe,

thanks for opening this issue and apologies for the late response. To which of the causal models does your feature request apply? Could you maybe give a brief example?

simonschoe commented 9 months ago

Sorry, there was a typo in the initial issue description. Essentially, I was trying to implement a generalized linear model (logit/Poisson) in the first and second stages. What I realized is that ml_l (second stage) only allows for a LearnerRegr and not for a LearnerClassif, while ml_m allows for both.

Say my treatment $D$ is binary and my outcome $Y$ is also a binary variable. In such a setting, I would like to use LearnerClassif for both ml_m and ml_l.
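To make the setting concrete, here is a minimal sketch of it. The simulated data, the sample size and the choice of `regr.lm` / `classif.log_reg` from mlr3learners are illustrative assumptions, not taken from a real application. The first call uses the combination that is accepted today (regression learner for ml_l, classifier for ml_m); the second call tries a classifier for ml_l and is wrapped in `tryCatch()` so that whatever message the installed DoubleML version returns is simply printed.

```r
library(DoubleML)
library(mlr3)
library(mlr3learners)
library(data.table)

set.seed(1)
n <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
d <- rbinom(n, 1, plogis(x1))            # binary treatment (simulated, assumption)
y <- rbinom(n, 1, plogis(0.5 * d + x2))  # binary outcome (simulated, assumption)

dml_data <- DoubleMLData$new(
  data.table(y = y, d = d, x1 = x1, x2 = x2),
  y_col = "y", d_cols = "d"
)

# Accepted today: regression learner for ml_l, classifier for ml_m
plr <- DoubleMLPLR$new(
  dml_data,
  ml_l = lrn("regr.lm"),
  ml_m = lrn("classif.log_reg")
)
plr$fit()
print(plr$coef)

# What I would like to do: classifier for ml_l as well.
# The learner check is expected to reject this; the message is only printed.
res <- tryCatch(
  DoubleMLPLR$new(
    dml_data,
    ml_l = lrn("classif.log_reg"),
    ml_m = lrn("classif.log_reg")
  ),
  error = function(e) conditionMessage(e)
)
print(res)
```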

PhilippBach commented 9 months ago

Thank you! Indeed, there is a paper (and code) available for logistic regression with Double Machine Learning. However, we haven't had time to implement it yet: https://arxiv.org/abs/2009.14461

We have the model on the list of our planned extensions, but it's hard to say when this will be implemented. In case you want to contribute it, feel free to have a look at https://github.com/DoubleML/doubleml-for-r/blob/main/CONTRIBUTING.md

Otherwise, I think we only support regression learners for the outcome variable right now. I'm not sure whether the PLR would work with a classifier for the main regression, but maybe you could test it in a simulation. Technically, I think it would only be necessary to adjust this line of code to also accept classifiers for ml_l: https://github.com/DoubleML/doubleml-for-r/blob/ba452aba8f6c71df2da560e6d009fcb565079dd6/R/double_ml_plr.R#L179
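A rough way to try this without touching the package is to hand-roll the partialling-out estimator in base R. The sketch below is not the DoubleML implementation: it uses plain `glm()` logistic regressions as a stand-in for classification learners, 5-fold cross-fitting, and a simulated design in which the outcome probability is linear in D. The true effect of 0.2, the sample size and the functional forms are all assumptions chosen for illustration.

```r
set.seed(42)
n <- 2000
theta <- 0.2                       # assumed true effect of D on P(Y = 1)
x1 <- rnorm(n)
x2 <- rnorm(n)
d <- rbinom(n, 1, plogis(x1))      # binary treatment
p_y <- theta * d + 0.2 + 0.4 * plogis(x2)  # outcome probability, stays within [0.2, 0.8]
y <- rbinom(n, 1, p_y)             # binary outcome
dat <- data.frame(y = y, d = d, x1 = x1, x2 = x2)

# Cross-fitting: predict each fold with models trained on the other folds
k <- 5
fold <- sample(rep(1:k, length.out = n))
l_hat <- numeric(n)                # predicted P(Y = 1 | X), the "classifier" for ml_l
m_hat <- numeric(n)                # predicted P(D = 1 | X), the classifier for ml_m
for (j in 1:k) {
  train <- dat[fold != j, ]
  test <- dat[fold == j, ]
  fit_l <- glm(y ~ x1 + x2, family = binomial(), data = train)
  fit_m <- glm(d ~ x1 + x2, family = binomial(), data = train)
  l_hat[fold == j] <- predict(fit_l, newdata = test, type = "response")
  m_hat[fold == j] <- predict(fit_m, newdata = test, type = "response")
}

# Partialling-out estimate: regress outcome residuals on treatment residuals
u <- y - l_hat
v <- d - m_hat
theta_hat <- sum(u * v) / sum(v^2)
se_hat <- sqrt(mean((u - theta_hat * v)^2 * v^2) / mean(v^2)^2 / n)
cat("theta_hat:", theta_hat, " se:", se_hat, "\n")
```

This is a single draw; repeating it over many seeds and comparing the distribution of `theta_hat` with `theta` would give a first impression of bias and coverage. Swapping `glm()` for other probability-predicting learners is the obvious next variation.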

PhilippBach commented 9 months ago

In case $Y$ and $D$ are both binary, you could also consider an IRM model, which works with classifiers for both.
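A minimal sketch of that setup, assuming a DoubleML version in which `DoubleMLIRM` accepts a classifier for ml_g when the outcome is coded 0/1. The simulated data and the choice of `classif.log_reg` for both learners are illustrative assumptions.

```r
library(DoubleML)
library(mlr3)
library(mlr3learners)
library(data.table)

set.seed(1)
n <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
d <- rbinom(n, 1, plogis(x1))            # binary treatment (simulated, assumption)
y <- rbinom(n, 1, plogis(0.5 * d + x2))  # binary outcome (simulated, assumption)

dml_data <- DoubleMLData$new(
  data.table(y = y, d = d, x1 = x1, x2 = x2),
  y_col = "y", d_cols = "d"
)

# Classifiers for both the outcome model (ml_g) and the propensity model (ml_m)
irm <- DoubleMLIRM$new(
  dml_data,
  ml_g = lrn("classif.log_reg"),
  ml_m = lrn("classif.log_reg")
)
irm$fit()
irm$summary()
```

Note that the IRM targets an average treatment effect on the probability scale, so the estimate is not directly comparable to a logit coefficient.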