DoubleML / doubleml-for-r

DoubleML - Double Machine Learning in R
https://docs.doubleml.org

Categorical D #125

Closed hhsievertsen closed 2 years ago

hhsievertsen commented 2 years ago

Hi, thanks for developing this! This might be a silly question, but would it be possible for D to be categorical? Best, Hans

MalteKurz commented 2 years ago

Hi Hans,

thanks for your interest in our package.

A binary D is no problem. For this you can use the IRM as well as the PLR model (https://docs.doubleml.org/stable/guide/models.html#interactive-regression-model-irm). In both models, you then need to specify a machine learning classifier as ml_m to estimate the nuisance function m_0 (the propensity score).
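A minimal sketch of the binary-D case with the IRM, assuming the mlr3, mlr3learners and ranger packages are installed. The simulated data from make_irm_data(), the random forest learners and their hyperparameters are illustrative choices only, not a recommendation:

```r
library(DoubleML)
library(mlr3)
library(mlr3learners)

# Quieten mlr3's logging output (optional)
lgr::get_logger("mlr3")$set_threshold("warn")

set.seed(3141)

# Simulated data with a binary treatment d (assumption: stand-in for real data)
dml_data = make_irm_data(n_obs = 500, dim_x = 20, theta = 0.5)

# Regression learner for the outcome nuisance function g_0
ml_g = lrn("regr.ranger", num.trees = 100, max.depth = 5, min.node.size = 2)

# Classifier for the propensity score m_0, since d is binary
ml_m = lrn("classif.ranger", num.trees = 100, max.depth = 5, min.node.size = 2)

dml_irm = DoubleMLIRM$new(dml_data, ml_g = ml_g, ml_m = ml_m)
dml_irm$fit()
dml_irm$summary()
```

With your own data, you would instead build the data object via DoubleMLData$new(df, y_col = ..., d_cols = ...), with the binary treatment column coded as 0/1.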

If your categorical treatment variable has more than two categories, things get a bit more complicated. A possible approach is described in "Robust inference on average treatment effects with possibly more covariates than observations" (https://www.sciencedirect.com/science/article/abs/pii/S0304407615001864 / https://arxiv.org/pdf/1309.4686.pdf). My guess would be that the estimation approach described there should in some sense be achievable with the DoubleML package (after some special data pre-processing and transformation steps).

However, right now we don't have a convenient API that lets a user easily estimate a DoubleML model with a categorical treatment variable D. I will put this on our list of feature requests and hope that we will find time to work on it. We will keep you posted in this issue.

Best, Malte

royzawadzki commented 2 years ago

Hello,

I'm having some trouble with the PLIV model in R. It doesn't appear to support binary treatments, as it doesn't let you use a classifier for ml_r. Am I doing something wrong here?

Thanks, Roy

hhsievertsen commented 2 years ago

Thanks for the response Malte! That is very useful. And sorry that I forgot to reply and close the issue. Cheers, Hans