Causal-LDA / TrialEmulation

https://causal-lda.github.io/TrialEmulation/
Apache License 2.0

allow parsnip models for weight estimation #209

Open gravesti opened 2 weeks ago

gravesti commented 2 weeks ago

We could use tidymodels and parsnip to allow a wide range of models for IPCW estimation.

lisu-stats commented 2 weeks ago

@gravesti Shaun and I will look into the highly adaptive LASSO (HAL) discussed in this paper (https://onlinelibrary.wiley.com/doi/abs/10.1111/biom.13719).

gravesti commented 2 weeks ago

@lisu-stats Excellent. The parsnip integration is almost ready.

gravesti commented 1 week ago

@lisu-stats Regarding HAL, I see there is a hal9001 package. It should be simple enough to support.

I have now merged the parsnip functionality, so we have access to many models: https://parsnip.tidymodels.org/articles/Examples.html

You can recreate the standard glm method:

    library(TrialEmulation)
    library(parsnip)

    trial_sequence("PP") |>
      set_data(data = data_censored) |>
      set_switch_weight_model(
        numerator = ~ age_s + x1 + x3,
        denominator = ~ x3 + x4,
        # logistic regression with the glm engine, wrapped for TrialEmulation
        model_fitter = parsnip_model(
          logistic_reg() |> set_mode("classification") |> set_engine("glm"),
          tempdir()
        )
      ) |>
      calculate_weights()
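
For anyone unfamiliar with parsnip, here is a minimal standalone sketch of what such a specification does on its own, outside TrialEmulation. The toy data frame and its column names are made up for illustration (they are not `data_censored`). It only shows that the `glm` engine gives the same fitted probabilities as a direct `stats::glm()` call.

```r
library(parsnip)

# Toy data, invented for this sketch only
set.seed(1)
n <- 200
toy <- data.frame(x3 = rbinom(n, 1, 0.5), x4 = rnorm(n))
toy$treated <- factor(
  rbinom(n, 1, plogis(-0.5 + 0.8 * toy$x3 + 0.3 * toy$x4)),
  levels = c(0, 1)
)

# Same specification as in the comment above
spec <- logistic_reg() |> set_mode("classification") |> set_engine("glm")
fit_parsnip <- fit(spec, treated ~ x3 + x4, data = toy)

# Class probabilities come back as a tibble with one column per factor level
p <- predict(fit_parsnip, new_data = toy, type = "prob")

# Reference fit with stats::glm; the probabilities for level "1" should agree
fit_glm <- glm(treated ~ x3 + x4, data = toy, family = binomial())
print(all.equal(p$.pred_1, unname(fitted(fit_glm))))
```

Swapping the engine or the model type changes only the `spec` line, which is the appeal of going through parsnip.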

Or use something more exotic, such as MARS:

    library(parsnip)
    library(earth)

    mars_spec <- mars(prod_degree = 1, prune_method = "backward") |>
      set_mode("classification") |>
      set_engine("earth")

    parsnip_model(mars_spec, tempdir())

lisu-stats commented 1 week ago

Thanks @gravesti. Yes, hal9001 is the package we can use for HAL. I will discuss with Shaun an undersmoothed version of HAL, which is more suitable for weight estimation.

For parsnip models, will the required packages be loaded automatically, or do users have to load them manually?

gravesti commented 1 week ago

@lisu-stats From my testing it seems they will actually be loaded automatically at fitting time. It also seems that many of the modelling packages are installed when you install parsnip.
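
If you want to check this up front rather than at fitting time, something like the following sketch should work. It assumes the `required_pkgs()` method for model specifications that parsnip re-exports from generics, and uses base R's `requireNamespace()` to see which of those packages are installed locally.

```r
library(parsnip)

mars_spec <- mars(prod_degree = 1, prune_method = "backward") |>
  set_mode("classification") |>
  set_engine("earth")

# Packages the specification will need when it is fitted
pkgs <- required_pkgs(mars_spec)

# requireNamespace() loads a package without attaching it and
# returns FALSE (rather than erroring) when it is not installed
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(installed)
```

Any entry that comes back `FALSE` would need `install.packages()` before the weight models can be fitted.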

lisu-stats commented 6 days ago

Thanks @gravesti. Using data-adaptive methods to estimate weights can cause problems when drawing inference, e.g. with the bootstrap. However, people are doing this in practice... so I don't know.

Undersmoothed HAL has some theoretical support for its validity in estimating IPTW weights. Alternatively, doubly robust methods such as this one https://www.jstatsoft.org/article/view/v081i01/1153 can be used, but they are much more sophisticated and are not yet available for fitting general MSMs.