Correct Multiple Imputation Standard Error Estimation

RobinDenz1 / adjustedCurves

An R-Package to estimate and plot confounder-adjusted survival curves (single event survival data) and confounder-adjusted cumulative incidence functions (data with competing risks) using various methods.

https://robindenz1.github.io/adjustedCurves/

GNU General Public License v3.0

Correct Multiple Imputation Standard Error Estimation #30

Closed jackmwolf closed 3 months ago

jackmwolf commented 3 months ago

Both adjustedsurv and adjustedcif support multiple imputation via mids objects. However, the current variance estimator is not consistent and does not use Rubin's rules. Both functions currently estimate the standard error of the survival function (or cumulative incidence function) as the average of the within-imputation standard error estimates:

se=mean(se, na.rm=mi_extrapolation)

This estimator ignores the variability of the point estimates between imputations, and it averages standard errors rather than variances. Instead, the variance should be estimated as $$\hat{V} = \hat{V}_W + \hat{V}_B + \frac{\hat{V}_B}{m}$$ where $\hat{V}_W$ is an estimate of the within-imputation variance (the average of the per-imputation variance estimates), $\hat{V}_B$ is an estimate of the between-imputation variance (the sample variance of the point estimates across imputations), and $m$ is the number of imputations. (See Chapter 9 of Applied Missing Data Analysis by Heymans and Eekhout.)
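For illustration, here is a minimal sketch of how the pooled standard error above could be computed at a single time point. The helper name pool_rubin and the numbers are made up for this example and are not taken from the package source. The sketch also assumes there are no missing values, so it does not reproduce the na.rm=mi_extrapolation handling.

```r
# Pool per-imputation estimates at one time point with Rubin's rules.
# est: point estimates, one per imputed dataset
# se:  matching within-imputation standard errors
# (pool_rubin is a hypothetical helper, not part of adjustedCurves)
pool_rubin <- function(est, se) {
  m <- length(est)
  stopifnot(m >= 2, length(se) == m)

  v_within <- mean(se^2)   # average on the variance scale, not the SE scale
  v_between <- var(est)    # sample variance of the point estimates
  v_total <- v_within + v_between + v_between / m

  list(estimate = mean(est), se = sqrt(v_total))
}

# Made-up survival probabilities and standard errors from m = 5 imputations
est <- c(0.70, 0.72, 0.68, 0.71, 0.69)
se <- c(0.030, 0.031, 0.029, 0.030, 0.030)

pooled <- pool_rubin(est, se)
naive_se <- mean(se)  # the averaging approach described above
```

With these inputs, pooled$se is larger than naive_se, because the pooled version also counts the spread of the point estimates across imputations.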

RobinDenz1 commented 3 months ago

Thank you very much for finding and fixing this error. I will admit that I was ignorant of the correct way to do this. May I ask, how did you notice?

jackmwolf commented 3 months ago

Happy to contribute! I've been using the package a lot lately and really appreciate all the functionality! As for finding the issue, I was digging through the adjustedcif source trying to make sense of an error I was getting with multiple imputations plus bootstrapping and happened to notice the issue. Thanks for the super quick turnaround!

jackmwolf commented 3 months ago

@RobinDenz1 I was digging through my old code trying to recall the error I was investigating and realized that my PR did not address using Rubin's rules with bootstrapped data. I've made a new PR (#32) with this fix included. Sorry for not catching this before you prepared for (and submitted to?) CRAN!

RobinDenz1 commented 3 months ago

I see. If you can show me a reproducible example of your error I might be able to help you with that.

Sadly I have already submitted the changes to CRAN, but that's not a big deal, I will include it in the next update. I should have checked it as well before submitting, so it's more on me than on you.

jackmwolf commented 3 months ago

Revisiting that project, it looks like it was a misunderstanding of the documentation on my part. I had specified conf_int = TRUE and bootstrap = TRUE for a method that didn't support asymptotic standard errors, assuming that both arguments had to be TRUE to get bootstrapped CIs. The documentation is clear on this, so I think all is good here!

Thanks again!

RobinDenz1 commented 3 months ago

Don't worry, you are not the first person to make that mistake. It is documented, yes, but it is surely not the best design choice. I have some major refactoring of internal code planned, because (as you might have noticed) it is kinda messy right now. I may change how this works in future versions to make it more intuitive.