chjackson / flexsurv

The flexsurv R package for flexible parametric survival and multi-state modelling
http://chjackson.github.io/flexsurv/

Inconsistent standard errors when rescaling continuous covariates #162

Closed JoshVeasy closed 11 months ago

JoshVeasy commented 1 year ago

Hi

First off, sorry if this is a dumb question and I'm missing something obvious. I've been developing models in both flexsurvreg and flexsurvspline that include continuous covariates. Due to the size of the continuous covariate, I have on occasion rescaled it by, say, dividing by 1000. Whilst the parameter estimates remain consistent (taking into account the change of scale), I have been getting quite different SEs for the parameter estimates, and thus different p-values and inconsistent results for the significance of said covariate. I have tested this scenario using survreg and this inconsistency does not arise there. I have also tried both Nelder-Mead and BFGS as the optimization method and get the same problem.

I am happy to provide some code if needed, but the data I am using cannot be shared.

Best, Josh

chjackson commented 1 year ago

SEs for the basic parameters in flexsurv (e.g. log covariate effects) are based on estimates of the Hessian (matrix of second derivatives of the log-likelihood) at the MLE. That involves numerical differentiation. For some of the models in flexsurv there are first derivatives available, but none of them have second derivatives. I think the models in survreg have analytic formulae for the first and second derivatives, so I'd expect the SEs to be more accurate than flexsurv for the same model. I'd also expect the numerical methods for estimating the Hessian to be more accurate if all the parameters are on a similar scale.
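
To illustrate the point about parameter scale, here is a minimal sketch in Python (standard library only). It is not flexsurv code and does not reproduce flexsurv's optimiser or Hessian routine. It assumes a hypothetical one-parameter exponential model with hazard exp(beta * x), simulated uncensored data, a central finite-difference second derivative with a fixed step of 1e-3 (an arbitrary choice for illustration), and it evaluates the curvature at the data-generating value of beta rather than at a fitted MLE to keep things short.

```python
import math
import random

def loglik(beta, x, t):
    # Exponential survival model with hazard exp(beta * x), no censoring.
    return sum(beta * xi - ti * math.exp(beta * xi) for xi, ti in zip(x, t))

def se_analytic(beta, x, t):
    # Exact second derivative of the log-likelihood: -sum(t * x^2 * exp(beta * x)).
    d2 = -sum(ti * xi * xi * math.exp(beta * xi) for xi, ti in zip(x, t))
    return 1.0 / math.sqrt(-d2)

def se_numeric(beta, x, t, h=1e-3):
    # Central finite-difference second derivative with a fixed step h
    # (h is an assumption made for this illustration).
    d2 = (loglik(beta + h, x, t) - 2.0 * loglik(beta, x, t)
          + loglik(beta - h, x, t)) / (h * h)
    return 1.0 / math.sqrt(-d2)

# Simulated data: covariate in the thousands, so beta is of order 1e-3.
rng = random.Random(1)
beta_raw = 0.0005
x_raw = [rng.uniform(1000.0, 5000.0) for _ in range(200)]
t = [rng.expovariate(math.exp(beta_raw * xi)) for xi in x_raw]

# Same model after dividing the covariate by 1000.
x_scaled = [xi / 1000.0 for xi in x_raw]
beta_scaled = beta_raw * 1000.0

# Report every SE on the rescaled-covariate scale so they are comparable.
results = {
    "analytic_raw": se_analytic(beta_raw, x_raw, t) * 1000.0,
    "analytic_scaled": se_analytic(beta_scaled, x_scaled, t),
    "numeric_raw": se_numeric(beta_raw, x_raw, t) * 1000.0,
    "numeric_scaled": se_numeric(beta_scaled, x_scaled, t),
}

for name, value in results.items():
    print(f"{name:16s} {value:.6f}")
```

In this sketch the two analytic SEs agree after accounting for the change of scale, and the numerical SE matches them closely when the covariate has been divided by 1000. On the raw scale a step of 1e-3 is large relative to a coefficient of order 1e-3, so the finite-difference curvature is overstated and the numerical SE comes out too small. The size of the discrepancy depends on the step size and differencing scheme, so it should not be read as a prediction of what flexsurv reports.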

chjackson commented 11 months ago

Created new issue to implement analytic second derivatives #170