chjackson / flexsurv

The flexsurv R package for flexible parametric survival and multi-state modelling
http://chjackson.github.io/flexsurv/

Error in Gompertz fit, what does this mean? #54

Open Calou1234 opened 6 years ago

Calou1234 commented 6 years ago

Dear chjackson,

I have been using flexsurv a lot these days, great package! However, with one dataset I get an error from the following line of code:

GompertzOS <- flexsurvreg(Surv(OS, OS_Cens) ~ 1, data = Test, dist = "gompertz")

Error reported: "Error in optim(method = "BFGS", par = c(shape = 0.001, rate = -6.47675255851469 : initial value in 'vmmin' is not finite"

I do not understand this error. I tried looking for zeros in the min/max/mean of the data, but could not find any.

My input data: Test.xlsx

Thanks a lot for your help.

Best Regards.

chjackson commented 6 years ago

It looks like the hazard is decreasing for a large portion of the follow-up, so the best-fitting shape parameter is negative, and positive shape parameters give zero likelihood. I managed to get it to fit using:

GompertzOS <- flexsurvreg(Surv(OS, OS_Cens) ~ 1, data = Test, dist = "gompertz",
                          inits = c(-1, 1/mean(Test$OS)))
plot(GompertzOS, type = "hazard")

Plotting the hazard shows that it is increasing in the later stage of follow-up, though, so the monotonic shape of the Gompertz hazard does not represent the data well.
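For data like this, a quick empirical check of the hazard shape makes the problem visible before fitting anything parametric. A minimal sketch, assuming the muhaz package (not part of flexsurv) and the Test data from above:

# kernel-smoothed empirical hazard; assumes muhaz is installed and Test
# has a numeric OS column and a 0/1 event indicator OS_Cens
library(muhaz)

emp_haz <- muhaz(Test$OS, Test$OS_Cens)
plot(emp_haz)  # a decreasing-then-increasing curve would explain the poor Gompertz fit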

I'll leave this issue open because the package needs a more clever way of finding good initial values for this distribution. Thanks for the report.

ApexHeel commented 5 years ago

Setting the initial values seems to fix some of the problems (same error) I'm also having with the Gompertz. However, I'm running into similar errors when fitting splines, always with scale="normal", where I get the error "non-finite finite-difference value [3]".

Any guidance on initial values to try for that? For the same data set (sorry, I can't share it as it is confidential), models with 1 through 4 knots fit fine, and then 5 and 6 knots give the error. It looks like when k=5 I need to set starting values for 7 parameters.

It isn't just increasing the number of knots that triggers the error, but it is always with scale="normal". I've had situations where k=1 produces the error but k=2 and higher are fine, and others where k=1 is fine, k=2 through k=4 produce the error, and then k=5 and k=6 are fine. All of these are subgroups within the larger data set.
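For what it's worth, a sketch of the kind of loop I'm running, with a stand-in data frame dat (columns time and status) in place of the confidential data:

# fit the spline model for each knot count and record which fits fail,
# rather than letting the first optimisation error abort the loop
library(flexsurv)

fits <- lapply(1:6, function(k) {
  tryCatch(
    flexsurvspline(Surv(time, status) ~ 1, data = dat, k = k, scale = "normal"),
    error = function(e) e  # keep the error object in place of the fit
  )
})
failed <- vapply(fits, inherits, logical(1), what = "error")
which(failed)  # knot counts whose fits raised the error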

Thanks in advance!

chjackson commented 5 years ago

Could you use the estimates from the models that work to initialise the models that don't? For example, you could use the estimates from scale="hazard" to initialise models with scale="normal".
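A minimal sketch of that idea, assuming a data frame dat with columns time and status; both scales use the same number of spline coefficients (k + 2), so the estimates can be passed across directly as starting values:

library(flexsurv)

# fit on the hazard scale, which converges for this knot count
fit_haz <- flexsurvspline(Surv(time, status) ~ 1, data = dat,
                          k = 5, scale = "hazard")

# reuse its estimates to initialise the normal-scale model
fit_norm <- flexsurvspline(Surv(time, status) ~ 1, data = dat,
                           k = 5, scale = "normal",
                           inits = fit_haz$res[, "est"])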

It may just be that the model doesn't fit the data well, or there's not enough information to inform the type of curve you are trying to fit.

Another approach would be to go Bayesian, using the survHE package. Weak prior information combined with Hamiltonian Monte Carlo may stabilise estimation, or demonstrate more clearly that there is not enough data, in which case the results will be dominated by the prior.
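A minimal sketch, assuming survHE is installed with its HMC (rstan) backend, and reusing the Test data from the original post:

library(survHE)

# Hamiltonian Monte Carlo with survHE's default weakly informative priors
bayes_fit <- fit.models(formula = Surv(OS, OS_Cens) ~ 1, data = Test,
                        distr = "gompertz", method = "hmc")
print(bayes_fit)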