Closed shakirarodzlan closed 1 year ago
flexsurv fits the standard Weibull by wrapping survreg
from the survival
package. It looks like survreg
successfully converged and produced estimates of the Weibull shape and scale (276 and 26) that are extreme, which leads to numerical overflow in flexsurvreg
. I'd guess that the Weibull distribution just doesn't fit this dataset very well, or the shape parameter is not identifiable from the data (i.e many shape parameters would fit equally well so there is no "optimum").
Dear chjackson,
Thank you for your prompt response.
I wanted to inform you that I explored an alternative package for running the Weibull model, and I am pleased to report that it provided successful results.
1) eha package
library(eha)
weibull.mod <- weibreg(Surv(survival_time, status_pmcvd == "premature cvd death") ~ sex + age_entry + ethnic + education_level +dm_status + hpt_status + chol_status + abdominal_obesity + current_smoker, data = pmcvd2) weibull.mod
Call: weibreg(formula = Surv(survival_time, status_pmcvd == "premature cvd death") ~ sex + age_entry + ethnic + education_level + dm_status + hpt_status + chol_status + abdominal_obesity + current_smoker, data = pmcvd2)
Covariate Mean Coef Exp(Coef) se(Coef) Wald p sex Female 0.547 0 1 (reference) Male 0.453 0.832 2.298 0.095 0.000 age_entry 39.945 0.037 1.037 0.004 0.000 ethnic Chinese 0.196 0 1 (reference) Indian 0.084 0.570 1.768 0.153 0.000 Malay 0.583 0.558 1.747 0.112 0.000 Other Bumiputeras 0.113 0.416 1.515 0.162 0.011 Others 0.024 -0.038 0.963 0.350 0.914 education_level No formal education 0.062 0 1 (reference) Primary 0.246 0.210 1.234 0.145 0.148 Secondary 0.530 0.037 1.037 0.155 0.813 Tertiary 0.162 -0.592 0.553 0.216 0.006 dm_status Normal 0.872 0 1 (reference) Diagnosed DM 0.065 1.183 3.265 0.096 0.000 Undiagnosed DM 0.063 0.427 1.532 0.130 0.001 hpt_status Normal 0.650 0 1 (reference) Diagnosed HPT 0.125 0.836 2.306 0.108 0.000 Undiagnosed HPT 0.226 0.554 1.741 0.097 0.000 chol_status Normal 0.684 0 1 (reference) Diagnosed chol 0.062 0.243 1.275 0.116 0.036 Undiagnosed chol 0.254 0.310 1.363 0.085 0.000 abdominal_obesity No 0.549 0 1 (reference) Yes 0.451 0.256 1.291 0.083 0.002 current_smoker No 0.767 0 1 (reference) Yes 0.233 0.462 1.587 0.094 0.000
log(scale) 7.976 2911.544 0.252 0.000 log(shape) 0.355 1.426 0.033 0.000
Events
Total time at risk 625686
Max. log. likelihood -5159.5
LR test statistic 1043
Degrees of freedom 17
Overall p-value 0
2) survival package library(survival) wei.mod <-survreg(Surv(survival_time, status_pmcvd == "premature cvd death") ~ sex + age_entry + ethnic + education_level +dm_status + hpt_status + chol_status + abdominal_obesity + current_smoker, data = pmcvd2, dist='weibull') summary(wei.mod) Call: survreg(formula = Surv(survival_time, status_pmcvd == "premature cvd death") ~ sex + age_entry + ethnic + education_level + dm_status + hpt_status + chol_status + abdominal_obesity + current_smoker, data = pmcvd2, dist = "weibull") Value Std. Error z p (Intercept) 2.60e+01 8.39e-79 3.10e+79 <2e-16 sexMale 5.54e+00 3.07e-79 1.80e+79 <2e-16 age_entry 4.09e-01 1.18e-80 3.47e+79 <2e-16 ethnicIndian 3.37e+00 5.24e-79 6.43e+78 <2e-16 ethnicMalay 5.18e+00 3.34e-79 1.55e+79 <2e-16 ethnicOther Bumiputeras 1.37e+00 4.83e-79 2.84e+78 <2e-16 ethnicOthers 1.46e+01 7.85e-79 1.86e+79 <2e-16 education_levelPrimary -7.66e+00 5.70e-79 -1.35e+79 <2e-16 education_levelSecondary -8.49e+00 5.74e-79 -1.48e+79 <2e-16 education_levelTertiary -5.24e+00 6.36e-79 -8.23e+78 <2e-16 dm_statusDiagnosed DM 1.22e+01 5.27e-79 2.32e+79 <2e-16 dm_statusUndiagnosed DM 2.55e+00 4.91e-79 5.20e+78 <2e-16 hpt_statusDiagnosed HPT -3.24e+00 4.41e-79 -7.35e+78 <2e-16 hpt_statusUndiagnosed HPT -4.24e+00 3.26e-79 -1.30e+79 <2e-16 chol_statusDiagnosed chol 2.44e+00 5.35e-79 4.56e+78 <2e-16 chol_statusUndiagnosed chol 7.81e+00 2.90e-79 2.69e+79 <2e-16 abdominal_obesityYes 4.46e-01 2.73e-79 1.64e+78 <2e-16 current_smokerYes 3.84e-01 3.61e-79 1.06e+78 <2e-16 Log(scale) -2.76e+02 0.00e+00 -Inf <2e-16
Scale= 1.07e-120
Weibull distribution Loglik(model)= 54887.5 Loglik(intercept only)= -5680.7 Chisq= 121136.4 on 17 degrees of freedom, p= 0 Number of Newton-Raphson Iterations: 2 n= 54483
Could you please help me determine whether the issue is with my data or with the package being used?
I really appreciate this flexsurv package because the results it provides for different distributions (like exponential, Gompertz, log normal, and log logistic) are clear and easy to understand. I hope I can also use it for the Weibull distribution to make it comparable with other types of analyses.
Thank you
weibreg
from eha
is a proportional hazards Weibull model. flexsurvreg(...,dist="weibull")
and the Weibull model survreg
are accelerated failure time Weibull models. You can get the proportional hazards model in flexsurv with dist="weibullPH"
. Though given that the implementation goes via survreg, a similar problem might happen. You could work around this by supplying the estimates you get from eha
as initial values to flexsurvreg
. But whatever package and model you use, you should check whether the fitted model fits the data well and is scientifically plausible.
Thank you very much for your detailed explanation and advice regarding the Weibull models. I'll definitely keep in mind the importance of thoroughly assessing the fitted model's fit to the data and its scientific validity, regardless of the package or model I choose to use. Your guidance has been extremely helpful!
Hi, I received this error after run flexsurvreg () using dist = "weibull".
wei_mod <- flexsurvreg(Surv(survival_time, status_pmcvd == "premature cvd death") ~ sex + age_entry + ethnic + education_level + dm_status + hpt_status + chol_status + abdominal_obesity + current_smoker, data = pmcvd2, dist = "weibull") print(wei_mod) Error in optim(method = "BFGS", par = c(shape = 276.244558817693, scale = 26.0046238740812, : initial value in 'vmmin' is not finite
I tried add initial = 0, but I still got an error.
wei_mod <- flexsurvreg(Surv(survival_time, status_pmcvd == "premature cvd death") ~ sex + age_entry + ethnic + education_level + dm_status + hpt_status + chol_status + abdominal_obesity + current_smoker, data = pmcvd2, dist = "weibull", inits = '0') print(wei_mod)
Error in flexsurvreg(Surv(survival_time, status_pmcvd == "premature cvd death") ~ : Initial values for parameters 1,2 out of range
However, I manage to run other distribution options using the same code, e.g., exponential, Gompertz, log normal, and log logistic.
Example from Gompertz survival model;
gomp_mod <- flexsurvreg(Surv(survival_time, status_pmcvd == "premature cvd death") ~ sex + age_entry + ethnic + education_level +dm_status + hpt_status + chol_status + abdominal_obesity + current_smoker, data = pmcvd2, dist = "gompertz") print(gomp_mod) Call: flexsurvreg(formula = Surv(survival_time, status_pmcvd == "premature cvd death") ~ sex + age_entry + ethnic + education_level + dm_status + hpt_status + chol_status + abdominal_obesity + current_smoker, data = pmcvd2, dist = "gompertz")
Estimates: data mean est L95% U95% se exp(est) L95%
shape NA 8.40e-02 6.72e-02 1.01e-01 8.60e-03 NA NA rate NA 1.77e-05 1.01e-05 3.11e-05 5.10e-06 NA NA sexMale 4.64e-01 8.39e-01 6.53e-01 1.02e+00 9.46e-02 2.31e+00 1.92e+00 age_entry 4.07e+01 3.70e-02 2.95e-02 4.46e-02 3.84e-03 1.04e+00 1.03e+00 ethnicIndian 8.25e-02 5.71e-01 2.71e-01 8.71e-01 1.53e-01 1.77e+00 1.31e+00 ethnicMalay 5.94e-01 5.62e-01 3.41e-01 7.82e-01 1.12e-01 1.75e+00 1.41e+00 ethnicOther Bumiputeras 1.08e-01 4.23e-01 1.05e-01 7.41e-01 1.62e-01 1.53e+00 1.11e+00 ethnicOthers 2.93e-02 -2.00e-02 -7.05e-01 6.65e-01 3.49e-01 9.80e-01 4.94e-01 education_levelPrimary 2.46e-01 2.09e-01 -7.50e-02 4.93e-01 1.45e-01 1.23e+00 9.28e-01 education_levelSecondary 5.15e-01 3.85e-02 -2.64e-01 3.41e-01 1.54e-01 1.04e+00 7.68e-01 education_levelTertiary 1.77e-01 -5.77e-01 -1.00e+00 -1.53e-01 2.16e-01 5.62e-01 3.68e-01 dm_statusDiagnosed DM 7.68e-02 1.19e+00 1.00e+00 1.38e+00 9.60e-02 3.29e+00 2.72e+00 dm_statusUndiagnosed DM 7.14e-02 4.40e-01 1.86e-01 6.95e-01 1.30e-01 1.55e+00 1.20e+00 hpt_statusDiagnosed HPT 1.35e-01 8.34e-01 6.23e-01 1.05e+00 1.08e-01 2.30e+00 1.86e+00 hpt_statusUndiagnosed HPT 2.24e-01 5.49e-01 3.59e-01 7.39e-01 9.69e-02 1.73e+00 1.43e+00 chol_statusDiagnosed chol 7.29e-02 2.58e-01 3.10e-02 4.85e-01 1.16e-01 1.29e+00 1.03e+00 chol_statusUndiagnosed chol 2.90e-01 3.21e-01 1.53e-01 4.89e-01 8.56e-02 1.38e+00 1.17e+00 abdominal_obesityYes 4.71e-01 2.60e-01 9.68e-02 4.23e-01 8.33e-02 1.30e+00 1.10e+00 current_smokerYes 2.36e-01 4.62e-01 2.78e-01 6.45e-01 9.35e-02 1.59e+00 1.32e+00 U95%
shape NA rate NA sexMale 2.78e+00 age_entry 1.05e+00 ethnicIndian 2.39e+00 ethnicMalay 2.19e+00 ethnicOther Bumiputeras 2.10e+00 ethnicOthers 1.94e+00 education_levelPrimary 1.64e+00 education_levelSecondary 1.41e+00 education_levelTertiary 8.58e-01 dm_statusDiagnosed DM 3.97e+00 dm_statusUndiagnosed DM 2.00e+00 hpt_statusDiagnosed HPT 2.85e+00 hpt_statusUndiagnosed HPT 2.09e+00 chol_statusDiagnosed chol 1.62e+00 chol_statusUndiagnosed chol 1.63e+00 abdominal_obesityYes 1.53e+00 current_smokerYes 1.91e+00
N = 54483, Events: 737, Censored: 53746 Total time at risk: 625686.1 Log-likelihood = -5162.153, df = 19 AIC = 10362.31
Kindly help me to encounter this error.
Thank you