Closed lemna closed 7 years ago
Hi Peter,
You need to use the latest build of survHE
, because changes in INLA
sort of broke the fit.models
implementation (long story short: Håvard is doing some work on the survival analysis implementation of INLA
(prompted by discussion he and I had), but there's a slight issue in the current implementation for the AFT model. So I have provisionally fixed it by modifying how fit.models
deals with it).
I've tested using example(fit.models)
on my Windows VM and all seems to work OK!
This is under the latest build for both INLA and survHE (see versions above). I've reinstalled both to make sure, and the error is still there. example(fit.models)
works fine, it only affects distr = "weibull"
.
edit: the same happens with the INLA testing version (0.0.1484140175).
mmm... Interestingly, this is a data-dependent error. If you try
library(survHE)
dat <- read.table("http://www.statistica.it/gianluca/survHE/data.txt",header=TRUE)
m <- fit.models(Surv(time,censored)~arm,dat,"weibull","inla")
things work OK. Also, with the bc
dataset, the WeibullPH model does work in INLA
. And if you try
m <- fit.models(formula=Surv(recyrs,censrec)~1,data=bc,"weibull","inla")
it also works.
I think this can be fixed for this particular dataset by fiddling with the INLA
prior specification. For example, if you try
m0 <- fit.models(formula=Surv(recyrs,censrec)~group,data=bc,"weibull")
m1 <- fit.models(formula=Surv(recyrs,censrec)~group,data=bc,"weibull","inla",control.fixed=list(prec=4))
m2 <- fit.models(formula=Surv(recyrs,censrec)~group,data=bc,"weibull","hmc")
it does run and gives results that are in line with the MLE or the HMC estimates:
print(m0);print(m1);print(m2)
Model fit for the Weibull model, obtained using Flexsurvreg
(Maximum Likelihood Estimate). Running time: 0.021 seconds
mean se L95% U95%
shape 1.379652 0.0667876 1.254769 1.516964
scale 11.422862 1.2728396 9.181767 14.210965
groupMedium -0.613589 0.1269014 -0.862311 -0.364867
groupPoor -1.212214 0.1255721 -1.458330 -0.966097
Model fitting summaries
Akaike Information Criterion (AIC)....: 1631.884
Bayesian Information Criterion (BIC)..: 1650.007
Model fit for the Weibull model, obtained using INLA (Bayesian inference via
Integrated Nested Laplace Approximation). Running time: 1.6147 seconds
mean se L95% U95%
shape 1.355011 0.0572663 1.209933 1.441900
scale 10.767422 1.0857830 8.891241 12.829036
groupMedium -0.545085 0.1225832 -0.787892 -0.333495
groupPoor -1.132202 0.1120192 -1.326781 -0.950920
Model fitting summaries
Akaike Information Criterion (AIC)....: 1632.201
Bayesian Information Criterion (BIC)..: 1649.558
Deviance Information Criterion (DIC)..: 1632.201
Model fit for the WeibullAF model, obtained using Stan (Bayesian inference via
Hamiltonian Monte Carlo). Running time: 8.8777 seconds
mean se L95% U95%
shape 1.366712 0.0657198 1.242369 1.498928
scale 11.722601 1.3039931 9.518683 14.562417
groupMedium -0.624543 0.1286569 -0.885266 -0.390754
groupPoor -1.229494 0.1234262 -1.479752 -0.993615
Model fitting summaries
Akaike Information Criterion (AIC)....: 1633.946
Bayesian Information Criterion (BIC)..: 1656.600
Deviance Information Criterion (DIC)..: 1631.817
(notice however that the priors may have some influence in INLA
here...).
Great! This is something that needs to be fixed by the INLA developers then: a failure to converge should lead to an error or a warning, not to a crash of R.
Thanks for looking into this, Peter.
Well --- may be there's something that INLA
can do (and I think there may be some updates coming soon to deal with a slight imprecision in the WeibullAF model). But it's also about how the model is fitted and what the underlying data are. So ultimately, this is the responsibility of the modeller to check that the assumptions/options are reasonable/justifiable.
If anything, I think there must be a big effort in understanding how the Bayesian model works and how one can tweak INLA
and rstan
to one's needs...
I've seen your other problem - will look into it! Thanks!!
Sorry, but what I mean is that R should not crash, even if a user selects unrealistic priors or other options. This will cause users to lose their work and they will need to rerun all steps up to the model. Instead, INLA should return a warning or an error indicating that something went wrong.
If nothing else, it should be fixed for your own sake. Otherwise you'll get error reports every time somebody's model does not converge.
Well... I think we can ask, but I am not sure I disagree in principle with the INLA
approach. It's "dangerous" to run a model and don't realise how bad things have gone... I think people may not take too seriously warnings, often; crashing INLA
doesn't necessarily cause you to lose all your work - in fact, you'll probably won't fit those models you've run with the call to fit.models(...,"inla")
. But this is not terrible and it gives a sense of the strength of the problem (ie if you don't know how to solve it, you probably shouldn't use INLA
unsupervised...).
Does this make sense?
Fitting a nonsense model should lead to an error stopping the execution of the user's script, not to a complete crash. Something like this:
Error: INLA did not converge. Check the model specification, priors and modeling settings.
This will inform the user that they made a mistake, not the authors of INLA or survHE. If an error message like this doesn't stop you in your tracks, you deserve whatever happens.
The current behavior (R crash) will lead to users reporting this over and over again to you. Or worse, users losing faith in the package and not using it.
edit: correction this is not a R crash, but an INLA crash. It doesn't give helpful information under windows, it pops up three dialog boxes, suggesting that there is a bug instead of a misspecified model:
first one:
Microsoft Visual C++ Runtime Library. This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.
The second and third dialog are the windows application crash dialogs, inviting the user to send debugging information to MS.
You can easily see what will happen once more users are confronted with this kind of dialogs.
I am not sure under Windows, but under Linux, INLA
suggests using the option verbose=T
, which does give some more info (admittedly, if you know how to read it...).
Now, you can't do that at the moment in survHE::fit.models
, but this can be implemented...
That would be a quick fix. If you decide to go this route the manual should clearly explain that misspecified models can/will crash INLA.
Consider also how your target users will react: in a company one person might write a template analysis script; less sophisticated users will then change variable names to make it work for the project at hand and try to run it, without giving model specification much thought.
I think support for the verbose
argument is a quick short-term solution, while in the long run INLA should handle non-convergence more gracefully.
Updated with verbose
option.
Hi Gianluca,
I found this while working on the roxygen setup:
under both windows 32 and 64 fitting a Weibull model with INLA crashes R. For example:
The error messages appear after RStudio restarts the R session. In my own setup all other distributions work fine.
Any idea? Peter.