atahk / pscl

Political Science Computational Laboratory
65 stars 14 forks source link

Zeroinfl may crash while computing SE.logtheta #3

Closed dstoeckel closed 5 years ago

dstoeckel commented 5 years ago

Hi, first of all thank you for the pscl package. I am using it directly and via mpath.

However, I have a numerical problem that pops up from time to time in the current CRAN version (1.5.2).

zeroinfl calls optim to minimize the log-likelihood of the zero-inflated model. It then uses the inverse of the inferred Hessian (Fisher Information matrix) to compute the standard error of theta. If the Hessian is singular, this crashes. However, the inferred Hessian can simply become singular due to numerical issues in the optimization procedure (method="BFGS"), which crashes the solve call:

Error in solve.default(as.matrix(fit$hessian)) :
  system is computationally singular: reciprocal condition number = 3.08817e-32

The problem disappears if

The following code reproduces the problem:

y <- c(8, 73, 14, 0, 0, 40, 2, 0, 0, 0, 11, 0, 0, 0, 0, 12, 2, 0, 0, 7, 0, 78, 0, 0, 20, 0, 11, 0, 0, 0, 0, 8, 0, 0, 0, 0, 4, 0,
0, 0, 0, 3, 0, 72, 50, 3, 0, 0, 0, 0, 9, 0, 0, 0, 0, 13, 0, 1, 2, 0, 0, 0, 4, 0, 0, 10, 30, 0, 0, 5, 0, 9, 0, 8, 0, 0, 0, 1,
0, 18, 0, 7, 40, 0, 0, 0, 11, 0, 7, 9, 3, 40, 10, 0, 4, 35, 11, 4, 0, 0, 0, 0, 7, 0, 0, 0, 0, 0, 33, 10, 0, 3, 0, 0, 35, 0, 0,
0, 0, 5, 0, 0, 0, 0, 3, 0, 0, 16, 0, 0, 9, 15, 6, 7, 0, 0, 30, 0, 9, 0, 0, 0, 30, 10, 0, 0, 0, 0, 0, 0, 9, 12, 5, 31, 0, 0,
0, 2, 0, 0, 7, 20, 28, 4, 0, 0, 0, 0, 0, 0, 0, 0, 64, 0, 7, 0, 0, 0, 5, 0, 0, 0, 30, 0, 0, 0, 5, 0, 0, 0, 4, 0, 0, 0, 10, 0,
0, 0, 31, 0, 7, 0, 0, 1, 0, 7, 0, 28, 0, 6, 0, 0, 0, 0, 0, 2, 0, 15, 30, 0, 0, 0, 0, 0, 0, 0, 23, 5, 0, 0, 0, 5, 0, 0, 0, 0,
6, 15, 0, 3, 19, 0, 3, 0, 0, 50, 7, 0, 0, 4, 0, 0, 30, 0, 5, 0, 0, 0, 0, 0, 0, 0, 9, 56, 0, 0, 0, 2, 0, 0, 0, 24, 5, 0, 0,
0, 0, 0, 0, 0, 0, 0, 11, 0, 0, 5, 29, 0, 0, 0, 3, 0, 1, 28, 0, 0, 0, 0, 0, 0, 0, 0, 21, 0, 0, 8, 0, 26, 0, 0, 8, 0, 0, 25, 7,
0, 0, 0, 0, 1, 0, 0, 0, 0, 26, 8, 0, 0, 0, 1, 0, 0, 0, 36, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 36, 1, 0, 49, 0, 2, 0, 7,
0, 0, 3, 0, 0, 31, 9, 0, 0, 0, 0, 0, 0, 9, 0, 41, 0, 2, 6, 0, 0, 0, 4, 0, 0, 31, 4, 0, 0, 0, 0, 2, 0, 1, 55, 0, 20, 3, 0, 0,
0, 5, 0, 0, 31, 0, 2, 0, 0, 0, 2, 12, 0, 0, 0, 6, 15, 0, 0, 23, 0, 0, 8, 22, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 6, 0, 22,
0, 15, 0, 0, 0, 22, 6, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 0, 22, 13, 0, 0, 0, 0, 5, 20, 22, 0, 0, 3, 10, 0, 0, 0, 0, 0, 2, 0,
7, 0, 0, 36, 3, 0, 0, 0, 0, 2, 36, 0, 0, 4, 0, 0, 8, 12, 0, 0, 0, 0, 6, 0, 0, 36, 0, 16, 0, 0, 0, 5, 36, 0, 0, 0, 0, 10, 0,
0, 0, 0, 0, 0, 0, 0, 0, 41, 35, 0, 0, 0, 0, 0, 35, 0, 1, 0, 0, 0, 5, 0, 0, 0, 4, 0, 0, 0, 26, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0,
5, 25, 0, 5, 0, 0, 0, 5, 0, 0, 0, 7, 25, 0, 0, 5, 0, 0, 0, 0, 4, 25, 0, 20, 8, 0, 0, 6, 0, 0, 0, 0, 0, 46, 9, 0, 0, 0, 0, 0,
0, 7, 0, 10, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 50, 0, 13, 0, 0, 0, 0, 6, 30, 0, 0, 0, 0, 4, 0, 0, 0, 0, 6, 0, 30, 0, 30, 4,
0, 4, 0, 0, 30, 0, 4, 0, 0, 0, 0, 0, 0, 2, 0, 0, 30, 9, 0, 0, 0, 11, 0, 0, 0, 0, 34, 0, 0, 0, 0, 0, 0, 0, 25, 0, 0, 0, 0, 10,
0, 32, 0, 0, 3, 0, 0, 0, 0, 0, 5, 30, 0, 0, 0, 0, 1, 0)
fit <- pscl::zeroinfl(y~1|1, dist="negbin")

Proposed solutions:

atahk commented 5 years ago

Thanks for this!

The issue in your example seems to be that the Hessian actually is singular because optim is finding the "wrong" solution (a boundary solution that fits the large number of zeros with the count model). As a result, you can get it to perform properly by giving it better start values. For example, you can use the estimates of a hurdle model, which are much closer to the solution, for the start values. Thus, the following works without error for me: fit.hurdle <- pscl::hurdle(y~1|1, dist="negbin") fit <- pscl::zeroinfl(y~1|1, dist="negbin", start=fit.hurdle$coefficients)

Obviously, the zeroinfl function should handle this situation better. I don't think using L-BFGS-B or computing the Hessian differently are the best solutions here because the Hessian actually should be singular (although I think we might want to compute the Hessian differently anyway). But we should at least catch this error. Maybe we should issue a warning with that and suggest different start values? Or maybe the default should be to get better start values rather than using a vector of zeros (e.g., to get start values for the count side from a truncated count model and observations with non-zero responses, as happens with hurdle.

dstoeckel commented 5 years ago

Thank you for your answer! I agree that switching to L-BFGS-B or anything else as a default would only be a band aid for this particular case and not a solution in general. The proper solution is probably to get better starting values and (as this will make the problem only less likely) to catch a singular Hessian + issue a warning.

For my particular case I will have to see whether I can convince the mpath package to take different starting values (It is computing a whole LASSO path, so the probabilities may change a bit ...). But this is my problem :slightly_frowning_face:

simonjackman commented 5 years ago

thanks Alex, for weighing in on this

— Simon

On 19 Feb 2019, at 9:49 pm, Daniel Stöckel notifications@github.com<mailto:notifications@github.com> wrote:

Thank you for your answer! I agree that switching to L-BFGS-B or anything else as a default would only be a band aid for this particular case and not a solution in general. The proper solution is probably to get better starting values and (as this will make the problem only less likely) to catch a singular Hessian + issue a warning.

For my particular case I will have to see whether I can convince the mpath package to take different starting values (It is computing a whole LASSO path, so the probabilities may change a bit ...). But this is my problem 🙁

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHubhttps://protect-au.mimecast.com/s/DpBPCJyp0qhWnxrmcVRUXh?domain=github.com, or mute the threadhttps://protect-au.mimecast.com/s/cQWNCK1qJZty1LEvFvsXnF?domain=github.com.

atahk commented 5 years ago

I added error handling with the Hessian and improved the starting values a bit. This at least fixes the example above (both in that it finds the correct solution with the new default starting values and it handles the singular Hessian properly if given the old starting values). It's also worked fine on all the examples I've tried. It's not on CRAN yet and should probably be tested a bit more first. But, until then, it can be installed from the github version using devtools.