atahk / pscl

Political Science Computational Laboratory
65 stars 14 forks source link

Bug fixes for `pscl` #17

Closed zmeers closed 1 year ago

zmeers commented 1 year ago

This PR solves two bugs in pscl.

(1) In hurdle.R and zeroinfl.R, some matrices have not specified the drop = FALSE argument. As documented in an open GitHub issue (https://github.com/atahk/pscl/issues/16), an error occurs when we have 1x1 matrix. Because we do not specify the drop = FALSE argument, R returns a numeric value. This later messes up the downstream matrix calculations in the function. Adding drop = FALSE solves any downstream issues.

(2) Secondly, another error (https://github.com/atahk/pscl/issues/11) was noted in the GitHub repo whereby when users set the control to an EM algo within a zero-inflated count model, they received the following error: object 'model_count' not found. This error occurs because the model_count object was defined within an if statement in L310 of zeroinfl.R, where the convergence of the stats::optim() function must be greater than 0. But docs for stats::optim note that 0 means the result converged and a result greater than 0 corresponds to a particular warning. So it seems like this is a coding error:

L.310 in zeroinfl.R:

if(fit$convergence > 0)

(fit was the assigned name of the optim function).

We would ideally expect the results of optim$convergence to be 0 because, as noted above, this means that the result converged. So L310 has been changed to if(fit$convergence == 0)).

Then the following if() statements can be changed to also specify that the fit$converge must also equal zero:

if(ocontrol$EM & dist == "poisson")
if(ocontrol$EM & dist == "geometric")
if(ocontrol$EM & dist == "negbin")

should become

if(ocontrol$EM & dist == "poisson" & fit$convergence == 0)
if(ocontrol$EM & dist == "geometric" & fit$convergence == 0)
if(ocontrol$EM & dist == "negbin" & fit$convergence == 0)

Because if(fit$convergence > 0) becomes if(fit$convergence == 0), zero-inflated model with an EM algorithm now work as expected.

It also seems like the general convergence issue is addressed further on in the function with this warning:

if(fit$convergence > 0) warning("optimization failed to converge")

So long as fit$convergence == 0, the model_count object should exist. If not, the warning will return. When running the following example, the proposed changes work as expected: fm2 <- zeroinfl(art ~ ., data = bioChemists, EM = TRUE).

In summary:

simonjackman commented 1 year ago

Thanks Zoe. Thanks Alex. Hope all is well!!!