Thanks - I will look into it as soon as I can.
I also experience this error quite frequently. Here is a reprex (data uploaded here, vector of observation weights uploaded here):
dat <- read.csv("dat.csv", stringsAsFactors = TRUE)
w_obs <- read.csv("w_obs.csv")[[1]]
# Default fit (does not work):
mb_fit <- mclogit::mblogit(
  formula = y ~ xco.1 + xco.2 + xco.3 + xca.1 + xca.2,
  data = dat,
  random = ~ xco.1 | z.1,
  weights = w_obs,
  model = FALSE,
  y = FALSE
)
## --> Gives:
#
# Iteration 1 - deviance = 80.89403 - criterion = 0.9530748
# Iteration 2 - deviance = 73.66568 - criterion = 0.08980789
# Iteration 3 - deviance = 71.81153 - criterion = 0.03756291
# Iteration 4 - deviance = 69.95513 - criterion = 0.02145767
# Iteration 5 - deviance = 68.71881 - criterion = 0.01894345
# Iteration 6 - deviance = 69.63039 - criterion = 0.0154936
# Iteration 7 - deviance = 68.70349 - criterion = 0.01088839
# Iteration 8 - deviance = 68.69328 - criterion = 0.006974858
# Iteration 9 - deviance = 67.80436 - criterion = 0.007406875
# Iteration 10 - deviance = 69.87305 - criterion = 0.007387896
# Iteration 11 - deviance = 68.42121 - criterion = 0.00535707
# Iteration 12 - deviance = 65.89278 - criterion = 0.01301447
# Iteration 13 - deviance = 55.60609 - criterion = 0.02348028
# Iteration 14 - deviance = 46.98855 - criterion = 0.03045705
# Iteration 15 - deviance = 43.4102 - criterion = 0.03987452
# Error in solve.default(X[[i]], ...) :
#   system is computationally singular: reciprocal condition number = 1.87698e-19
##
# Modified fit 1 (works, but increasing `epsilon` is probably cheating):
mb_fit <- mclogit::mblogit(
  formula = y ~ xco.1 + xco.2 + xco.3 + xca.1 + xca.2,
  data = dat,
  random = ~ xco.1 | z.1,
  weights = w_obs,
  model = FALSE,
  y = FALSE,
  epsilon = 1e-1
)
## --> Gives:
#
# Iteration 1 - deviance = 80.89403 - criterion = 0.9530748
# Iteration 2 - deviance = 73.66568 - criterion = 0.08980789
# converged
##
# Modified fit 2 (works, but "Algorithm stopped due to false convergence"):
mb_fit <- mclogit::mblogit(
  formula = y ~ xco.1 + xco.2 + xco.3 + xca.1 + xca.2,
  data = dat,
  random = ~ xco.1 | z.1,
  weights = w_obs,
  model = FALSE,
  y = FALSE,
  avoid.increase = TRUE
)
## --> Gives:
# Stepsize halved - new deviance = 66.79285
# Stepsize halved - new deviance = 63.6818
# Stepsize halved - new deviance = 62.89986
# Stepsize halved - new deviance = 62.70058
# Stepsize halved - new deviance = 62.64825
# Stepsize halved - new deviance = 62.63382
# Stepsize halved - new deviance = 62.62953
# Stepsize halved - new deviance = 62.62811
# Stepsize halved - new deviance = 62.62758
# Stepsize halved - new deviance = 62.62736
# Stepsize halved - new deviance = 62.62727
# Stepsize halved - new deviance = 62.62722
# Stepsize halved - new deviance = 62.6272
# Stepsize halved - new deviance = 62.62719
# Stepsize halved - new deviance = 62.62718
# Stepsize halved - new deviance = 62.62718
#
# Iteration 1 - deviance = 62.62718 - criterion = 4.682507e-09
# converged
# Warning messages:
# 1: step size truncated due to possible divergence
# 2: Algorithm stopped due to false convergence
##
As mentioned in the inline code comments, "Modified fit 1" works, but increasing epsilon is probably cheating, so I wouldn't favor that solution. "Modified fit 2" works as well, but the warning "Algorithm stopped due to false convergence" probably indicates that convergence is not actually achieved there either (although it says "converged" directly above the warnings). So I wonder if there is some "tuning setup" that the user could choose to achieve convergence (a sketch of what I mean is below). If not, would it be possible to change the underlying code in the mclogit package to make the algorithm converge in such cases?
If I am wrong and "Modified fit 2" is actually converging (despite the warning "Algorithm stopped due to false convergence"), then perhaps that warning message could be formulated differently?
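To make that question concrete, here is the kind of "tuning setup" I have in mind. I am assuming that the control arguments documented in ?mclogit::mmclogit.control (epsilon, maxit, avoid.increase, break.on.increase, ...) are the intended knobs; which combination, if any, is sensible here is exactly my question:
# Hypothetical "tuning setup"; argument names as documented in
# ?mclogit::mmclogit.control, values chosen arbitrarily:
mb_fit <- mclogit::mblogit(
  formula = y ~ xco.1 + xco.2 + xco.3 + xca.1 + xca.2,
  data = dat,
  random = ~ xco.1 | z.1,
  weights = w_obs,
  control = mclogit::mmclogit.control(
    epsilon = 1e-08,       # keep the strict default tolerance
    maxit = 100,           # allow more iterations than the default
    avoid.increase = TRUE  # step-halve whenever the deviance would increase
  )
)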
Thanks in advance!
Thanks for the replication material. It allowed me to test whether the recent revisions work.
Your code
mb_fit <- mclogit::mblogit(
  formula = y ~ xco.1 + xco.2 + xco.3 + xca.1 + xca.2,
  data = dat,
  random = ~ xco.1 | z.1,
  weights = w_obs,
  model = FALSE,
  y = FALSE
)
mb_fit
results in
mclogit::mblogit(formula = y ~ xco.1 + xco.2 + xco.3 + xca.1 +
xca.2, data = dat, random = ~xco.1 | z.1, weights = w_obs,
model = FALSE, y = FALSE)
Coefficients:
Predictors
Response categories (Intercept) xco.1 xco.2 xco.3 xca.1lvl2 xca.1lvl3 xca.2lvl2
y2/y1 -3.39830 1.25987 -0.27576 3.85874 1.68203 -9.16883 -6.88392
y3/y1 -0.71716 0.02455 0.31685 0.66200 0.67266 -0.51303 -0.55805
(Co-)Variances:
Grouping level: z.1
y2~1 y3~1 y2~xco.1 y3~xco.1
y2~1 141.1776
y3~1 2.2635 5.9994
y2~xco.1 -15.6402 -11.7495 111.0851
y3~xco.1 -0.2452 -0.2933 2.0815 3.4029
Null Deviance: 90.09
Residual Deviance: 43.41
Note: Algorithm did not converge.
and
summary(mb_fit)
gives
Call:
mclogit::mblogit(formula = y ~ xco.1 + xco.2 + xco.3 + xca.1 +
xca.2, data = dat, random = ~xco.1 | z.1, weights = w_obs,
model = FALSE, y = FALSE)
Equation for y2 vs y1:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.3983 7.7883 -0.436 0.6626
xco.1 1.2599 5.3750 0.234 0.8147
xco.2 -0.2758 1.3850 -0.199 0.8422
xco.3 3.8587 2.2415 1.722 0.0852 .
xca.1lvl2 1.6820 9.7339 0.173 0.8628
xca.1lvl3 -9.1688 586.6832 -0.016 0.9875
xca.2lvl2 -6.8839 4.3594 -1.579 0.1143
Equation for y3 vs y1:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.71716 1.44264 -0.497 0.619
xco.1 0.02455 0.93892 0.026 0.979
xco.2 0.31685 0.42296 0.749 0.454
xco.3 0.66200 0.56547 1.171 0.242
xca.1lvl2 0.67266 1.71939 0.391 0.696
xca.1lvl3 -0.51303 1.78021 -0.288 0.773
xca.2lvl2 -0.55805 1.59196 -0.351 0.726
(Co-)Variances:
Grouping level: z.1
           Estimate                                 Std.Err.
y2~1       141.1776                                 151.080
y3~1         2.2635   5.9994                         43.480  19.578
y2~xco.1   -15.6402 -11.7495 111.0851               117.552  52.455 175.070
y3~xco.1    -0.2452  -0.2933   2.0815  3.4029        27.315   7.929  26.910  6.487
Null Deviance: 90.09
Residual Deviance: 43.41
Number of Fisher Scoring iterations: 16
Number of observations
Groups by z.1: 6
Individual observations: 41
Note: Algorithm did not converge.
The large estimates and even larger standard errors suggest that the data shows separation in the dependent variable. I should add that with 41 observations overall and 6 groups it would be very surprising to get stable results. With a model that has 22 parameters, you are asking quite a lot from data with just 41 observations.
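If you want to verify the separation yourself, a crude first check (it ignores the observation weights) is to cross-tabulate the response against the categorical predictors; empty cells mean that some response category is never observed at some predictor level, which is one source of separation:
# Crude separation check (ignores the weights w_obs): empty cells mean a
# response category never occurs at some level of the categorical predictor.
with(dat, table(y, xca.1))
with(dat, table(y, xca.2))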
Thanks for your reply.
I should have mentioned that this is not a real-world example, but rather simulated data from projpred's unit tests (I am planning to use the mclogit package within projpred). It might well be that the simulated data is so extreme that separation occurs.
What I was unsure about was rather the potential contradiction between the warning "Algorithm stopped due to false convergence" and the printed output "converged" directly above the warnings. But if I understand your comment here correctly, the printed output "converged" may be due to the value of the objective function not changing much anymore, while the warning "Algorithm stopped due to false convergence" indicates that the parameter estimates are still changing nonetheless (going to +/- infinity). Did I get that correctly?
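As a sanity check of my own understanding, the same phenomenon can be provoked in plain logistic regression with a separated toy dataset of my own (nothing to do with the reprex above): the deviance criterion is met, so the fit is reported as converged, yet the slope estimate is essentially on its way to +infinity.
# Toy example (mine, unrelated to the reprex): perfect separation in a
# plain logistic regression.
x <- c(-2, -1, 1, 2)
y01 <- c(0, 0, 1, 1)  # perfectly separated by the sign of x
fit <- glm(y01 ~ x, family = binomial)
fit$converged              # typically TRUE: the deviance stops changing
summary(fit)$coefficients  # yet a huge slope with an even larger standard error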
I am frequently encountering fitting errors in mblogit, returning me the error "Error in solve.default(Information) : system is computationally singular: reciprocal condition number" for models that run OK in nnet::multinom. Would you happen to have any idea on how to resolve these? A reproducible example is given below. It could be that there is some problem with complete separation / some Hauck-Donner effect, but given that it happens so frequently, also in situations where the same fit in nnet::multinom runs OK, I have the feeling there might be some bug in mblogit. Either way, any advice on how to get rid of this error would be welcome! In logistic regression I recall that one can add a small amount of noise to zero counts or add some ridge regularisation (e.g. by row-augmenting the covariate matrix with a diagonal matrix with sqrt(lambda) on the diagonal, where lambda is the ridge penalty) to get rid of this sort of problem - I don't know if this could be a solution for multinomial regression as well (a sketch of the idea is below).
As you can see, this fit returns unusually large standard errors, and increasing maxit to > 30 results in the error quoted above. It could be that this is due to complete separation / some Hauck-Donner effect, but given that this model runs OK in nnet::multinom, I am tempted to think it could rather be a bug in mblogit.
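For concreteness, here is a minimal sketch of the ridge idea using the weight-decay penalty that nnet::multinom inherits from nnet (decay adds an L2 penalty on the coefficients); the penalty value is arbitrary, and this of course ignores the random-effects part of the mblogit model:
# Minimal ridge sketch via nnet's weight decay; decay = lambda adds an
# L2 penalty on the coefficients. This drops the random effects entirely.
mn_fit <- nnet::multinom(
  y ~ xco.1 + xco.2 + xco.3 + xca.1 + xca.2,
  data = dat,
  weights = w_obs,
  decay = 1e-2  # small, arbitrary ridge penalty
)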