melff / mclogit

mclogit: Multinomial Logit Models, with or without Random Effects or Overdispersion
http://melff.github.io/mclogit/
22 stars 4 forks source link

Error when using nested random terms #21

Closed AidanCox12 closed 2 years ago

AidanCox12 commented 2 years ago

Hello and thank you for developing this package. I am running into a persistent error message while trying to construct a mixed effect, baseline-category multinomial logit and hope you can provide some insight.

I am trying to construct a model which describes how different factors influence the likelihood of several behavioral states among two species of shark. 19 individual animals, comprising two distinct species, were each tracked for ~100 days each and a different behavioral state was identified for each day data were collected. I would like to code individual shark as a categorical random effect variable (with 19 levels) within species, a categorical fixed effect variable (with 2 levels).

With this general idea, the code that I am currently trying to run is: mclogit::mblogit(cluster ~ 1, random = ~1|species/individual, data = df)

Which produces the error message:

Error in solve.default(X[[i]], ...) : 'a' (6 x 1) must be square

This error message is only produced in models where 'species' is included as a random effect and not when species is included as a fixed effect or individual is included along. This suggests that the error is in how the function is processing the 'species' variable specifically.

Here is a dummy sample of the data on which I am trying to train the model:

df <- data.frame(Date = c("2015-11-25", "2016-01-24", "2016-02-27", "2016-03-27", "2017-12-02", "2017-12-06", "2015-10-30", "2015-10-31", "2009-01-28", "2009-02-04", "2009-02-15", "2009-02-20", "2009-02-21", "2009-02-23"), cluster = factor(c(3,3,4,6,3,1,3,2, 2, 4, 3, 3, 3, 4)),species = factor(c("I.oxyrinchus", "I.oxyrinchus", "I.oxyrinchus", "I.oxyrinchus", "P.glauca", "P.glauca", "P.glauca", "P.glauca", "I.oxyrinchus", "I.oxyrinchus", "I.oxyrinchus", "P.glauca", "P.glauca", "P.glauca")),individual = factor(c("141257", "141257", "141254", "141254", "141256", "141256", "141255", "141255", "78680", "78682", "78683", "78680", "78682", "78683")))

Similar to the whole dataset, this model runs this code: mclogit::mblogit(cluster ~ 1, random = ~1|individual, data = df) But fails to run these: mclogit::mblogit(cluster ~ 1, random = ~1|species/individual, data = df) mclogit::mblogit(cluster ~ 1, random = ~1|species, data = df)

Interestingly, with the factor variable 'individual' has fewer distinct levels, it produces the same error message. Perhaps the breakpoint exists in how the function handles random variables with few discrete levels.

After following the error traceback the breakpoint seems to originate from:

lapply(Phi.start, solve) > PQLMQL_innerFit(y.star, X, Z, W, d, offset, method, estimator, control) > mmclogit.fitPQLMQL(y = Y, s = s, w = weights, X = XD, Z = ZD, d = d, method = method, estimator = estimator, control = control, offset = offset)

Based on a little digging, it seems that the error may lie in the object Z.

Can you please provide any more information about what is producing this error message and how I might work around it? Thank you very much for your time and assistance.

Cheers

melff commented 2 years ago

This appears to be fixed in release 0.9.4.