lme4 / lme4

Mixed-effects models in R using S4 classes and methods with RcppEigen

Errors in glmer: Error in eval(expr, envir, enclos): cannot find valid starting values: please specify some #634

Closed mrml500 closed 3 years ago

mrml500 commented 3 years ago

Hello, I have a model with an offset term. I am interested in the time that individuals from two different groups spent in zone A, as a proportion of the time they spent in either zone. To do this I have specified the model like so:

model <- glmer(time_at_a ~ group + offset(log(time_at_both)) + (1|id), 
               data = dataset, 
               family = gaussian(link="log"))

From my understanding, using log(time_at_both) as an offset effectively makes the model estimate the time in zone A as a proportion of the total time spent in both zones, and using a log link keeps the predicted time > 0. However, when I run this I get:

Error in eval(family$initialize, rho) : 
  cannot find valid starting values: please specify some
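
The offset reasoning above can be checked with plain arithmetic. This is a minimal sketch with made-up numbers (total_time and eta are hypothetical and not taken from the dataset): under a log link, adding log(total) to the linear predictor multiplies the fitted mean by the total, so exp(eta) plays the role of the expected proportion.

```r
# Hypothetical values for illustration only (not from the posted data)
total_time <- c(10, 20, 40)   # stands in for time_at_both
eta <- log(0.6)               # linear predictor without the offset

# Log link with offset: log(mu) = eta + log(total_time)
mu <- exp(eta + log(total_time))

# Dividing the fitted mean by the total recovers exp(eta)
prop <- mu / total_time
print(prop)
```

Note that this identity only holds when every total is strictly positive, since log() is not finite at zero.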

From looking at this thread I tried changing the mustart values like so:

model <- glmer(time_at_a ~ group + offset(time_at_both) + (1|id), 
               data = dataset, 
               family = gaussian(link="log"), mustart=pmax(dataset$time_at_a,1e-3))

Error in (function (fr, X, reTrms, family, nAGQ = 1L, verbose = 0L, maxit = 100L,  : 
  Downdated VtV is not positive definite

But I get the "Downdated VtV is not positive definite" error. Adding "control=glmerControl(optimizer=c("Nelder_Mead","bobyqa"))" to the call did not make a difference.

Is there a way of getting around this? I assume it has something to do with my data and using the log offset?

mrml500 commented 3 years ago

Here is the data if you would like to reproduce it:

dataset <- structure(list(id = c("NAT1", "NAT2", "NAT3", "NAT5", "NAT6", 
"NAT7", "NAT1", "NAT2", "NAT4", "NAT5", "NAT7", "NAT1", "NAT2", 
"NAT3", "NAT4", "NAT5", "NAT7", "NAT1", "NAT2", "NAT3", "NAT4", 
"NAT5", "NAT7", "NAT1", "NAT2", "NAT3", "NAT4", "NAT6", "NAT7", 
"NAT1", "NAT2", "NAT3", "NAT4", "NAT5", "NAT6", "NAT7", "NAT1", 
"NAT2", "NAT3", "NAT4", "NAT5", "NAT6", "NAT7", "NAT1", "NAT2", 
"NAT3", "NAT4", "NAT5", "NAT6", "NAT7", "NAT1", "NAT2", "NAT3", 
"NAT5", "NAT6", "NAT1", "NAT2", "NAT3", "NAT4", "NAT5", "NAT6", 
"NAT7", "NAT1", "NAT2", "NAT3", "NAT4", "NAT5", "NAT6", "NAT7", 
"NAT1", "NAT2", "NAT3", "NAT4", "NAT5", "NAT6", "NAT7", "NAT1", 
"NAT2", "NAT3", "NAT4", "NAT5", "NAT6", "NAT7", "NAT1", "NAT2", 
"NAT3", "NAT4", "NAT6", "NAT7", "NAT1", "NAT2", "NAT3", "NAT4", 
"NAT5", "NAT6", "NAT7", "GRA1", "GRA2", "GRA4", "GRA5", "GRA7", 
"GRA1", "GRA2", "GRA3", "GRA4", "GRA5", "GRA6", "GRA1", "GRA2", 
"GRA3", "GRA4", "GRA5", "GRA6", "GRA7", "GRA1", "GRA3", "GRA4", 
"GRA5", "GRA6", "GRA7", "GRA1", "GRA3", "GRA4", "GRA6", "GRA1", 
"GRA3", "GRA4", "GRA5", "GRA7", "GRA1", "GRA2", "GRA4", "GRA5", 
"GRA1", "GRA2", "GRA3", "GRA4", "GRA5", "GRA7", "GRA1", "GRA2", 
"GRA3", "GRA4", "GRA5", "GRA6", "GRA1", "GRA2", "GRA3", "GRA5", 
"GRA6", "GRA7", "GRA1", "GRA2", "GRA3", "GRA4", "GRA5", "GRA6", 
"GRA7", "GRA1", "GRA2", "GRA3", "GRA4", "GRA5", "GRA6", "GRA7", 
"GRA1", "GRA2", "GRA6", "GRA7", "GRA1", "GRA2", "GRA3", "GRA4", 
"GRA5", "GRA6", "GRA7", "GRA1", "GRA2", "GRA3", "GRA4", "GRA5", 
"GRA6", "GRA7"), date = c("31/01/2020", "31/01/2020", "31/01/2020", 
"31/01/2020", "31/01/2020", "31/01/2020", "03/02/2020", "03/02/2020", 
"03/02/2020", "03/02/2020", "03/02/2020", "04/02/2020", "04/02/2020", 
"04/02/2020", "04/02/2020", "04/02/2020", "04/02/2020", "05/02/2020", 
"05/02/2020", "05/02/2020", "05/02/2020", "05/02/2020", "05/02/2020", 
"06/02/2020", "06/02/2020", "06/02/2020", "06/02/2020", "06/02/2020", 
"06/02/2020", "07/02/2020", "07/02/2020", "07/02/2020", "07/02/2020", 
"07/02/2020", "07/02/2020", "07/02/2020", "10/02/2020", "10/02/2020", 
"10/02/2020", "10/02/2020", "10/02/2020", "10/02/2020", "10/02/2020", 
"11/02/2020", "11/02/2020", "11/02/2020", "11/02/2020", "11/02/2020", 
"11/02/2020", "11/02/2020", "12/02/2020", "12/02/2020", "12/02/2020", 
"12/02/2020", "12/02/2020", "13/02/2020", "13/02/2020", "13/02/2020", 
"13/02/2020", "13/02/2020", "13/02/2020", "13/02/2020", "14/02/2020", 
"14/02/2020", "14/02/2020", "14/02/2020", "14/02/2020", "14/02/2020", 
"14/02/2020", "17/02/2020", "17/02/2020", "17/02/2020", "17/02/2020", 
"17/02/2020", "17/02/2020", "17/02/2020", "18/02/2020", "18/02/2020", 
"18/02/2020", "18/02/2020", "18/02/2020", "18/02/2020", "18/02/2020", 
"19/02/2020", "19/02/2020", "19/02/2020", "19/02/2020", "19/02/2020", 
"19/02/2020", "20/02/2020", "20/02/2020", "20/02/2020", "20/02/2020", 
"20/02/2020", "20/02/2020", "20/02/2020", "31/01/2020", "31/01/2020", 
"31/01/2020", "31/01/2020", "31/01/2020", "03/02/2020", "03/02/2020", 
"03/02/2020", "03/02/2020", "03/02/2020", "03/02/2020", "04/02/2020", 
"04/02/2020", "04/02/2020", "04/02/2020", "04/02/2020", "04/02/2020", 
"04/02/2020", "05/02/2020", "05/02/2020", "05/02/2020", "05/02/2020", 
"05/02/2020", "05/02/2020", "06/02/2020", "06/02/2020", "06/02/2020", 
"06/02/2020", "07/02/2020", "07/02/2020", "07/02/2020", "07/02/2020", 
"07/02/2020", "10/02/2020", "10/02/2020", "10/02/2020", "10/02/2020", 
"11/02/2020", "11/02/2020", "11/02/2020", "11/02/2020", "11/02/2020", 
"11/02/2020", "12/02/2020", "12/02/2020", "12/02/2020", "12/02/2020", 
"12/02/2020", "12/02/2020", "13/02/2020", "13/02/2020", "13/02/2020", 
"13/02/2020", "13/02/2020", "13/02/2020", "14/02/2020", "14/02/2020", 
"14/02/2020", "14/02/2020", "14/02/2020", "14/02/2020", "14/02/2020", 
"17/02/2020", "17/02/2020", "17/02/2020", "17/02/2020", "17/02/2020", 
"17/02/2020", "17/02/2020", "18/02/2020", "18/02/2020", "18/02/2020", 
"18/02/2020", "19/02/2020", "19/02/2020", "19/02/2020", "19/02/2020", 
"19/02/2020", "19/02/2020", "19/02/2020", "20/02/2020", "20/02/2020", 
"20/02/2020", "20/02/2020", "20/02/2020", "20/02/2020", "20/02/2020"
), time_at_a = c(10.4, 16, 7.8, 2.2, 0, 14.8, 0, 8.4, 21, 25.2, 
6.4, 2.2, 7.8, 4, 0, 8.2, 74.4, 12, 21.6, 0, 1.6, 1, 28.4, 41, 
25, 28.8, 1.6, 0, 27.4, 3.6, 17.2, 42.2, 6.4, 0, 4.6, 101.6, 
10.6, 20.4, 0, 1.6, 1.8, 17.2, 5.2, 14, 25, 5, 0, 7.6, 2.6, 9.2, 
8.8, 3.8, 10.2, 3.6, 290.4, 3.4, 1.6, 17.2, 3, 7, 13, 8.4, 16.8, 
12.4, 23.2, 5.4, 3, 4.4, 30.2, 8.2, 2.4, 4.6, 6, 0, 1.2, 14.6, 
34, 3.2, 9.2, 10.6, 2.4, 64, 3.8, 4, 7.6, 11.6, 3.8, 5, 13.8, 
49.6, 6.6, 4.2, 4.4, 1.2, 12.6, 3.4, 12.6, 0, 0, 9.2, 3.2, 26.8, 
10.6, 8.4, 12.4, 5.8, 5.4, 38.6, 2.2, 2, 0, 0, 17.2, 11.4, 96.2, 
0.6, 0, 0, 4.6, 13.8, 20.6, 4.8, 8, 13.2, 115.6, 7, 13.6, 25.6, 
4.4, 6.2, 10, 0.6, 18.4, 10.2, 18.2, 25.4, 8.6, 6.2, 7.6, 2.2, 
15.8, 46.6, 6, 22.4, 3.6, 6.2, 2.2, 16, 127.2, 0, 39.2, 97.6, 
0.8, 0.8, 0, 3, 0.8, 4.6, 131, 7.2, 5, 0, 0, 0, 0, 15.6, 0, 6.6, 
0, 7, 1.8, 2.8, 10.6, 3, 3.4, 2.2, 59.4, 0, 10.2, 25.4, 4.8, 
0.6, 0), group = c("NAT", "NAT", "NAT", "NAT", "NAT", "NAT", 
"NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", 
"NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", 
"NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", 
"NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", 
"NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", 
"NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", 
"NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", 
"NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", 
"NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", 
"NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", "NAT", 
"GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", 
"GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", 
"GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", 
"GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", 
"GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", 
"GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", 
"GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", 
"GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", 
"GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", "GRA", 
"GRA", "GRA", "GRA", "GRA", "GRA", "GRA"), time_at_b = c(0, 3, 
0, 0, 1.2, 0, 3.2, 1.6, 23.6, 28.8, 0, 0, 2.8, 0, 1.4, 0, 2.2, 
2, 0, 5.4, 12.6, 0, 0, 0, 0, 0, 0, 1.2, 5.2, 0, 4.6, 0, 13.4, 
15.8, 0, 1.4, 9, 3.4, 4.4, 2, 7.4, 5, 11, 5.8, 2.8, 1.4, 4.8, 
0, 1.4, 6.8, 3.2, 0.6, 0, 15.4, 0, 1.2, 0, 11.8, 25.4, 0, 0, 
1.8, 1, 0, 0, 7, 2, 0, 0.6, 0.6, 0, 0, 11.2, 3.4, 0, 4.6, 0.8, 
19.8, 0, 24.2, 2.4, 1.6, 1.4, 0, 7, 2, 0, 0, 4.8, 0, 11.2, 0, 
6.6, 4.8, 0.8, 0, 0, 0, 2.8, 0, 0, 0.8, 3.6, 6.8, 0, 1.4, 0, 
3.4, 1.6, 5, 1.4, 2.2, 0, 6.6, 6.2, 1.2, 2, 4.8, 0, 0, 4.8, 2.8, 
11, 0, 2.4, 0, 0, 0, 7.4, 60.8, 0, 2.6, 1.6, 22.6, 0, 0.8, 11.4, 
1.2, 9.4, 10.4, 0, 0, 4, 0, 0.6, 11, 8.4, 2.4, 3.4, 3.2, 0, 2.4, 
3.4, 4, 6.8, 5, 0, 1.2, 0, 0.6, 9.2, 2.2, 2, 1.6, 3.8, 2.2, 3.8, 
20, 11.4, 2.6, 0, 3.2, 1.8, 2, 0, 5.4, 0.8, 11.6, 0, 2, 7.6, 
1.8, 8), time_at_both = c(10.4, 19, 7.8, 2.2, 1.2, 14.8, 3.2, 
10, 44.6, 54, 6.4, 2.2, 10.6, 4, 1.4, 8.2, 76.6, 14, 21.6, 5.4, 
14.2, 1, 28.4, 41, 25, 28.8, 1.6, 1.2, 32.6, 3.6, 21.8, 42.2, 
19.8, 15.8, 4.6, 103, 19.6, 23.8, 4.4, 3.6, 9.2, 22.2, 16.2, 
19.8, 27.8, 6.4, 4.8, 7.6, 4, 16, 12, 4.4, 10.2, 19, 290.4, 4.6, 
1.6, 29, 28.4, 7, 13, 10.2, 17.8, 12.4, 23.2, 12.4, 5, 4.4, 30.8, 
8.8, 2.4, 4.6, 17.2, 3.4, 1.2, 19.2, 34.8, 23, 9.2, 34.8, 4.8, 
65.6, 5.2, 4, 14.6, 13.6, 3.8, 5, 18.6, 49.6, 17.8, 4.2, 11, 
6, 13.4, 3.4, 12.6, 0, 2.8, 9.2, 3.2, 27.6, 14.2, 15.2, 12.4, 
7.2, 5.4, 42, 3.8, 7, 1.4, 2.2, 17.2, 18, 102.4, 1.8, 2, 4.8, 
4.6, 13.8, 25.4, 7.6, 19, 13.2, 118, 7, 13.6, 25.6, 11.8, 67, 
10, 3.2, 20, 32.8, 18.2, 26.2, 20, 7.4, 17, 12.6, 15.8, 46.6, 
10, 22.4, 4.2, 17.2, 10.6, 18.4, 130.6, 3.2, 39.2, 100, 4.2, 
4.8, 6.8, 8, 0.8, 5.8, 131, 7.8, 14.2, 2.2, 2, 1.6, 3.8, 17.8, 
3.8, 26.6, 11.4, 9.6, 1.8, 6, 12.4, 5, 3.4, 7.6, 60.2, 11.6, 
10.2, 27.4, 12.4, 2.4, 8)), row.names = c(182L, 211L, 243L, 303L, 
338L, 344L, 204L, 214L, 266L, 290L, 351L, 192L, 215L, 249L, 287L, 
293L, 340L, 180L, 207L, 261L, 281L, 307L, 342L, 176L, 205L, 236L, 
282L, 337L, 343L, 188L, 210L, 235L, 273L, 313L, 322L, 339L, 181L, 
208L, 262L, 283L, 305L, 317L, 353L, 179L, 206L, 245L, 288L, 294L, 
324L, 348L, 184L, 220L, 241L, 298L, 315L, 189L, 228L, 238L, 279L, 
296L, 318L, 349L, 178L, 212L, 237L, 275L, 300L, 323L, 341L, 185L, 
226L, 246L, 274L, 314L, 329L, 345L, 177L, 222L, 242L, 270L, 302L, 
316L, 356L, 187L, 216L, 240L, 278L, 321L, 346L, 175L, 218L, 248L, 
277L, 306L, 319L, 357L, 12L, 49L, 102L, 109L, 166L, 8L, 34L, 
60L, 81L, 112L, 135L, 7L, 41L, 71L, 98L, 125L, 132L, 155L, 4L, 
75L, 99L, 126L, 136L, 154L, 10L, 65L, 86L, 133L, 2L, 61L, 80L, 
106L, 164L, 17L, 35L, 97L, 108L, 15L, 31L, 56L, 85L, 111L, 159L, 
25L, 32L, 55L, 88L, 107L, 137L, 18L, 42L, 57L, 105L, 145L, 153L, 
3L, 45L, 73L, 100L, 117L, 142L, 163L, 1L, 37L, 63L, 101L, 127L, 
146L, 169L, 11L, 47L, 134L, 170L, 16L, 43L, 68L, 84L, 118L, 138L, 
167L, 6L, 48L, 58L, 79L, 114L, 143L, 171L), class = "data.frame")

mmaechler commented 3 years ago

In your second call you used offset(time_at_both), whereas originally you had (for a good reason, I assume) offset(log(time_at_both)). Could this be the culprit?

mrml500 commented 3 years ago

You are correct, and this was a mistake. However, I get a different error when I run it with the log offset, like so:

model <- glmer(time_at_a ~ group + offset(log(time_at_both)) + (1|id), 
               data = dataset, 
               family = gaussian(link="log"), mustart=pmax(dataset$time_at_a,1e-3))

Error in (function (fr, X, reTrms, family, nAGQ = 1L, verbose = 0L, maxit = 100L,  : 
  (maxstephalfit) PIRLS step-halvings failed to reduce deviance in pwrssUpdate

bbolker commented 3 years ago

You have zero values in your offset term, so taking the log gives non-finite (-Inf) values. Try this:

library(lme4)
# Smallest strictly positive total time in the data
minval <- min(dataset$time_at_both[dataset$time_at_both > 0])
# Replace zero totals with half that value so that log() stays finite
fdata <- transform(dataset, time_at_both = pmax(minval/2, time_at_both))
model <- glmer(time_at_a ~ group + offset(log(time_at_both)) + (1|id),
               data = fdata,
               family = gaussian(link="log"),
               mustart = pmax(fdata$time_at_a, 1e-3))
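
To see which rows are responsible before applying a workaround like the one above, a quick check along the following lines may help. It is a sketch in base R on a small hypothetical vector (the values are illustrative, and log_offset, bad_rows and fixed are names introduced here); the same expressions could be applied to dataset$time_at_both.

```r
# Hypothetical excerpt of total times, including one zero
time_at_both <- c(12.6, 0, 2.8, 9.2)

# log(0) is -Inf in R, which makes the offset non-finite
log_offset <- log(time_at_both)
bad_rows <- which(!is.finite(log_offset))
print(bad_rows)

# Same clamping idea as in the reply: half the smallest positive value
minval <- min(time_at_both[time_at_both > 0])
fixed <- pmax(minval / 2, time_at_both)
print(log(fixed))
```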

Having to force zero values to be non-zero so you can log-transform them could indicate a problem with your data and/or your model that shouldn't be swept under the rug ...
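
One way to look at the concern above, offered as an assumption rather than something recommended in this thread: when the total time is zero, the proportion of time in zone A is 0/0 and therefore undefined, so such rows might be inspected and possibly set aside instead of being clamped. The sketch below uses a small hypothetical data frame (toy, undefined and kept are illustrative names) and base R only.

```r
# Hypothetical data: one row has a total time of zero
toy <- data.frame(
  id           = c("NAT1", "NAT2", "GRA1", "GRA2"),
  time_at_a    = c(12.6, 0, 0, 9.2),
  time_at_both = c(12.6, 0, 2.8, 9.2)
)

# Rows where the proportion time_at_a / time_at_both is undefined
undefined <- toy$time_at_both == 0
print(toy[undefined, ])

# Keep only rows with a well-defined proportion
kept <- toy[!undefined, ]
kept$prop_a <- kept$time_at_a / kept$time_at_both
print(kept)
```

Whether dropping such rows is appropriate depends on why the totals are zero, which only the data collector can judge.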