mccarthy-m-g opened 1 year ago
Here's my attempt at translating the SAS code from the previous comment. The website also has nlme code examples, but they're a bit messy and they forgot to centre `read` in Model B.
The estimates and plots are similar to those reported in the textbook, but they're not the same. I'm not sure if this is a discrepancy between nlme and SAS, or if the nlme syntax here is wrong in a sneaky way. There isn't much discussion on the internet comparing `proc nlmixed` and `nlme::nlme()`, and I don't have a SAS license to do any testing. The only discussion I could find was this Stack Exchange question.
I'm also not sure if it's possible to specify these models using the `deriv()` approach (see later examples).
# remotes::install_github("mccarthy-m-g/alda")
library(alda)
library(dplyr)
library(tidyr)
library(ggplot2)
library(nlme)
# Model A ---------------------------------------------------------------------
# Model fit
cognitive_growth_fit_A <- nlme(
nmoves ~ 1 + 19 / (1 + g00 * exp(-(g10*game + u1*game + u0))),
fixed = g00 + g10 ~ 1,
random = u0 + u1 ~ 1 | id,
start = c(g00 = 12, g10 = .2),
data = cognitive_growth
)
# Table 6.6
summary(cognitive_growth_fit_A)
#> Nonlinear mixed-effects model fit by maximum likelihood
#> Model: nmoves ~ 1 + 19/(1 + g00 * exp(-(g10 * game + u1 * game + u0)))
#> Data: cognitive_growth
#> AIC BIC logLik
#> 2493.691 2518.28 -1240.846
#>
#> Random effects:
#> Formula: list(u0 ~ 1, u1 ~ 1)
#> Level: id
#> Structure: General positive-definite, Log-Cholesky parametrization
#> StdDev Corr
#> u0 0.67804063 u0
#> u1 0.07502427 -0.821
#> Residual 3.67947376
#>
#> Fixed effects: g00 + g10 ~ 1
#> Value Std.Error DF t-value p-value
#> g00 10.741638 2.3508160 427 4.569323 0
#> g10 0.113036 0.0201006 427 5.623501 0
#> Correlation:
#> g00
#> g10 0.817
#>
#> Standardized Within-Group Residuals:
#> Min Q1 Med Q3 Max
#> -2.6168128 -0.5637287 -0.1259902 0.5704647 3.5385195
#>
#> Number of Observations: 445
#> Number of Groups: 17
# Figure 6.10
n_games <- tibble(game = seq(from = 0, to = 30, by = 0.1))
n_moves <- tibble(nmoves = predict(cognitive_growth_fit_A, n_games, level = 0))
n_games |>
bind_cols(n_moves) |>
ggplot(aes(x = game, y = nmoves)) +
geom_line() +
coord_cartesian(ylim = c(0, 25))
# Model B ---------------------------------------------------------------------
# Model fit
cognitive_growth_fit_B <- nlme(
nmoves ~
1 +
19 / (1 + g00 * exp(-(g10*game + g11*read*game + u1*game + g01*read + u0))),
fixed = g00 + g01 + g10 + g11 ~ 1,
random = u0 + u1 ~ 1 | id,
start = c(g00 = 12, g01 = 0, g10 = .2, g11 = 0),
data = mutate(
cognitive_growth,
read = reading_score - mean(reading_score)
)
)
# Table 6.6
summary(cognitive_growth_fit_B)
#> Nonlinear mixed-effects model fit by maximum likelihood
#> Model: nmoves ~ 1 + 19/(1 + g00 * exp(-(g10 * game + g11 * read * game + u1 * game + g01 * read + u0)))
#> Data: mutate(cognitive_growth, read = reading_score - mean(reading_score))
#> AIC BIC logLik
#> 2495.652 2528.436 -1239.826
#>
#> Random effects:
#> Formula: list(u0 ~ 1, u1 ~ 1)
#> Level: id
#> Structure: General positive-definite, Log-Cholesky parametrization
#> StdDev Corr
#> u0 0.61431317 u0
#> u1 0.06864452 -0.783
#> Residual 3.68126962
#>
#> Fixed effects: g00 + g01 + g10 + g11 ~ 1
#> Value Std.Error DF t-value p-value
#> g00 10.784471 2.2457344 425 4.802202 0.0000
#> g01 -0.331551 0.2724946 425 -1.216727 0.2244
#> g10 0.113178 0.0187545 425 6.034731 0.0000
#> g11 0.036791 0.0245775 425 1.496930 0.1352
#> Correlation:
#> g00 g01 g10
#> g01 -0.047
#> g10 0.793 -0.029
#> g11 0.030 -0.796 0.018
#>
#> Standardized Within-Group Residuals:
#> Min Q1 Med Q3 Max
#> -2.5946647 -0.5869675 -0.1253522 0.5680446 3.5759966
#>
#> Number of Observations: 445
#> Number of Groups: 17
# Figure 6.10
n_games <- crossing(
game = seq(from = 0, to = 30, by = 0.1), read = c(-1.58, 1.58)
)
n_moves <- tibble(nmoves = predict(cognitive_growth_fit_B, n_games, level = 0))
n_games |>
bind_cols(n_moves) |>
mutate(read = factor(read)) |>
ggplot(aes(x = game, y = nmoves, colour = read)) +
geom_line() +
coord_cartesian(ylim = c(0, 25))
Created on 2023-05-27 with reprex v2.0.2
Here's my attempt at translating the textbook equations from the previous comment.
There are two things that really stand out here: the random-effect correlations estimated at (or very near) 1, and the large estimate for `g00.read` in Model B. These lead me to think the models are misspecified, but I can't figure out where, since I was under the impression this was the correct syntax for adding fixed covariates in `nlme()`. (A quick diagnostic sketch follows the reprex below.)
# remotes::install_github("mccarthy-m-g/alda")
library(alda)
library(dplyr)
library(tidyr)
library(ggplot2)
library(nlme)
# Model A ---------------------------------------------------------------------
# Model fit
cognitive_growth_fit_A <- nlme(
nmoves ~ 1 + 19 / (1 + (g00 + u0) * exp(-(g10*game + u1*game))),
fixed = g00 + g10 ~ 1,
random = u0 + u1 ~ 1 | id,
start = c(g00 = 12, g10 = .2),
data = cognitive_growth
)
# Table 6.6
summary(cognitive_growth_fit_A)
#> Nonlinear mixed-effects model fit by maximum likelihood
#> Model: nmoves ~ 1 + 19/(1 + (g00 + u0) * exp(-(g10 * game + u1 * game)))
#> Data: cognitive_growth
#> AIC BIC logLik
#> 2496.188 2520.777 -1242.094
#>
#> Random effects:
#> Formula: list(u0 ~ 1, u1 ~ 1)
#> Level: id
#> Structure: General positive-definite, Log-Cholesky parametrization
#> StdDev Corr
#> u0 5.0897079 u0
#> u1 0.0737091 0.999
#> Residual 3.7361864
#>
#> Fixed effects: g00 + g10 ~ 1
#> Value Std.Error DF t-value p-value
#> g00 13.907570 1.7910177 427 7.765177 0
#> g10 0.123399 0.0191151 427 6.455578 0
#> Correlation:
#> g00
#> g10 0.862
#>
#> Standardized Within-Group Residuals:
#> Min Q1 Med Q3 Max
#> -2.83524865 -0.50639168 -0.09850201 0.58666787 3.48010074
#>
#> Number of Observations: 445
#> Number of Groups: 17
# Figure 6.10
n_games <- tibble(game = seq(from = 0, to = 30, by = 0.1))
n_moves <- tibble(nmoves = predict(cognitive_growth_fit_A, n_games, level = 0))
n_games |>
bind_cols(n_moves) |>
ggplot(aes(x = game, y = nmoves)) +
geom_line() +
coord_cartesian(ylim = c(0, 25))
# Model B ---------------------------------------------------------------------
# Model fit
cognitive_growth_fit_B <- nlme(
nmoves ~ 1 + 19 / (1 + (g00 + u0) * exp(-(g10*game + u1*game))),
fixed = g00 + g10 ~ read,
random = u0 + u1 ~ 1 | id,
start = c(g00 = 12, 0, g10 = .2, 0),
data = mutate(
cognitive_growth,
read = reading_score - mean(reading_score)
)
)
# Table 6.6
summary(cognitive_growth_fit_B)
#> Nonlinear mixed-effects model fit by maximum likelihood
#> Model: nmoves ~ 1 + 19/(1 + (g00 + u0) * exp(-(g10 * game + u1 * game)))
#> Data: mutate(cognitive_growth, read = reading_score - mean(reading_score))
#> AIC BIC logLik
#> 2494.505 2527.29 -1239.253
#>
#> Random effects:
#> Formula: list(u0 ~ 1, u1 ~ 1)
#> Level: id
#> Structure: General positive-definite, Log-Cholesky parametrization
#> StdDev Corr
#> u0 4.0428445 u0
#> u1 0.0687585 1
#> Residual 3.7172681
#>
#> Fixed effects: g00 + g10 ~ read
#> Value Std.Error DF t-value p-value
#> g00.(Intercept) 15.462773 2.3429479 425 6.599708 0.0000
#> g00.read 8.741346 2.8298049 425 3.089028 0.0021
#> g10.(Intercept) 0.125456 0.0184091 425 6.814906 0.0000
#> g10.read 0.051436 0.0230951 425 2.227148 0.0265
#> Correlation:
#> g00.(I g00.rd g10.(I
#> g00.read 0.725
#> g10.(Intercept) 0.705 0.242
#> g10.read 0.135 0.630 0.030
#>
#> Standardized Within-Group Residuals:
#> Min Q1 Med Q3 Max
#> -2.82513633 -0.54193632 -0.09486528 0.64639888 3.54988788
#>
#> Number of Observations: 445
#> Number of Groups: 17
# Figure 6.10
n_games <- crossing(
game = seq(from = 0, to = 30, by = 0.1), read = c(-1.58, 1.58)
)
n_moves <- tibble(nmoves = predict(cognitive_growth_fit_B, n_games, level = 0))
n_games |>
bind_cols(n_moves) |>
mutate(read = factor(read)) |>
ggplot(aes(x = game, y = nmoves, colour = read)) +
geom_line() +
coord_cartesian(ylim = c(0, 25))
Created on 2023-05-27 with reprex v2.0.2
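One way to probe those boundary correlations (a quick sketch using nlme's `intervals()`, applied to the fit objects created above) is to look at the approximate confidence intervals on the variance-covariance parameters. A correlation pinned at 1 typically comes with an enormous interval, or the call fails outright, which is itself a hint that the random-effect structure is overparametrized for these data:

```r
# Approximate 95% intervals for the random-effect standard deviations and
# correlations; expect very wide intervals (or an error) when the estimated
# correlation sits on the boundary of the parameter space.
intervals(cognitive_growth_fit_A, which = "var-cov")
intervals(cognitive_growth_fit_B, which = "var-cov")
```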
`deriv()` approach
An alternative way to specify these models is with the `deriv()` approach, but again it seems these models are misspecified.
# remotes::install_github("mccarthy-m-g/alda")
library(alda)
library(dplyr)
library(tidyr)
library(ggplot2)
library(nlme)
# deriv() approach ------------------------------------------------------------
# Parameters:
# - g00: "intercept"
# - g10: "slope"
# Predictors:
# - t: time
change_trajectory <- deriv(
~ 1 + (19 / (1 + g00 * exp(-g10 * t))),
namevec = c("g00", "g10"),
function.arg = c("t", "g00", "g10")
)
# Model A ---------------------------------------------------------------------
# Model fit
cognitive_growth_fit_A <- nlme(
nmoves ~ change_trajectory(game, g00, g10),
fixed = g00 + g10 ~ 1,
random = g00 + g10 ~ 1 | id,
start = c(g00 = 12, g10 = .2),
data = cognitive_growth
)
# Table 6.6
summary(cognitive_growth_fit_A)
#> Nonlinear mixed-effects model fit by maximum likelihood
#> Model: nmoves ~ change_trajectory(game, g00, g10)
#> Data: cognitive_growth
#> AIC BIC logLik
#> 2496.188 2520.776 -1242.094
#>
#> Random effects:
#> Formula: list(g00 ~ 1, g10 ~ 1)
#> Level: id
#> Structure: General positive-definite, Log-Cholesky parametrization
#> StdDev Corr
#> g00 5.08924022 g00
#> g10 0.07370742 0.999
#> Residual 3.73618068
#>
#> Fixed effects: g00 + g10 ~ 1
#> Value Std.Error DF t-value p-value
#> g00 13.906389 1.7909643 427 7.764749 0
#> g10 0.123395 0.0191148 427 6.455451 0
#> Correlation:
#> g00
#> g10 0.862
#>
#> Standardized Within-Group Residuals:
#> Min Q1 Med Q3 Max
#> -2.83521810 -0.50639166 -0.09854295 0.58660648 3.48012223
#>
#> Number of Observations: 445
#> Number of Groups: 17
# Figure 6.10
n_games <- tibble(game = seq(from = 0, to = 30, by = 0.1))
n_moves <- tibble(nmoves = predict(cognitive_growth_fit_A, n_games, level = 0))
n_games |>
bind_cols(n_moves) |>
ggplot(aes(x = game, y = nmoves)) +
geom_line() +
coord_cartesian(ylim = c(0, 25))
# Model B ---------------------------------------------------------------------
# Model fit
cognitive_growth_fit_B <- nlme(
nmoves ~ change_trajectory(game, g00, g10),
fixed = g00 + g10 ~ read,
random = g00 + g10 ~ 1 | id,
start = c(g00 = 12, 0, g10 = .2, 0),
data = mutate(
cognitive_growth,
read = reading_score - mean(reading_score)
)
)
# Table 6.6
summary(cognitive_growth_fit_B)
#> Nonlinear mixed-effects model fit by maximum likelihood
#> Model: nmoves ~ change_trajectory(game, g00, g10)
#> Data: mutate(cognitive_growth, read = reading_score - mean(reading_score))
#> AIC BIC logLik
#> 2494.506 2527.29 -1239.253
#>
#> Random effects:
#> Formula: list(g00 ~ 1, g10 ~ 1)
#> Level: id
#> Structure: General positive-definite, Log-Cholesky parametrization
#> StdDev Corr
#> g00.(Intercept) 4.04301232 g00.(I
#> g10.(Intercept) 0.06875899 1
#> Residual 3.71726759
#>
#> Fixed effects: g00 + g10 ~ read
#> Value Std.Error DF t-value p-value
#> g00.(Intercept) 15.462897 2.3429742 425 6.599687 0.0000
#> g00.read 8.741198 2.8299608 425 3.088805 0.0021
#> g10.(Intercept) 0.125456 0.0184091 425 6.814913 0.0000
#> g10.read 0.051436 0.0230954 425 2.227098 0.0265
#> Correlation:
#> g00.(I g00.rd g10.(I
#> g00.read 0.725
#> g10.(Intercept) 0.705 0.242
#> g10.read 0.135 0.630 0.030
#>
#> Standardized Within-Group Residuals:
#> Min Q1 Med Q3 Max
#> -2.82513976 -0.54192847 -0.09486709 0.64640084 3.54989292
#>
#> Number of Observations: 445
#> Number of Groups: 17
# Figure 6.10
n_games <- crossing(
game = seq(from = 0, to = 30, by = 0.1), read = c(-1.58, 1.58)
)
n_moves <- tibble(nmoves = predict(cognitive_growth_fit_B, n_games, level = 0))
n_games |>
bind_cols(n_moves) |>
mutate(read = factor(read)) |>
ggplot(aes(x = game, y = nmoves, colour = read)) +
geom_line() +
coord_cartesian(ylim = c(0, 25))
Created on 2023-05-27 with reprex v2.0.2
It's also possible to use one of the existing nonlinear equations in R, like `SSlogis()` (the SS stands for self-starting), which is defined as follows (according to its help page):
Asym / (1 + exp((xmid - input) / scal))
The estimates for this model are not similar to those on the website or in the textbook, but the figures are closer in shape.
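Part of that mismatch is just parametrization rather than a fitting problem. `SSlogis()` puts the parameters on `xmid` and `scal`, while the textbook writes the same curve in terms of $\Pi_{0i}$ and $\Pi_{1i}$; the two coincide when $\Pi_0 = e^{xmid/scal}$ and $\Pi_1 = 1/scal$. A quick sketch checking the algebra (the `xmid` and `scal` values are just the rounded Model A fixed effects from the fit below):

```r
# SSlogis form:  Asym / (1 + exp((xmid - input) / scal))
# Textbook form: Asym / (1 + pi0 * exp(-pi1 * input))
# They coincide when pi0 = exp(xmid / scal) and pi1 = 1 / scal.
Asym <- 19
xmid <- 21.57 # rounded g00 from the Model A fit below
scal <- 9.94  # rounded g10 from the Model A fit below
game <- seq(0, 30, by = 0.1)
all.equal(
  Asym / (1 + exp((xmid - game) / scal)),
  Asym / (1 + exp(xmid / scal) * exp(-(1 / scal) * game))
)
#> [1] TRUE
```

On that mapping the fitted values below translate to roughly $\Pi_0 \approx 8.8$ and $\Pi_1 \approx 0.10$, which is much closer to the other fits (though the random effects here are still on `xmid` and `scal`, so the models aren't equivalent).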
# remotes::install_github("mccarthy-m-g/alda")
library(alda)
library(dplyr)
library(tidyr)
library(ggplot2)
library(nlme)
# Model A ---------------------------------------------------------------------
# Model fit
cognitive_growth_fit_A <- nlme(
nmoves ~ 1 + SSlogis(game, 19, g00, g10),
fixed = g00 + g10 ~ 1,
random = g00 + g10 ~ 1 | id,
start = c(g00 = 12, g10 = .2),
data = groupedData(
nmoves ~ game | id,
data = cognitive_growth
)
)
# Table 6.6
summary(cognitive_growth_fit_A)
#> Nonlinear mixed-effects model fit by maximum likelihood
#> Model: nmoves ~ 1 + SSlogis(game, 19, g00, g10)
#> Data: groupedData(nmoves ~ game | id, data = cognitive_growth)
#> AIC BIC logLik
#> 2696.182 2720.771 -1342.091
#>
#> Random effects:
#> Formula: list(g00 ~ 1, g10 ~ 1)
#> Level: id
#> Structure: General positive-definite, Log-Cholesky parametrization
#> StdDev Corr
#> g00 3.035373e-05 g00
#> g10 1.647815e-04 0
#> Residual 4.938179e+00
#>
#> Fixed effects: g00 + g10 ~ 1
#> Value Std.Error DF t-value p-value
#> g00 21.569170 0.7172917 427 30.07029 0
#> g10 9.935048 0.9014383 427 11.02133 0
#> Correlation:
#> g00
#> g10 0.547
#>
#> Standardized Within-Group Residuals:
#> Min Q1 Med Q3 Max
#> -2.0318685 -0.6578587 -0.1092238 0.5023658 2.7055619
#>
#> Number of Observations: 445
#> Number of Groups: 17
# Figure 6.10
n_games <- tibble(game = seq(from = 0, to = 30, by = 0.1))
n_moves <- tibble(nmoves = predict(cognitive_growth_fit_A, n_games, level = 0))
n_games |>
bind_cols(n_moves) |>
ggplot(aes(x = game, y = nmoves)) +
geom_line() +
coord_cartesian(ylim = c(0, 25))
# Model B ---------------------------------------------------------------------
# Model fit
cognitive_growth_fit_B <- nlme(
nmoves ~ 1 + SSlogis(game, 19, g00, g10),
fixed = g00 + g10 ~ read,
random = g00 + g10 ~ 1 | id,
start = c(g00 = 12, 0, g10 = .2, 0),
data = groupedData(
nmoves ~ game | id,
data = mutate(
cognitive_growth,
read = reading_score - mean(reading_score)
),
outer = ~ read
)
)
# Table 6.6
summary(cognitive_growth_fit_B)
#> Nonlinear mixed-effects model fit by maximum likelihood
#> Model: nmoves ~ 1 + SSlogis(game, 19, g00, g10)
#> Data: groupedData(nmoves ~ game | id, data = mutate(cognitive_growth, read = reading_score - mean(reading_score)), outer = ~read)
#> AIC BIC logLik
#> 2684.403 2717.188 -1334.202
#>
#> Random effects:
#> Formula: list(g00 ~ 1, g10 ~ 1)
#> Level: id
#> Structure: General positive-definite, Log-Cholesky parametrization
#> StdDev Corr
#> g00.(Intercept) 2.976235e-05 g00.(I
#> g10.(Intercept) 1.588438e-04 0
#> Residual 4.851401e+00
#>
#> Fixed effects: g00 + g10 ~ read
#> Value Std.Error DF t-value p-value
#> g00.(Intercept) 21.885552 0.7700832 425 28.419725 0.0000
#> g00.read -2.836636 0.6893668 425 -4.114843 0.0000
#> g10.(Intercept) 10.033495 0.9459492 425 10.606801 0.0000
#> g10.read -1.841746 0.7936232 425 -2.320680 0.0208
#> Correlation:
#> g00.(I g00.rd g10.(I
#> g00.read -0.451
#> g10.(Intercept) 0.605 -0.369
#> g10.read -0.385 0.368 -0.504
#>
#> Standardized Within-Group Residuals:
#> Min Q1 Med Q3 Max
#> -2.2464622 -0.6404092 -0.1129613 0.5376651 2.6503384
#>
#> Number of Observations: 445
#> Number of Groups: 17
# Figure 6.10
n_games <- crossing(
game = seq(from = 0, to = 30, by = 0.1), read = c(-1.58, 1.58)
)
n_moves <- tibble(nmoves = predict(cognitive_growth_fit_B, n_games, level = 0))
n_games |>
bind_cols(n_moves) |>
mutate(read = factor(read)) |>
ggplot(aes(x = game, y = nmoves, colour = read)) +
geom_line() +
coord_cartesian(ylim = c(0, 25))
Created on 2023-05-27 with reprex v2.0.2
`nlmer()` "needs a great deal of love" (says Ben Bolker here and in other places), and isn't as mature as `nlme()`. I'm not sure if it's even possible to fit the website equations with it (since I don't know how to specify them in `deriv()`), but the textbook equations are possible.
# remotes::install_github("mccarthy-m-g/alda")
library(alda)
library(dplyr)
library(tidyr)
library(ggplot2)
library(lme4)
# Model A ---------------------------------------------------------------------
change_trajectory <- deriv(
~ 1 + (19 / (1 + g00 * exp(-g10 * t))),
namevec = c("g00", "g10"),
function.arg = c("t", "g00", "g10")
)
# Model fit
cognitive_growth_fit_A <- nlmer(
nmoves ~
change_trajectory(game, g00, g10) ~
g00 + g10 | id,
start = c(g00 = 12, g10 = .2),
data = cognitive_growth,
control = nlmerControl(
optimizer = "bobyqa"
)
)
# Table 6.6
summary(cognitive_growth_fit_A)
#> Warning in vcov.merMod(object, use.hessian = use.hessian): variance-covariance matrix computed from finite-difference Hessian is
#> not positive definite or contains NA values: falling back to var-cov estimated from RX
#> Warning in vcov.merMod(object, correlation = correlation, sigm = sig): variance-covariance matrix computed from finite-difference Hessian is
#> not positive definite or contains NA values: falling back to var-cov estimated from RX
#> Nonlinear mixed model fit by maximum likelihood ['nlmerMod']
#> Formula: nmoves ~ change_trajectory(game, g00, g10) ~ g00 + g10 | id
#> Data: cognitive_growth
#> Control: nlmerControl(optimizer = "bobyqa")
#>
#> AIC BIC logLik deviance df.resid
#> 2495.7 2520.3 -1241.9 2483.7 439
#>
#> Scaled residuals:
#> Min 1Q Median 3Q Max
#> -2.7955 -0.5442 -0.1227 0.5885 3.5151
#>
#> Random effects:
#> Groups Name Variance Std.Dev. Corr
#> id g00 18.782717 4.33390
#> g10 0.005294 0.07276 1.00
#> Residual 13.937331 3.73327
#> Number of obs: 445, groups: id, 17
#>
#> Fixed effects:
#> Estimate Std. Error t value
#> g00 12.29801 1.60434 7.665
#> g10 0.11755 0.01895 6.202
#>
#> Correlation of Fixed Effects:
#> g00
#> g10 0.850
# Figure 6.10
# The predict() method for `nlmerMod` objects is broken (it just returns a
# dataframe repeating the fixed effects estimates for the same number of rows
# as `newdata`), so make predictions manually.
g00_fit <- fixef(cognitive_growth_fit_A)["g00"]
g10_fit <- fixef(cognitive_growth_fit_A)["g10"]
n_games <- tibble(
game = seq(from = 0, to = 30, by = 0.1),
nmoves = 1 + (19 / (1 + g00_fit * exp(-g10_fit * game)))
)
ggplot(n_games, aes(x = game, y = nmoves)) +
geom_line() +
coord_cartesian(ylim = c(0, 25))
# Model B ---------------------------------------------------------------------
# The only way to add fixed covariates in nlmer() seems to be by writing a new
# version of the change trajectory formula with the covariates included.
change_trajectory_2 <- deriv(
~ 1 + 19 / (1 + (g00 + g01*x) * exp(-(g10 + g11*x) * t)),
namevec = c("g00", "g01", "g10", "g11"),
function.arg = c("t", "x", "g00", "g01", "g10", "g11")
)
# Model fit
cognitive_growth_fit_B <- nlmer(
nmoves ~
change_trajectory_2(game, reading_score, g00, g01, g10, g11) ~
g00 + g10 | id,
start = c(g00 = 12, g01 = -.4, g10 = .12, g11 = .04),
data = mutate(
cognitive_growth,
reading_score = reading_score - mean(reading_score)
),
control = nlmerControl(
optimizer = "bobyqa"
)
)
# Table 6.6
summary(cognitive_growth_fit_B)
#> Warning in vcov.merMod(object, use.hessian = use.hessian): variance-covariance matrix computed from finite-difference Hessian is
#> not positive definite or contains NA values: falling back to var-cov estimated from RX
#> Warning in vcov.merMod(object, use.hessian = use.hessian): variance-covariance matrix computed from finite-difference Hessian is
#> not positive definite or contains NA values: falling back to var-cov estimated from RX
#> Nonlinear mixed model fit by maximum likelihood ['nlmerMod']
#> Formula: nmoves ~ change_trajectory_2(game, reading_score, g00, g01, g10,
#> g11) ~ g00 + g10 | id
#> Data:
#> mutate(cognitive_growth, reading_score = reading_score - mean(reading_score))
#> Control: nlmerControl(optimizer = "bobyqa")
#>
#> AIC BIC logLik deviance df.resid
#> 2494.8 2527.6 -1239.4 2478.8 437
#>
#> Scaled residuals:
#> Min 1Q Median 3Q Max
#> -2.7708 -0.5609 -0.1112 0.6179 3.5870
#>
#> Random effects:
#> Groups Name Variance Std.Dev. Corr
#> id g00 8.547544 2.92362
#> g10 0.003988 0.06315 1.00
#> Residual 13.854616 3.72218
#> Number of obs: 445, groups: id, 17
#>
#> Fixed effects:
#> Estimate Std. Error t value
#> g00 14.11790 2.27007 6.219
#> g01 8.09201 2.69356 3.004
#> g10 0.12297 0.01744 7.050
#> g11 0.04909 0.02140 2.294
#>
#> Correlation of Fixed Effects:
#> g00 g01 g10
#> g01 0.786
#> g10 0.676 0.298
#> g11 0.150 0.560 0.039
# Figure 6.10
n_games <- crossing(
game = seq(from = 0, to = 30, by = 0.1), read = c(-1.58, 1.58)
)
g00fit <- fixef(cognitive_growth_fit_B)["g00"]
g01fit <- fixef(cognitive_growth_fit_B)["g01"]
g10fit <- fixef(cognitive_growth_fit_B)["g10"]
g11fit <- fixef(cognitive_growth_fit_B)["g11"]
n_games <- n_games |>
mutate(
nmoves =
1 +
19 / (1 + (g00fit + g01fit*read) * exp(-(g10fit + g11fit*read) * game))
)
n_games |>
mutate(read = factor(read)) |>
ggplot(aes(x = game, y = nmoves, colour = read)) +
geom_line() +
coord_cartesian(ylim = c(0, 25))
Created on 2023-05-28 with reprex v2.0.2
`deriv()` approach for website equations
Okay, I figured this one out, but only for `nlme()`.
# remotes::install_github("mccarthy-m-g/alda")
library(alda)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(tidyr)
library(ggplot2)
library(nlme)
#>
#> Attaching package: 'nlme'
#> The following object is masked from 'package:dplyr':
#>
#> collapse
# Model A ---------------------------------------------------------------------
# You can add the "u"s in the equation as parameters for deriv(). This works for
# nlme() but I can't get it to work for nlmer().
change_trajectory <- deriv(
~ 1 + (19 / (1 + (g00 + u0) * exp(-((g10 + u1)*t)))),
namevec = c("g00", "g10", "u0", "u1"),
function.arg = c("t", "g00", "g10", "u0", "u1")
)
# Model fit
cognitive_growth_fit_A <- nlme(
nmoves ~ change_trajectory(game, g00, g10, u0, u1),
fixed = g00 + g10 ~ 1,
random = u0 + u1 ~ 1 | id,
start = c(g00 = 12, g10 = .2),
data = cognitive_growth
)
# Table 6.6
summary(cognitive_growth_fit_A)
#> Nonlinear mixed-effects model fit by maximum likelihood
#> Model: nmoves ~ change_trajectory(game, g00, g10, u0, u1)
#> Data: cognitive_growth
#> AIC BIC logLik
#> 2496.188 2520.776 -1242.094
#>
#> Random effects:
#> Formula: list(u0 ~ 1, u1 ~ 1)
#> Level: id
#> Structure: General positive-definite, Log-Cholesky parametrization
#> StdDev Corr
#> u0 5.08924022 u0
#> u1 0.07370742 0.999
#> Residual 3.73618068
#>
#> Fixed effects: g00 + g10 ~ 1
#> Value Std.Error DF t-value p-value
#> g00 13.906389 1.7909643 427 7.764749 0
#> g10 0.123395 0.0191148 427 6.455451 0
#> Correlation:
#> g00
#> g10 0.862
#>
#> Standardized Within-Group Residuals:
#> Min Q1 Med Q3 Max
#> -2.83521810 -0.50639166 -0.09854295 0.58660648 3.48012223
#>
#> Number of Observations: 445
#> Number of Groups: 17
# Figure 6.10
n_games <- tibble(game = seq(from = 0, to = 30, by = 0.1))
n_moves <- tibble(nmoves = predict(cognitive_growth_fit_A, n_games, level = 0))
n_games |>
bind_cols(n_moves) |>
ggplot(aes(x = game, y = nmoves)) +
geom_line() +
coord_cartesian(ylim = c(0, 25))
# Model B ---------------------------------------------------------------------
change_trajectory_2 <- deriv(
~ 1 + 19 / (1 + g00 * exp(-(g10*t + g11*x*t + u1*t + g01*x + u0))),
namevec = c("g00", "g01", "g10", "g11", "u0", "u1"),
function.arg = c("t", "x", "g00", "g01", "g10", "g11", "u0", "u1")
)
# Model fit
cognitive_growth_fit_B <- nlme(
nmoves ~ change_trajectory_2(game, read, g00, g01, g10, g11, u0, u1),
fixed = g00 + g01 + g10 + g11 ~ 1,
random = u0 + u1 ~ 1 | id,
start = c(g00 = 12, g01 = 0, g10 = .2, g11 = 0),
data = mutate(
cognitive_growth,
read = reading_score - mean(reading_score)
)
)
# Table 6.6
summary(cognitive_growth_fit_B)
#> Nonlinear mixed-effects model fit by maximum likelihood
#> Model: nmoves ~ change_trajectory_2(game, read, g00, g01, g10, g11, u0, u1)
#> Data: mutate(cognitive_growth, read = reading_score - mean(reading_score))
#> AIC BIC logLik
#> 2495.652 2528.436 -1239.826
#>
#> Random effects:
#> Formula: list(u0 ~ 1, u1 ~ 1)
#> Level: id
#> Structure: General positive-definite, Log-Cholesky parametrization
#> StdDev Corr
#> u0 0.61431033 u0
#> u1 0.06864453 -0.783
#> Residual 3.68126991
#>
#> Fixed effects: g00 + g01 + g10 + g11 ~ 1
#> Value Std.Error DF t-value p-value
#> g00 10.784587 2.2457577 425 4.802204 0.0000
#> g01 -0.331570 0.2724959 425 -1.216790 0.2244
#> g10 0.113179 0.0187545 425 6.034764 0.0000
#> g11 0.036792 0.0245776 425 1.496980 0.1351
#> Correlation:
#> g00 g01 g10
#> g01 -0.047
#> g10 0.793 -0.029
#> g11 0.030 -0.796 0.018
#>
#> Standardized Within-Group Residuals:
#> Min Q1 Med Q3 Max
#> -2.5946895 -0.5869680 -0.1253349 0.5680505 3.5760012
#>
#> Number of Observations: 445
#> Number of Groups: 17
# Figure 6.10
n_games <- crossing(
game = seq(from = 0, to = 30, by = 0.1), read = c(-1.58, 1.58)
)
n_moves <- tibble(nmoves = predict(cognitive_growth_fit_B, n_games, level = 0))
n_games |>
bind_cols(n_moves) |>
mutate(read = factor(read)) |>
ggplot(aes(x = game, y = nmoves, colour = read)) +
geom_line() +
coord_cartesian(ylim = c(0, 25))
Created on 2023-05-28 with reprex v2.0.2
This still fails to replicate the textbook results, but ideally I'd like the final solution's code to be structured like this: I don't want to specify a separate nonlinear formula for each model, as in some of the examples above.
library(alda)
library(nlme)
library(tidyverse)
# Fit models ------------------------------------------------------------------
logistic_trajectory <- deriv(
~ 1 + (19 / (1 + pi0 * exp(-(pi1 * time)))),
namevec = c("pi0", "pi1"),
function.arg = c("time", "pi0", "pi1")
)
cognitive_growth_fit_A <- nlme(
nmoves ~ logistic_trajectory(game, pi0, pi1),
fixed = pi0 + pi1 ~ 1,
random = pi0 + pi1 ~ 1,
groups = ~ id,
start = list(fixed = c(pi0 = 13, pi1 = .12)),
data = cognitive_growth
)
# The estimate for `pi0.I(reading_score - 1.95625)` is completely off. The other
# estimates are in the right ballpark.
cognitive_growth_fit_B <- update(
cognitive_growth_fit_A,
fixed = pi0 + pi1 ~ 1 + I(reading_score - 1.95625),
start = list(fixed = c(13, -.4, .12, .04))
)
cognitive_growth_fits <- list(
"Model A" = cognitive_growth_fit_A,
"Model B" = cognitive_growth_fit_B
)
cognitive_growth_fits
#> $`Model A`
#> Nonlinear mixed-effects model fit by maximum likelihood
#> Model: nmoves ~ logistic_trajectory(game, pi0, pi1)
#> Data: cognitive_growth
#> Log-likelihood: -1242.796
#> Fixed: pi0 + pi1 ~ 1
#> pi0 pi1
#> 10.7756035 0.1085348
#>
#> Random effects:
#> Formula: list(pi0 ~ 1, pi1 ~ 1)
#> Level: id
#> Structure: General positive-definite, Log-Cholesky parametrization
#> StdDev Corr
#> pi0 5.16661047 pi0
#> pi1 0.07105924 0.822
#> Residual 3.69913920
#>
#> Number of Observations: 445
#> Number of Groups: 17
#>
#> $`Model B`
#> Nonlinear mixed-effects model fit by maximum likelihood
#> Model: nmoves ~ logistic_trajectory(game, pi0, pi1)
#> Data: cognitive_growth
#> Log-likelihood: -1239.253
#> Fixed: pi0 + pi1 ~ 1 + I(reading_score - 1.95625)
#> pi0.(Intercept) pi0.I(reading_score - 1.95625)
#> 15.45639281 8.74106714
#> pi1.(Intercept) pi1.I(reading_score - 1.95625)
#> 0.12541682 0.05143399
#>
#> Random effects:
#> Formula: list(pi0 ~ 1, pi1 ~ 1)
#> Level: id
#> Structure: General positive-definite, Log-Cholesky parametrization
#> StdDev Corr
#> pi0.(Intercept) 4.04344562 p0.(I)
#> pi1.(Intercept) 0.06876041 1
#> Residual 3.71726953
#>
#> Number of Observations: 445
#> Number of Groups: 17
# Plot prototypical trajectories ----------------------------------------------
prototypical_cognitive_growth <- cognitive_growth_fits |>
map2(
list(
tibble(game = seq(from = 0, to = 30, by = 0.1)),
crossing(game = seq(from = 0, to = 30, by = 0.1), reading_score = c(-1.58, 1.58))
),
\(.fit, .df) {
.df |>
mutate(nmoves = predict(.fit, newdata = .df, level = 0))
}
) |>
list_rbind(names_to = "model") |>
mutate(reading_score = factor(reading_score, labels = c("low", "high")))
# Figure 6.10, page 232:
ggplot(prototypical_cognitive_growth, aes(x = game, y = nmoves)) +
geom_line(aes(colour = reading_score)) +
scale_color_viridis_d(
option = "G", begin = .4, end = .7, na.value = "black"
) +
coord_cartesian(ylim = c(0, 25)) +
facet_wrap(vars(model))
Created on 2024-05-14 with reprex v2.0.2
For the time being, I've settled on a solution I'm content with for Chapter 6 in PR #21 (although if anyone knowledgeable is reading this, I'd still like some insight into why the website equation/model differs from the textbook). It still doesn't match the textbook examples, but from my understanding it does at least match the equations Singer and Willett use to describe the models in the textbook. From Fox and Weisberg in Nonlinear Regression, Nonlinear Least Squares, and Nonlinear Mixed Models in R:
“Unlike for linear mixed models fit by lme(), the structure of the model is specified hierarchically. The first (formula) argument is expressed in terms of patient-specific coefficients and is similar to the formula for a nonlinear regression model fit by nls() (see Section ??). The fixed argument specifies relationships between the subject-specific coefficients and subject-level characteristics, here duration. The random argument specifies the random-effect structure of the model, which is here just a random error associated with each subject-specific coefficient, allowing these coefficients to vary by subject”
So the nlme code from the previous comment should be doing what I think it's doing.
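As a minimal sketch of that three-part structure applied to these data (this just re-expresses Model B from the comments above with annotations; `read` is the mean-centred reading score):

```r
# Annotated skeleton of the hierarchical nlme() specification Fox and
# Weisberg describe, using Model B from the previous comments:
fit_B <- nlme(
  # (1) Level-1 model, written in terms of child-specific coefficients
  #     pi0 and pi1, like an nls() formula:
  nmoves ~ 1 + 19 / (1 + pi0 * exp(-pi1 * game)),
  # (2) Level-2 model: how the child-specific coefficients relate to the
  #     child-level covariate read:
  fixed = pi0 + pi1 ~ read,
  # (3) Random-effect structure: a child-specific deviation for each
  #     coefficient:
  random = pi0 + pi1 ~ 1 | id,
  start = c(13, -.4, .12, .04),
  data = mutate(cognitive_growth, read = reading_score - mean(reading_score))
)
```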
However, if I do end up changing my mind about the current solution, the textbook results based on the website equation can be replicated in a relatively clean way with this code, where the deriv()
function uses the "maximal" equation, and then terms that are unused in earlier models just get zeroed out.
library(alda)
library(nlme)
library(tidyverse)
# Fit models ------------------------------------------------------------------
logistic_function <- deriv(
~ 1 + 19 / (1 + g00 * exp(-(g10*t + g11*x*t + u1*t + g01*x + u0))),
namevec = c("g00", "g01", "g10", "g11", "u0", "u1"),
function.arg = c("t", "x", "g00", "g01", "g10", "g11", "u0", "u1")
)
cognitive_growth_fit_A <- nlme(
nmoves ~ logistic_function(game, 0, g00, 0, g10, 0, u0, u1),
fixed = g00 + g10 ~ 1,
random = u0 + u1 ~ 1,
groups = ~ id,
start = c(g00 = 13, g10 = .12),
data = cognitive_growth
)
# Model fit
cognitive_growth_fit_B <- nlme(
nmoves ~ logistic_function(
game, reading_score, g00, g01, g10, g11, u0, u1
),
fixed = g00 + g01 + g10 + g11 ~ 1,
random = u0 + u1 ~ 1,
groups = ~ id,
start = c(13, -.4, .12, .04),
data = mutate(cognitive_growth, reading_score = reading_score - 1.95625)
)
cognitive_growth_fits <- list(
"Model A" = cognitive_growth_fit_A,
"Model B" = cognitive_growth_fit_B
)
cognitive_growth_fits
#> $`Model A`
#> Nonlinear mixed-effects model fit by maximum likelihood
#> Model: nmoves ~ logistic_function(game, 0, g00, 0, g10, 0, u0, u1)
#> Data: cognitive_growth
#> Log-likelihood: -1240.846
#> Fixed: g00 + g10 ~ 1
#> g00 g10
#> 10.7405662 0.1130287
#>
#> Random effects:
#> Formula: list(u0 ~ 1, u1 ~ 1)
#> Level: id
#> Structure: General positive-definite, Log-Cholesky parametrization
#> StdDev Corr
#> u0 0.67805949 u0
#> u1 0.07502318 -0.821
#> Residual 3.67947153
#>
#> Number of Observations: 445
#> Number of Groups: 17
#>
#> $`Model B`
#> Nonlinear mixed-effects model fit by maximum likelihood
#> Model: nmoves ~ logistic_function(game, reading_score, g00, g01, g10, g11, u0, u1)
#> Data: mutate(cognitive_growth, reading_score = reading_score - 1.95625)
#> Log-likelihood: -1239.827
#> Fixed: g00 + g01 + g10 + g11 ~ 1
#> g00 g01 g10 g11
#> 10.77924990 -0.33145067 0.11313512 0.03678768
#>
#> Random effects:
#> Formula: list(u0 ~ 1, u1 ~ 1)
#> Level: id
#> Structure: General positive-definite, Log-Cholesky parametrization
#> StdDev Corr
#> u0 0.61419779 u0
#> u1 0.06863973 -0.783
#> Residual 3.68128645
#>
#> Number of Observations: 445
#> Number of Groups: 17
# Plot prototypical trajectories ----------------------------------------------
prototypical_cognitive_growth <- cognitive_growth_fits |>
map2(
list(
tibble(game = seq(from = 0, to = 30, by = 0.1)),
crossing(game = seq(from = 0, to = 30, by = 0.1), reading_score = c(-1.58, 1.58))
),
\(.fit, .df) {
.df |>
mutate(nmoves = predict(.fit, newdata = .df, level = 0))
}
) |>
list_rbind(names_to = "model") |>
mutate(reading_score = factor(reading_score, labels = c("low", "high")))
# Figure 6.10, page 232:
ggplot(prototypical_cognitive_growth, aes(x = game, y = nmoves)) +
geom_line(aes(colour = reading_score)) +
scale_color_viridis_d(
option = "G", begin = .4, end = .7, na.value = "black"
) +
coord_cartesian(ylim = c(0, 25)) +
facet_wrap(vars(model))
Created on 2024-05-17 with reprex v2.0.2
Problem
I need help specifying the mixed logistic growth models for the `alda::cognitive_growth` experiment covered in Chapter 6, section 6.4, with `nlme::nlme()` and/or `lme4::nlmer()`. I'm not sure if I'm getting tripped up on the syntax, the math, or both. Singer and Willett describe Model A and Model B as follows.
Model A
We adopt the following logistic level-1 function as the hypothesized individual change trajectory for the `alda::cognitive_growth` experiment:

$$Y_{ij} = 1 + \frac{19}{1 + \Pi_{0i} e^{-\Pi_{1i}\mathrm{GAME}_{ij}}} + \epsilon_{ij}$$

where $Y_{ij}$ represents the number of moves child $i$ makes (`nmoves`) prior to a fatal error in game $j$ (`game`); $\Pi_{0i}$ and $\Pi_{1i}$ are individual growth parameters related to the "intercept" and "slope", respectively; and $\epsilon_{ij} \sim N(0, \sigma^2_\epsilon)$. This model invokes two constraints: (1) all children have identical lower and upper asymptotes; and (2) these asymptotes are set to specific values, 1 and 20, which are the minimum and maximum values of `nmoves`.
And the following linear level-2 model for variation in the individual growth parameters across children:

$$\Pi_{0i} = \gamma_{00} + \zeta_{0i}, \qquad \Pi_{1i} = \gamma_{10} + \zeta_{1i}$$

where

$$\begin{bmatrix} \zeta_{0i} \\ \zeta_{1i} \end{bmatrix} \sim N\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma^2_0 & \sigma_{10} \\ \sigma_{10} & \sigma^2_1 \end{bmatrix}\right)$$

This model stipulates that the level-1 logistic individual growth parameters, $\Pi_{0i}$ and $\Pi_{1i}$, differ across children around unknown population average values, $\gamma_{00}$ and $\gamma_{10}$, with level-2 residuals $\zeta_{0i}$ and $\zeta_{1i}$ whose unknown population variances are $\sigma^2_0$ and $\sigma^2_1$ and whose covariance is $\sigma_{10}$. Note that the logistic function chosen here is typically referred to as a generalized logistic function or Richards curve.
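For reference, one common parametrization of the generalized logistic (Richards) curve is below (my own gloss, not from the textbook); the level-1 function above is the special case with lower asymptote $A = 1$, upper asymptote $K = 20$, and shape parameter $\nu = 1$:

$$Y(t) = A + \frac{K - A}{\left(1 + Q e^{-Bt}\right)^{1/\nu}}$$

which with $A = 1$, $K = 20$, $\nu = 1$, $Q = \Pi_{0i}$, and $B = \Pi_{1i}$ reduces to $1 + 19 / (1 + \Pi_{0i} e^{-\Pi_{1i} t})$.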
Table 6.6 of the textbook presents the results of fitting this model to the `alda::cognitive_growth` data. In Figure 6.10 they present a prototypical trajectory for the average child using predictions from this model.
Model B
We next ask whether we can predict variation in the individual growth parameters. We illustrate this process by asking whether the level-1 individual growth parameters, $\Pi_{0i}$ and $\Pi_{1i}$, differ by the children's `reading_score` on a standardized reading test. Model B postulates the following level-2 submodel:

$$\Pi_{0i} = \gamma_{00} + \gamma_{01}(\mathrm{READ}_i - \overline{\mathrm{READ}}) + \zeta_{0i}, \qquad \Pi_{1i} = \gamma_{10} + \gamma_{11}(\mathrm{READ}_i - \overline{\mathrm{READ}}) + \zeta_{1i}$$

where we: (1) center `reading_score` ($\mathrm{READ}$) on the sample mean to preserve the comparability of level-2 parameters across models; and (2) invoke the same assumptions about the level-2 residuals articulated in equation 6.9b.
Table 6.6 of the textbook presents the results of fitting this model to the `alda::cognitive_growth` data. In Figure 6.10 they present fitted trajectories for two prototypical children with reading scores two standard deviations above and below the sample mean, using predictions from this model.
Additional context
There is an example website for the book showing how to fit the example models in various programs. The models in the book were fit in SAS, and for the SAS code examples the estimates on the website line up with what's reported in the book (search "Table 6.6 on page 231" on the page to jump to the relevant section). For the nonlinear mixed models above, however, the person who wrote the examples fits what amounts to the following equation (writing the SAS code's "u" terms as $\zeta$s; this is the model the nlme translation in my first comment reproduces):

$$Y_{ij} = 1 + \frac{19}{1 + \gamma_{00} e^{-(\gamma_{10}\mathrm{GAME}_{ij} + \zeta_{1i}\mathrm{GAME}_{ij} + \zeta_{0i})}} + \epsilon_{ij}$$

However, this model isn't explained further on the website, and it isn't obvious to me where it came from, since it isn't in the book's errata either. I don't believe it is equivalent to the textbook equations.
Based on the SAS code for Model A, the "u" terms are the level-2 residuals $\zeta_{0i}$ and $\zeta_{1i}$; the "s" terms are the variances $\sigma^2_0$, $\sigma^2_1$, and $\sigma^2_\epsilon$; and the "c" term is the covariance $\sigma_{10}$. The main thing I don't understand is how the $u_0$ term, which I believe corresponds to $\zeta_{0i}$, ended up inside the `exp()` function instead of being added to $\gamma_{00}$. For Model B they provide analogous SAS code with the reading-score terms added.
This makes some sense if you apply the substitution rules for $\Pi_{1i}$ in the website equation (where the centred reading scores are $x$):

$$Y_{ij} = 1 + \frac{19}{1 + \gamma_{00} e^{-(\gamma_{10}\mathrm{GAME}_{ij} + \gamma_{11}x_i\mathrm{GAME}_{ij} + \zeta_{1i}\mathrm{GAME}_{ij} + \gamma_{01}x_i + \zeta_{0i})}} + \epsilon_{ij}$$

but it still isn't clear to me why the $u_0$ term is where it is, or why the $\gamma_{01}x$ term is now there in the same place too. My intuition is that these should be added to $\gamma_{00}$ in the equation after applying substitution, so they shouldn't end up inside the `exp()` function.
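One possible reading (my own algebra, not anything stated on the website): pulling the terms that don't involve $\mathrm{GAME}$ out of the exponent shows the website equation is the textbook model with a multiplicative, rather than additive, level-2 structure on the "intercept" parameter:

$$1 + \frac{19}{1 + \gamma_{00} e^{-(\gamma_{10}\mathrm{GAME} + \gamma_{11}x\mathrm{GAME} + \zeta_{1}\mathrm{GAME} + \gamma_{01}x + \zeta_{0})}} = 1 + \frac{19}{1 + \left(\gamma_{00} e^{-\gamma_{01}x - \zeta_{0}}\right) e^{-(\gamma_{10} + \gamma_{11}x + \zeta_{1})\mathrm{GAME}}}$$

That is, the website model effectively sets $\Pi_{0i} = \gamma_{00} e^{-\gamma_{01}x_i - \zeta_{0i}}$ while keeping $\Pi_{1i} = \gamma_{10} + \gamma_{11}x_i + \zeta_{1i}$, which might explain why its estimates are similar to, but not the same as, the textbook's.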