I tried out both packages briefly last week on the FEV-1 dataset built into mmrm. Similar findings. I can share my code here at some point.
One comment about this:
The covariances look different, but maybe I missed something here.
This threw me at first also. The brms model for the residual variance involves a linear model for log(sd). The coefficients reported for sigma_age_fctXXX are summaries of the posterior for the log(residual sd) at these levels of the covariates. If you exponentiate and then square the medians, you'll probably find they are relatively close to the diagonal elements of the "covariance estimate" output from mmrm.
Thanks for the explanation, @andrew-bean! Then the variances are close too. The covariances generally appear to be a bit smaller with the brms.mmrm fit.
Variances and covariances from mmrm fit:

```r
mmrm_fit["cov"]
#> $cov
#>          08       10       12       14
#> 08 5.415458 2.716820 3.910231 2.710230
#> 10 2.716820 4.184773 2.927161 3.317160
#> 12 3.910231 2.927161 6.455743 4.130742
#> 14 2.710230 3.317160 4.130742 4.985741
```
Variances from brms.mmrm fit:

```r
summary(brms.mmrm_fit)[["fixed"]] |>
  as_tibble(rownames = NA) |>
  rownames_to_column("parameter") |>
  filter(str_detect(parameter, "sigma_")) |>
  select(parameter, Estimate) |>
  mutate(Variance = exp(Estimate)^2)
#> # A tibble: 4 × 3
#>   parameter        Estimate Variance
#>   <chr>               <dbl>    <dbl>
#> 1 sigma_age_fctX08    0.848     5.45
#> 2 sigma_age_fctX10    0.716     4.18
#> 3 sigma_age_fctX12    0.912     6.20
#> 4 sigma_age_fctX14    0.784     4.80
```
Covariances from brms.mmrm fit:

```r
summary(brms.mmrm_fit)[["cor_pars"]] |>
  as_tibble(rownames = NA) |>
  rownames_to_column("cortime") |>
  select(cortime, Estimate) |>
  mutate(Covariance = exp(Estimate)^2)
#> # A tibble: 6 × 3
#>   cortime          Estimate Covariance
#>   <chr>               <dbl>      <dbl>
#> 1 cortime(X08,X10)    0.449       2.46
#> 2 cortime(X08,X12)    0.548       2.99
#> 3 cortime(X10,X12)    0.423       2.33
#> 4 cortime(X08,X14)    0.383       2.15
#> 5 cortime(X10,X14)    0.624       3.48
#> 6 cortime(X12,X14)    0.621       3.46
```
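One caveat worth noting (an aside, not part of the original reprex): the cortime entries are correlations, so exponentiating and squaring them does not give a covariance on the original scale. A covariance is the correlation times the two residual SDs; a minimal sketch using the posterior medians printed above:

```r
# Rough back-of-the-envelope check using the medians shown above (these are
# medians of transformed draws, so the result is only approximate).
sd_08 <- exp(0.848)          # residual SD at age 8
sd_10 <- exp(0.716)          # residual SD at age 10
cor_08_10 <- 0.449           # cortime(X08,X10)
cor_08_10 * sd_08 * sd_10    # roughly 2.1, vs. 2.72 in the mmrm covariance matrix
```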
Thanks for your work on this, @chstock (and for your comments, @andrew-bean)!
A first thought: interesting choice of dataset. I think this is one where we really need the full flexibility of a fully parameterized covariance matrix. Toeplitz doesn't fit because cortime(X12,X14) = 0.621 seems to be meaningfully different from the other elements on its off-diagonal band (cortime(X10,X12) = 0.423 and cortime(X08,X10) = 0.449), and AR(1) would struggle even more because the next furthest off-diagonal band has even higher positive correlations (cortime(X10,X14) = 0.624 and cortime(X08,X12) = 0.548).
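For anyone who wants to check this directly, here is a rough sketch of fitting the alternative covariance structures with mmrm and comparing information criteria. The mean model formula and the column names (distance, age_fct, sex, subject, data) are placeholders for whatever the reprex uses:

```r
library(mmrm)

# Placeholder column names; the point is the covariance structure terms.
fit_us   <- mmrm(distance ~ age_fct * sex + us(age_fct | subject),   data = data)
fit_toep <- mmrm(distance ~ age_fct * sex + toep(age_fct | subject), data = data)
fit_ar1  <- mmrm(distance ~ age_fct * sex + ar1(age_fct | subject),  data = data)

# If the off-diagonal bands really differ, the fully parameterized (us)
# structure should be favored here:
sapply(list(us = fit_us, toep = fit_toep, ar1 = fit_ar1), AIC)
```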
- When role = "response" in brm_data(), I understood from the documentation that no level_baseline = ... is required, which makes sense to me. However, I received an error when trying level_baseline = NULL. (With effect_baseline = FALSE and effect_time = TRUE in brm_formula() the parameterization looked good.)
Sorry about the confusion. The error behavior is correct and the documentation is incorrect. When the outcome variable is the raw response, we need to identify a baseline level so we can produce marginal estimates of change from baseline. Should be fixed in 25bfd4e9a97099c3a463359151837ac36bbea6f1.
- Changing the reference level of the group variable through level_control in brm_data() did not seem to have an effect.
Yes, level_control and level_baseline only apply to downstream post-processing functions like brm_marginal_draws(). These arguments do not control the fixed effect parameterization that brms chooses based on the data and formula.
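For concreteness, here is a hedged sketch of where those arguments enter. Only role, level_control, and level_baseline appear in this thread; the remaining argument names, column names, and levels are assumptions about the brm_data() interface and the reprex data, and may differ by package version:

```r
# Hypothetical call; argument names other than role, level_control, and
# level_baseline are assumptions, as are the column names and levels.
data <- brm_data(
  data = orthodont_long,      # hypothetical analysis data set
  outcome = "distance",
  role = "response",
  group = "sex",
  time = "age_fct",
  patient = "subject",
  level_control = "female",   # consumed by brm_marginal_draws() and friends
  level_baseline = "08"       # likewise used only in post-processing
)
```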
- We might want to add to the documentation how the ordering and the reference level of the time variable (character or factor variable) are determined, and that appropriate values need to be chosen to obtain the desired parameterization.
Looking at https://github.com/paul-buerkner/brms/blob/ca12a94a39318c10cdebe96425dc578f9ad926ab/R/data-predictor.R#L977-L996, the fixed effect parameterization seems to come from a call to model.matrix() which is called from within brms::make_standata(). The X field of the output has the fixed effects model matrix.

For our MMRM models with continuous outcomes, the model.matrix() call has two arguments: the dataset and a special formula decorated with attributes which do not seem to affect the processing. brms does not seem to expose other arguments, but users can set the contrasts option. From ?options:
contrasts:
the default contrasts used in model fitting such as with aov or lm. A character vector of length two, the first giving the function to be used with unordered factors and the second the function to be used with ordered factors. By default the elements are named c("unordered", "ordered"), but the names are unused.
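A small sketch of both points. make_standata() and its X element are taken from the discussion above; the formula, data set, and column names are placeholders:

```r
library(brms)

# Inspect the fixed effects design matrix brms will use (placeholder data).
standata <- make_standata(distance ~ age_fct * sex, data = analysis_data)
colnames(standata$X)  # reveals the reference levels chosen for the factors

# The factor coding can only be changed globally through the contrasts option:
options(contrasts = c(unordered = "contr.sum", ordered = "contr.poly"))
standata_sum <- make_standata(distance ~ age_fct * sex, data = analysis_data)
colnames(standata_sum$X)  # now sum-to-zero coding for unordered factors
```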
1652e71e073f118f6d66177fe10a6b47b1a52ccb adds explanations in the Rd files and usage vignette.
I put your results together side by side and added the exp(sigma) correction in a reprex (expand "details" below to see all the code). I got these results:
```r
# A tibble: 18 × 7
   variable brms_estimate mmrm_estimate diff_estimate brms_se mmrm_se diff_se
   <chr> <num> <num> <num> <num> <num> <num>
 1 intercept 21.2 21.2 0.00157 0.733 0.702 0.0316
 2 age_fct10 1.03 1.05 -0.0142 0.725 0.615 0.110
 3 age_fct12 1.89 1.91 -0.0201 0.722 0.607 0.115
 4 age_fct14 2.88 2.91 -0.0251 0.786 0.673 0.113
 5 sexmale 1.68 1.69 -0.0119 0.951 0.911 0.0395
 6 age_fct10:sexmale -0.0874 -0.108 0.0206 0.928 0.799 0.129
 7 age_fct12:sexmale 0.944 0.935 0.00960 0.956 0.788 0.167
 8 age_fct14:sexmale 1.70 1.68 0.0163 1.02 0.874 0.145
 9 sigma_08 2.38 2.33 0.0542 0.357 NA NA
10 sigma_10 2.06 2.05 0.0104 0.286 NA NA
11 sigma_12 2.52 2.54 -0.0182 0.345 NA NA
12 sigma_14 2.21 2.23 -0.0241 0.302 NA NA
13 correlation_08_10 0.445 0.571 -0.126 0.152 NA NA
14 correlation_08_12 0.541 0.661 -0.120 0.139 NA NA
15 correlation_10_12 0.420 0.563 -0.143 0.150 NA NA
16 correlation_08_14 0.377 0.522 -0.145 0.157 NA NA
17 correlation_10_14 0.620 0.726 -0.107 0.117 NA NA
18 correlation_12_14 0.620 0.728 -0.108 0.118 NA NA
```
So yes, I agree the results look close overall. Also, relative to mmrm, brms.mmrm seems to systematically underestimate the correlations and overestimate the standard errors of fixed effects.
Any outstanding issues with this? Do you think it is a good time to start a vignette with these real data comparisons?
- When role = "response" in brm_data(), I understood from the documentation that no level_baseline = ... is required, which makes sense to me. However, I received an error when trying level_baseline = NULL. (With effect_baseline = FALSE and effect_time = TRUE in brm_formula() the parameterization looked good.)
Sorry about the confusion. The error behavior is correct and the documentation is incorrect. When the outcome variable is the raw response, we need to identify a baseline level so we can produce marginal estimates of change from baseline. Should be fixed in 25bfd4e.
Thanks very much, @wlandau! The documentation is clear now. Still, I wonder what, from a user perspective, would trigger a model of the 'raw response', apart from the convenience of the data already being available in that form (which isn't a good rationale). If I am interested in marginal estimates of change from baseline, wouldn't I then model 'change from baseline' directly? And if I decide to model 'raw response' longitudinally, am I then really interested in change from baseline? In my experience, in a clinical trial setting mostly 'change from baseline' is modeled. I would expect someone who decides to model 'raw response' to have an interest in the treatment effect and the marginal means of the response. If it is just an option to also obtain marginal means of the change from baseline, this is fine of course. Apologies in case I am missing something (technical) here.
Related: there is a general equivalence between both options w.r.t. treatment effect estimation, as mentioned e.g. in this EMA guideline [section 5.6]. From the guideline:
Note that when the baseline is included as a covariate in a standard linear model, the estimated treatment effects are identical for both ‘change from baseline’ (on an additive scale) and the ‘raw outcome’ analysis. Consequently if the appropriate adjustment is done, then the choice of endpoint becomes solely an issue of interpretability.
Maybe it would be helpful and not too prescriptive if we add a note to the documentation that adjustment for baseline values is generally recommended, and perhaps that both options, 'raw response' and 'change from baseline', when adjusted for the baseline yield the same treatment effect estimates (the latter to be verified).
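Regarding the "(to be verified)" part, a quick self-contained check with simulated data (base R only, nothing from the reprex) illustrates the equivalence: regressing either the raw post-baseline outcome or the change from baseline on the same baseline covariate gives identical treatment effect estimates, and only the baseline coefficient shifts by one:

```r
# Simulated toy data.
set.seed(1)
n <- 200
base <- rnorm(n, mean = 20, sd = 3)       # baseline measurement
trt  <- rep(0:1, each = n / 2)            # treatment indicator
post <- 0.6 * base + 2 * trt + rnorm(n)   # raw post-baseline outcome
chg  <- post - base                       # change from baseline

coef(lm(post ~ base + trt))["trt"]  # treatment effect from the raw-response model
coef(lm(chg  ~ base + trt))["trt"]  # identical estimate from the change model
```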
Any outstanding issues with this? Do you think it is a good time to start a vignette with these real data comparisons?
Sounds good to me. Let's perhaps align on Wednesday on datasets and (broadly) model specifications.
Still, I wonder what, from a user perspective, would trigger a model of the 'raw response', apart from the convenience of the data already being available in that form (which isn't a good rationale).
When modeling the raw response, the formula can be specified so that placebo and treatment are constrained to have the same mean at baseline. This extra assumption is reasonable in many trials, and it may improve precision in some cases. My colleagues use this kind of model and refer to it as a "cLDA" (constrained longitudinal data analysis) model.
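A rough sketch of what that constraint amounts to in design terms, using plain base R and placeholder column names rather than the package's brm_formula() interface: visit effects are shared across arms, and treatment indicators exist only at post-baseline visits, so both arms have the same modeled mean at baseline.

```r
# Placeholder data: analysis_data with a factor "visit" (reference level
# "baseline"), a factor "arm" with level "treatment", and outcome "response".
post_visits <- setdiff(levels(analysis_data$visit), "baseline")

# One treatment indicator per post-baseline visit; all are zero at baseline.
for (v in post_visits) {
  analysis_data[[paste0("trt_", v)]] <-
    as.integer(analysis_data$visit == v & analysis_data$arm == "treatment")
}

# Common visit means plus per-visit treatment effects only after baseline:
clda_formula <- reformulate(c("visit", paste0("trt_", post_visits)), response = "response")
```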
If I am interested in marginal estimates of change from baseline, wouldn't I then model 'change from baseline' directly?
Maybe, but treatment differences are contrasts as well, which still leaves us in a situation where we are modeling x and reporting contrasts of x.
And if I decide to model 'raw response' longitudinally, am I then really interested in change from baseline? In my experience, in a clinical trial setting mostly 'change from baseline' is modeled.
Yes, and this is what the Novartis vignettes recommend as well.
I would expect someone who decides to model 'raw response' to have an interest in the treatment effect and the marginal means of the response. If it is just an option to also obtain marginal means of the change from baseline, this is fine of course.
Good news is that the package supports all of the above. If raw response is modeled, then the package gets you marginals of the response, change from baseline, and treatment differences of change from baseline. If change from baseline is modeled, then you get marginals of change from baseline and treatment differences of change from baseline.
Maybe it would be helpful and not too prescriptive if we add a note to the documentation that adjustment for baseline values is generally recommended, and perhaps that both options, 'raw response' and 'change from baseline', when adjusted for the baseline yield the same treatment effect estimates (the latter to be verified).
I agree it is worth discussing how opinionated we want brms.mmrm to be, and which opinions it should have, even if it tries to make all the major options possible and convenient. I agree that baseline should be a covariate when change from baseline is the outcome variable, and citing the findings and recommendations of regulatory guidances (that EMA one in particular) will be helpful.
@wlandau, great answers, thanks!!
This issue can be closed.
I fitted MMRMs to the orthodont dataset using both mmrm::mmrm() and brms.mmrm::brm_model(). Apologies, the reprex has become a bit lengthy. Of note, the endpoint here is a response, not a chg variable. I noted a few things that are summarized below.

Prerequisites, Data, mmrm, brms.mmrm (collapsed code sections of the reprex)

Some observations and possible issues

- Changing the reference level of the group variable through level_control in brm_data() did not seem to have an effect.
- When role = "response" in brm_data(), I understood from the documentation that no level_baseline = ... is required, which makes sense to me. However, I received an error when trying level_baseline = NULL. (With effect_baseline = FALSE and effect_time = TRUE in brm_formula() the parameterization looked good.)

Perhaps we can briefly discuss this at the next meeting.