wviechtb / metafor

A meta-analysis package for R
https://www.metafor-project.org
226 stars 52 forks source link

Wrong weigths in the forest plot of the three-level meta-analysis conducted with rma.mv() function #52

Closed alsokh closed 2 years ago

alsokh commented 2 years ago

Classification:

Bug Report

Summary

The weights depicted in the forest plot of the three-level meta-analysis conducted using the rma.mv() function are incorrect. The weights revealed via the forest.rma() are diagonal weights of the variance-covariance matrix, which are similar to the regular model without clustering. Calculating the weighted mean of the effects based on the diagonal weights does not lead to the same result reported by the three-level model. But, when using the rowsum weights, we get an overall effect similar to the one reported by the three-level model and depicted by the forest plot.

Reproducible code

# here is a reproducible Code for conducting a three-level meta-analysis using rma.mv() function
# and drawing the forest plot.
# loading essential libraries
library(tidyverse)
library(metafor)

# building the dataframe
df <- data.frame(author = c("Geng 2019", "Kim 2018", "Kim 2018",  "Kim 2018", "Kim 2020",
                            "Kim 2020",  "Kim 2020",  "Lih 2016", "Lih 2016","Lih 2019",
                            "Lih 2019",  "Lih 2019",  "Lih 2019"),
                 yi = c(6.8881651, 0.8095802, 1.2276701, 0.6665888, 2.2901223,
                        3.1653921, 2.0759750, 0.9695670, 0.3003020, 0.8151049,
                        0.6450638, 0.6594766, 0.3751020), 
                 vi = c(1.98619588, 0.10819275, 0.23767934, 0.21110852, 0.33111651,
                        0.45049268, 0.30774181, 0.24833500, 0.22472726, 0.09025413,
                        0.08766778, 0.08786364, 0.08479897))

# to add the third level to the data as "study id"
df <- df %>% 
  mutate(es_id = row_number()) %>% 
  group_by(`author`) %>% 
  mutate(study_id = cur_group_id()) %>% 
  mutate(study_No = row_number())

# building the model
meta_model <- rma.mv(data = df, yi = yi, V = vi,
                     random = ~ 1 | study_id / es_id,
                     method = "REML")
# forest plot
forest.rma(meta_model, showweights = T, digits = 4, 
           slab = paste(df$author, ".",df$study_No))

# to check if the weighted mean with the depicted weights match the overall effect in the plot or not.
w_diag<- weights.rma.mv(meta_model, "diagonal")
weighted.mean(x = meta_model$yi, w = w_diag)
# the result shows these numbers are not equal (0.88 != 2.00)

w_rowsum <- weights.rma.mv(meta_model, "rowsum")
weighted.mean(x = meta_model$yi, w = w_rowsum)
# this weighted mean is equal to the overall effect demonstrating that the weights depicted in the 
# plot are not correct and should be row sum weights
# the diagonal weights are related to the regular random effects model, but not three-level model

Output

forest plot imgae


# diagonal weights
> w_diag<- weights.rma.mv(meta_model, "diagonal")
> w_diag
         1          2          3          4          5          6          7          8          9 
 0.2755947  8.2452046  5.8151941  6.3029766  3.5349209  2.9635868  3.6501036  3.8745954  3.8938949 
        10         11         12         13 
15.0572992 15.3559194 15.3329894 15.6977202 

# weighted mean based on diagonal weights
> weighted.mean(x = meta_model$yi, w = w_diag)
[1] 0.8807281
# this number is not equal to the overall effect!

# rowsum weights
> w_rowsum <- weights.rma.mv(meta_model, "rowsum")
> w_rowsum
        1         2         3         4         5         6         7         8         9        10 
15.010137 10.851992  4.939877  5.561627  7.493503  5.507798  8.062676 10.006018 11.057157  5.219367 
       11        12        13 
 5.373347  5.361369  5.555131 

# weighted mean based on rowsum weights
> weighted.mean(x = meta_model$yi, w = w_rowsum)
[1] 1.996449
# this number is equal to the overall effect reported in the forest plot.

Notes

When conducting the model via metagen() function in the meta package, it results in the proper weights.

sessionInfo()

R version 4.1.1 (2021-08-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    
system code page: 1256

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] metafor_3.0-2   Matrix_1.3-4    forcats_0.5.1   stringr_1.4.0   dplyr_1.0.5     purrr_0.3.4    
 [7] readr_1.4.0     tidyr_1.1.3     tibble_3.1.0    ggplot2_3.3.5   tidyverse_1.3.1

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.6         lubridate_1.7.10   lattice_0.20-44    assertthat_0.2.1   utf8_1.1.4        
 [6] R6_2.5.0           cellranger_1.1.0   backports_1.2.1    reprex_2.0.0       httr_1.4.2        
[11] pillar_1.6.0       rlang_0.4.10       readxl_1.3.1       performance_0.8.0  rstudioapi_0.13   
[16] minqa_1.2.4        nloptr_1.2.2.2     effectsize_0.5     mathjaxr_1.4-0     splines_4.1.1     
[21] lme4_1.1-26        statmod_1.4.35     munsell_0.5.0      broom_0.7.6        compiler_4.1.1    
[26] modelr_0.1.8       pkgconfig_2.0.3    parameters_0.15.0  insight_0.14.5     tidyselect_1.1.1  
[31] fansi_0.4.2        crayon_1.4.1       dbplyr_2.1.1       withr_2.4.2        MASS_7.3-54       
[36] grid_4.1.1         nlme_3.1-153       jsonlite_1.7.2     meta_5.0-1         gtable_0.3.0      
[41] lifecycle_1.0.0    DBI_1.1.1          magrittr_2.0.1     bayestestR_0.11.5  scales_1.1.1      
[46] datawizard_0.2.1   cli_2.5.0          stringi_1.5.3      fs_1.5.0           xml2_1.3.2        
[51] ellipsis_0.3.1     generics_0.1.0     vctrs_0.3.6        boot_1.3-28        tools_4.1.1       
[56] CompQuadForm_1.4.3 glue_1.4.2         hms_1.0.0          colorspace_2.0-0   rvest_1.0.0       
[61] haven_2.4.1     
wviechtb commented 2 years ago

This is not a bug but intentional behavior. For models without moderators, one can construct such 'row sum' weights such that the weighted mean is the one that the model provides. However, this does not generalize to other types of models. Therefore, for consistency across all models, the diagonal of the weight matrix is shown when drawing forest plots and addweights=TRUE. Note that this is also documented (see https://wviechtb.github.io/metafor/reference/forest.rma.html#details-1):

When `showweights=TRUE`, the annotations will include information about the
weights given to the observed outcomes during the model fitting. For simple
models (such as those fitted with the `rma.uni` function), these weights
correspond to the 'inverse-variance weights' (but are given in percent). For
models fitted with the `rma.mv` function, the weights are based on the
diagonal of the weight matrix. Note that the weighting structure is typically
more complex in such models (i.e., the weight matrix is usually not just a
diagonal matrix) and the weights shown therefore do not reflect this
complexity. See `weights.rma` for more details.

One can always just add the desired weights using the ilab argument to the forest plot if one wants to show different weights.

alsokh commented 2 years ago

Thank you very much for your fast and informative comments and explanations. Yes, it is not actually a bug, and I understood that the 'row sum' weights are only meaningful when there is no moderator and the model is univariable. But in case there is only an intercept in the model, is not it better to show the "row sum" weights in the forest plot instead of diagonal weights? And when the model also contains moderators and the 'showweights = T,' the error is being displayed as "the weights are not applicable or meaningful with this model, since the 'row sum' weights are not applicable and the diagonal weights are not representative." I have already edited the forest.rma() function code in my local repo, but I did not know whether I was right regarding this subject or not. I will appreciate hearing your comment.

wviechtb commented 2 years ago

It was a deliberate choice to implement the current behavior and changing this now would introduce a backwards incompatibility that I try to avoid as much as possible. Instead, I have now implemented the possibility to set showweights to a string, which can be rowsum if the model is an intercept-only 'rma.mv' model. So now you can do:

forest(meta_model, showweights = "rowsum", digits = 4, slab = paste(df$author, ".",df$study_No))

But the default (if showweights=TRUE) is to only use the diagonal, which is backwards compatible.

Sidenote: It is not good practice to call method functions directly. Use forest() and weights() and let R handle the dispatching to the appropriate method functions.

alsokh commented 2 years ago

Thank you very much for your response. So now we have 'rowsum' weights on the forest :)