easystats / bayestestR

:ghost: Utilities for analyzing Bayesian models and posterior distributions
https://easystats.github.io/bayestestR/
GNU General Public License v3.0
563 stars 55 forks source link

hdi() fails with brms model outputs #638

Closed poldrack closed 9 months ago

poldrack commented 9 months ago

Describe the bug hdi() fails with an error when applied to the outputs of brm().

To Reproduce Steps to reproduce the behaviour:

run the following code:

library(brms)
library(bayestestR)
df = data.frame(x = rnorm(1000), x2 = rnorm(1000)) %>%
  mutate(y = rnorm(1000) + x + x2 * .1)
model = brm(y ~ x + x2, data=df)
bayestestR::hdi(model)

This produces an error like the following:

ERROR while rich displaying an object: Error in dim(f_part) <- dim(part): dims [product 18] do not match the length of object [9]

Traceback:
1. tryCatch(withCallingHandlers({
 .     if (!mime %in% names(repr::mime2repr)) 
 .         stop("No repr_* for mimetype ", mime, " in repr::mime2repr")
 .     rpr <- repr::mime2repr[[mime]](obj)
 .     if (is.null(rpr)) 
 .         return(NULL)
 .     prepare_content(is.raw(rpr), rpr)
 . }, error = error_handler), error = outer_handler)
2. tryCatchList(expr, classes, parentenv, handlers)
3. tryCatchOne(expr, names, parentenv, handlers[[1L]])
4. doTryCatch(return(expr), name, parentenv, handler)
5. withCallingHandlers({
 .     if (!mime %in% names(repr::mime2repr)) 
 .         stop("No repr_* for mimetype ", mime, " in repr::mime2repr")
 .     rpr <- repr::mime2repr[[mime]](obj)
 .     if (is.null(rpr)) 
 .         return(NULL)
 .     prepare_content(is.raw(rpr), rpr)
 . }, error = error_handler)
6. repr::mime2repr[[mime]](obj)
7. repr_text.data.frame(obj)
8. ellip_limit_arr(obj, rows, cols)
9. arr_parts_format(parts)
10. structure(lapply(parts, arr_part_format), omit = attr(parts, 
  .     "omit"))
11. lapply(parts, arr_part_format)
12. FUN(X[[i]], ...)

A working example is provided at https://colab.research.google.com/drive/1XYgOX7Op-vNRDSobV7fWiMZlkK7IXE2z?usp=sharing

Expected behaviour Expected output is a standard table containing hdi() results

Specifications (please complete the following information):

This has been tested on both current release and development versions of brms and bayestest, on R 4.3.2 on linux.

sessionInfo() output for example above:

R version 4.3.2 (2023-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: Etc/UTC
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] bayestestR_0.13.1 brms_2.20.4       Rcpp_1.0.11       lubridate_1.9.3  
 [5] forcats_1.0.0     stringr_1.5.1     dplyr_1.1.4       purrr_1.0.2      
 [9] readr_2.1.4       tidyr_1.3.0       tibble_3.2.1      ggplot2_3.4.4    
[13] tidyverse_2.0.0  

loaded via a namespace (and not attached):
 [1] tidyselect_1.2.0     IRdisplay_1.1        farver_2.1.1        
 [4] loo_2.6.0            fastmap_1.1.1        tensorA_0.36.2      
 [7] shinystan_2.6.0      shinyjs_2.1.0        promises_1.2.1      
[10] digest_0.6.33        timechange_0.2.0     mime_0.12           
[13] lifecycle_1.0.4      StanHeaders_2.26.28  ellipsis_0.3.2      
[16] processx_3.8.2       magrittr_2.0.3       posterior_1.5.0     
[19] compiler_4.3.2       rlang_1.1.2          tools_4.3.2         
[22] igraph_1.5.1         utf8_1.2.4           prettyunits_1.2.0   
[25] bridgesampling_1.1-2 htmlwidgets_1.6.4    pkgbuild_1.4.2      
[28] plyr_1.8.9           repr_1.1.6           dygraphs_1.1.1.6    
[31] abind_1.4-5          miniUI_0.1.1.1       pbdZMQ_0.3-10       
[34] withr_2.5.2          stats4_4.3.2         grid_4.3.2          
[37] fansi_1.0.5          xts_0.13.1           inline_0.3.19       
[40] xtable_1.8-4         colorspace_2.1-0     scales_1.2.1        
[43] gtools_3.9.5         insight_0.19.7       cli_3.6.1           
[46] mvtnorm_1.2-4        crayon_1.5.2         generics_0.1.3      
[49] RcppParallel_5.1.7   reshape2_1.4.4       tzdb_0.4.0          
[52] rstan_2.32.3         shinythemes_1.2.0    bayesplot_1.10.0    
[55] parallel_4.3.2       matrixStats_1.1.0    base64enc_0.1-3     
[58] vctrs_0.6.4          Matrix_1.6-3         jsonlite_1.8.7      
[61] callr_3.7.3          hms_1.1.3            crosstalk_1.2.1     
[64] glue_1.6.2           codetools_0.2-19     ps_1.7.5            
[67] DT_0.30              distributional_0.3.2 stringi_1.8.2       
[70] gtable_0.3.4         QuickJSR_1.0.8       later_1.3.2         
[73] munsell_0.5.0        colourpicker_1.3.0   pillar_1.9.0        
[76] htmltools_0.5.7      Brobdingnag_1.2-9    IRkernel_1.3.2      
[79] R6_2.5.1             evaluate_0.23        shiny_1.8.0         
[82] lattice_0.22-5       markdown_1.12        backports_1.4.1     
[85] threejs_0.3.3        httpuv_1.6.13        rstantools_2.3.1.1  
[88] uuid_1.1-1           coda_0.19-4          gridExtra_2.3       
[91] nlme_3.1-163         checkmate_2.3.1      zoo_1.8-12          
[94] pkgconfig_2.0.3     
mattansb commented 9 months ago

That error seems weird - I cannot find any simliar call in any easystats package.

I also am unable to reproduce the example locally:

library(dplyr)
library(brms)
library(bayestestR)

df = data.frame(x = rnorm(1000), x2 = rnorm(1000)) %>%
  mutate(y = rnorm(1000) + x + x2 * .1)

model = brm(y ~ x + x2, data=df)

bayestestR::hdi(model)
#> Highest Density Interval 
#> 
#> Parameter   |       95% HDI
#> ---------------------------
#> (Intercept) | [-0.04, 0.09]
#> x           | [ 0.92, 1.05]
#> x2          | [ 0.07, 0.19]
strengejacke commented 9 months ago

Yeah, nothing of the error indicates that it could be an issue with the hdi() function.

strengejacke commented 9 months ago

Can you provide a reproducible example from your machine? (like, copy the code to clipboard and run reprex::reprex())

For me, it works fine:

library(brms)
library(bayestestR)
d <- data.frame(x = rnorm(1000), x2 = rnorm(1000))
d$y <- rnorm(1000) + d$x + d$x2 * 0.1
model = brm(y ~ x + x2, data = d)
bayestestR::hdi(model)
#> Highest Density Interval
#> 
#> Parameter   |       95% HDI
#> ---------------------------
#> (Intercept) | [-0.06, 0.06]
#> x           | [ 0.99, 1.12]
#> x2          | [ 0.02, 0.15]

Created on 2023-12-19 with reprex v2.0.2

poldrack commented 9 months ago

It appears that it is some kind of weird interaction with Jupyter. The code from @strengejacke example runs perfectly well in an R console or using Rscript but the same code gives the aforementoined error when run on the same machine within a Jupyter notebook cell or an R console window within Jupyter. I'll try to do some deeper comparisons to see if I can identify the source of the issue but interested in any thoughts.

mattansb commented 9 months ago

@poldrack can you try running the following, just to try and locate the error:

library(brms)

d <- data.frame(x = rnorm(1000), x2 = rnorm(1000))
d$y <- rnorm(1000) + d$x + d$x2 * 0.1

H1 <- bayestestR::hdi(d)
print(H1)

model = brm(y ~ x + x2, data = d)

H2 <- bayestestR::hdi(model)
print(H2)
poldrack commented 9 months ago

thanks, that code ran just fine, and helped isolate the problem: the error only occurs when bayestestR:hdi() is run within Jupyter without assigning the output to variable. Seems like this is a Jupyter problem rather than a problem with this function so I am fine closing this unless @strengejacke thinks there is something to be done here. thanks again for all your help!

strengejacke commented 9 months ago

Thank you for checking this more closely! I agree, it seems that nothing needs to be done from our side.

poldrack commented 9 months ago

for the record I should note that my previous diagnosis was not exactly right: it turns out that it has to do with trying to print the outputs from Jupyter without surrounding the in an explicit print() statement. Thus, print(bayestestR::hdi(model)) works just fine as well.

DominiqueMakowski commented 9 months ago

without surrounding the in an explicit print() statement. Thus, print(bayestestR::hdi(model)) works just fine as well

Hi Russ, this is quite weird indeed. It seems like there could be a clash of methods where jupyter doesn't call bayestestR's print method automatically but something else, maybe some wrapper over print() to nicely output stuff but which unfortunately clashes (for some non-obvious reasons given the error trace)