Describe the bug I am trying to compute a Bayes factor for an intercept-only (random-effects) meta-analytic model that I ran in brms. I saved the model as an RDS file after running it on our compute cluster, loaded it into my local R environment with readRDS(), and then tried to use bayesfactor() or similar functions to compute statistics for the model.

Just to caveat, this is my first time using the package so I expect this is fully operator error, but I can't seem to figure out the issue. Looks like a great package!

To Reproduce Here's the code:

link to large (~270MB) RDS file, a brms object https://www.dropbox.com/s/7tmr6xf7syqya87/mod_norm_logtrans_trait_2randeff.rds?dl=0


int_mod <- readRDS("path2mod") #took out my personal path to model, attaching link to RDS file
int_mod # look at mod

bayesfactor_parameters(int_mod, null = c(0,0.5))



Expected behaviour Obviously I expect to get the normal output from those functions. Instead, the error I receive from all of them is: Error in make.names(vnames, unique = TRUE) : invalid multibyte string 36

Specifications (please complete the following information): Here is the session info:

> sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] bayestestR_0.9.0

mattansb commented 3 years ago

Cannot reproduce:


int_mod <- readRDS(choose.files()) #took out my personal path to model, attaching link to RDS file

#> Summary of Posterior Distribution 
#> Parameter   | Median |        95% CI |     pd |          ROPE | % in ROPE |  Rhat |      ESS
#> --------------------------------------------------------------------------------------------
#> (Intercept) |  -0.08 | [-0.77, 0.59] | 61.15% | [-0.10, 0.10] |    28.90% | 1.000 | 59822.00
#> Warning messages:
#> 1: Warning: Following potential variables could not be found in the data: grSpecies, cov = vcv_mat 
#> 2: Could not estimate a good default ROPE range. Using 'c(-0.1, 0.1)'. 

Can you try these two lines and report what you get?

samps <- insight::get_parameters(int_mod)

morgan-sparks commented 3 years ago

So when I run the insight:: chunk I get another, similar error (it was actually the exact error I was getting until I installed the development version of bayestestR; now I get string 36 instead of 21). Obviously, with the error, no object gets created to pass to describe_posterior().

samps <- insight::get_parameters(int_mod)
Error in make.names(vnames, unique = TRUE) : invalid multibyte string 21

I am assuming this issue must be something local on my machine?

bwiernik commented 3 years ago

Can you try this:

  1. Recode the Paper.Name column to be integers with as.integer(as.factor(Paper.Name)) or similar.
  2. Refit the model with the recoded Paper.Name.
  3. See if you get the same error.
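
A minimal sketch of step 1, assuming the data frame is called trait_dat as elsewhere in this thread; the example paper names are made up for illustration, and paper_ints is just a suggested column name:

```r
# Hypothetical stand-in for the real data; only Paper.Name matters here.
trait_dat <- data.frame(
  Paper.Name = c("Smith 2001", "Doe 2002", "Smith 2001"),
  stringsAsFactors = FALSE
)

# Each distinct paper name becomes a factor level, and each level an integer ID.
trait_dat$paper_ints <- as.integer(as.factor(trait_dat$Paper.Name))

# Rows from the same paper share the same ID.
print(trait_dat)
```

The integer column can then replace Paper.Name as the grouping variable in the model formula, so no free text ends up in the parameter names.
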
bwiernik commented 3 years ago

And if that does work, can you send the data file (just the Paper.Name column is fine) you are importing, as well as your script for importing it and a printout of the Paper.Name data.frame after you import it?

morgan-sparks commented 3 years ago

@bwiernik I am assuming this may be an issue with the names of the papers (some have weird symbols). Since my models take a silly long time on the cluster (days), would a reduced complexity model without nested effects, much shorter chains, etc. serve the same purpose?

morgan-sparks commented 3 years ago

@bwiernik I ran a much simplified model with this formula: log(temp.mn) | se(std_err) ~ 1 + (1|paper_ints). Note the paper_ints in the random effect, as you recommended. bayesfactor(), bayesfactor_parameters(), and describe_posterior() all seem to be working. See below; the model did not converge, so the output is basically meaningless other than as a test for the issue.

> describe_posterior(issue_mod)
Summary of Posterior Distribution 

Parameter   | Median |        95% CI |     pd |          ROPE | % in ROPE |  Rhat |  ESS
(Intercept) |  -0.35 | [-0.66, 0.80] | 70.72% | [-0.10, 0.10] |     3.33% | 4.639 | 2.00
Warning message:
Could not estimate a good default ROPE range. Using 'c(-0.1, 0.1)'. 
> bayesfactor(issue_mod, null = log(c(0, .1)))
Sampling priors, please wait...
Bayes Factor (Null-Interval) 

Parameter   |     BF
(Intercept) | > 1000

* Evidence Against The Null: [-Inf, -2.303]
> bayesfactor_parameters(issue_mod, null = log(c(0,0.5)))
Sampling priors, please wait...
Bayes Factor (Null-Interval) 

Parameter   |    BF
(Intercept) | 12.06

* Evidence Against The Null: [-Inf, -0.693]

bwiernik commented 3 years ago

Okay, yeah, that works for this purpose. Looks like an issue with the text in the long paper names. Can you do this part?

And if that does work, can you send the data file (just the Paper.Name column is fine) you are importing, as well as your script for importing it and a printout of the Paper.Name data.frame after you import it?

morgan-sparks commented 3 years ago

Got it. Attached is the original column in a .csv, and here is the read-in and print-out script. I also pasted the names as printed by R into the second column of the .csv file: issue_paper.names.csv

trait_dat <- read.csv(file.path(dir, "trait_level_data.csv"))


bwiernik commented 3 years ago

Okay, yeah, you've got some non-breaking spaces encoded in Latin-1 there (\xa0), even though the file was read in as UTF-8; those are tripping R up (R is pretty bad at handling messy text-encoding issues).

You can fix this by changing your input fileEncoding: trait_dat <- read.csv(file.path(dir, "trait_level_data.csv"), fileEncoding = "latin1")

or by searching and replacing every non-breaking space character (str_replace() would only replace the first one in each name): trait_dat$Paper.Name <- stringr::str_replace_all(trait_dat$Paper.Name, "\xa0", " ")

This is some pretty deep R stuff, so not really anything that can be done on the bayestestR side.
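
If it helps, here is a rough base-R sketch of the same cleanup applied to a column that has already been read in. The example names are made up, and I am assuming the only problem bytes are Latin-1 non-breaking spaces, as in the attached file:

```r
# Hypothetical example values; the second holds a raw Latin-1
# non-breaking space (byte 0xA0) between "Doe" and "2002".
paper_names <- c(
  "Smith 2001",
  rawToChar(as.raw(c(0x44, 0x6f, 0x65, 0xa0, 0x32, 0x30, 0x30, 0x32)))
)

# validUTF8() flags strings whose bytes are not valid UTF-8.
bad <- !validUTF8(paper_names)

# Re-interpret only the flagged strings as Latin-1 and convert them to UTF-8.
fixed <- paper_names
fixed[bad] <- iconv(paper_names[bad], from = "latin1", to = "UTF-8")

# Replace every non-breaking space (U+00A0) with a regular space.
fixed <- gsub("\u00a0", " ", fixed, fixed = TRUE)

# All names should now be valid UTF-8.
stopifnot(all(validUTF8(fixed)))
```

Running validUTF8() on the column first is also a quick way to see which paper names are affected before refitting anything.
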

morgan-sparks commented 3 years ago

@bwiernik Thanks very much for the help, I really appreciate it! Your factor to integer solution is a better option for my stuff anyway!

bwiernik commented 3 years ago

The approach I take with meta-analyses is to put the BibTeX/pandoc citation key for the paper from my Zotero library in my data sheet. That way, if I drop any cases in the analysis, I can use the reduced column to generate a bibliography.
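
Roughly, that looks like the sketch below. The column names cite_key and included and the keys themselves are made up for illustration; the result is a string you could paste into pandoc's nocite metadata field:

```r
# Hypothetical data sheet: one citation key per row, plus a flag for
# whether the row survived the exclusion criteria.
dat <- data.frame(
  cite_key = c("smith2001", "doe2002", "lee2010", "smith2001"),
  included = c(TRUE, FALSE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Keep each retained paper once, in order of first appearance.
kept_keys <- unique(dat$cite_key[dat$included])

# Build a pandoc-style citation list such as "@key1, @key2".
nocite_line <- paste0("@", kept_keys, collapse = ", ")
cat(nocite_line, "\n")
```
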

morgan-sparks commented 3 years ago

Ah, smart. This is my first one and I am learning the many ways I could have made this more efficient. But that's part of the process I suppose.