richarddmorey / BayesFactor

BayesFactor R package for Bayesian data analysis with common statistical models.
https://richarddmorey.github.io/BayesFactor/

issue with names of levels of random factor #129

Closed dmaxgomez closed 5 years ago

dmaxgomez commented 5 years ago

Hi, I had problems with anovaBF apparently caused by the names of the levels of the random factor.

I managed to work out the following minimal example:

toy <- data.frame(s=c("AC","BD","AC","BD"), t=c("t1","t1","t2","t2"), x=1:4)
toy$s <- as.factor(toy$s)
toy$t <- as.factor(toy$t)

In this case, calling anovaBF(x ~ t+s, whichRandom="s", data=toy) leads to NAs in the output and to the following error:

Error in rawToChar(as.raw(out)) : embedded nul in string: 's\0\002'

But if the random factor in the toy data is modified as follows:

toy$s <- as.factor( tolower( as.character( toy$s ) ) )

then anovaBF(x ~ t+s, whichRandom="s", data=toy) works smoothly.

I am using BayesFactor 0.9.12-4.2 on R 3.5.2.
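For reference, the error quoted above is what base R's rawToChar() raises when the byte vector it is given contains a zero (NUL) byte. The sketch below triggers the same kind of error without BayesFactor. The byte values are an assumption read off the quoted message ('s', NUL, 0x02), not taken from the package internals.

```r
# Base R only; does not call BayesFactor.
# Bytes assumed from the quoted message: 's' (0x73), NUL (0x00), 0x02.
bytes <- as.raw(c(0x73, 0x00, 0x02))

# rawToChar() refuses to build a string containing an embedded NUL,
# so capture the error message instead of stopping the script.
msg <- tryCatch(rawToChar(bytes), error = function(e) conditionMessage(e))
print(msg)

# The same call succeeds once the NUL byte is absent.
print(rawToChar(as.raw(0x73)))
```

So the NAs and the error are consistent with some internal string being rebuilt from bytes that include a zero.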

richarddmorey commented 5 years ago

Thanks for the report. I was able to reproduce it, so I'll look into what might be causing it.

richarddmorey commented 5 years ago

This seems to be associated with code that @jonathon-love added a while back; tagging him to see if he has any insight.

jonathon-love commented 5 years ago

hmm, surprised we haven't encountered this sooner.

so the issue is here:

https://github.com/richarddmorey/BayesFactor/blob/master/pkg/BayesFactor/R/nWayAOV-utility.R#L254-L261

i encode the names of the columns as base 64 to prevent R functions from munging them. however, the model.Matrix function here simply pastes the level name onto the column name, and the extra characters can break the base 64 decoding.
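A minimal sketch of that mechanism, using base R only. It makes two assumptions: that base R's model.matrix() names columns the same way as the function referred to above, and that the decoder is lenient about "=" padding. lenient_b64_decode() is a hypothetical stand-in written for this illustration, not BayesFactor's decoder, and "cw==" is simply the standard base64 encoding of "s".

```r
# 1. model.matrix() names columns as <term name><level name>.
toy  <- data.frame(s = factor(c("AC", "BD", "AC", "BD")), x = 1:4)
cols <- colnames(model.matrix(~ 0 + s, data = toy))
print(cols)

# 2. Hypothetical lenient decoder: drops '=' and decodes what remains.
b64_alphabet <- c(LETTERS, letters, 0:9, "+", "/")

lenient_b64_decode <- function(x) {
  chars <- strsplit(gsub("=", "", x, fixed = TRUE), "")[[1]]
  vals  <- match(chars, b64_alphabet) - 1L          # 6-bit values, 0..63
  # Expand each value into 6 bits, most significant bit first.
  bits  <- as.vector(sapply(vals, function(v) (v %/% 2^(5:0)) %% 2))
  nbyte <- length(bits) %/% 8                       # leftover bits are dropped
  bytes <- vapply(seq_len(nbyte), function(i) {
    sum(bits[(8 * i - 7):(8 * i)] * 2^(7:0))
  }, numeric(1))
  as.raw(bytes)
}

encoded_s <- "cw=="                                  # base64 of the byte 0x73
ok    <- lenient_b64_decode(encoded_s)               # just 's'
upper <- lenient_b64_decode(paste0(encoded_s, "AC")) # level "AC" appended
lower <- lenient_b64_decode(paste0(encoded_s, "ac")) # level "ac" appended
print(rawToChar(ok))
print(upper)
print(lower)
```

Under these assumptions, appending the upper-case level yields a zero byte (the letter "A" is value 0 in the base64 alphabet), while the lower-case level yields bytes that are wrong but non-zero. That would explain why only the upper-case names raise the error.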

does anything actually use these labels? the column names of the matrix? does the user ever see them?

if not, i don't need to worry about decoding them (or i could replace the names with sequential integers?).

jonathon
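One way the "sequential integers" idea could look, as a rough sketch and not the fix that was eventually applied: rename each factor to a fixed-width placeholder key, build the matrix, then map the key prefix of every column name back to the original factor name. The key format ("F001_") and all variable names below are hypothetical, and interaction columns are not handled.

```r
# Base R only; illustrative names throughout.
toy <- data.frame(s = factor(c("AC", "BD", "AC", "BD")),
                  t = factor(c("t1", "t1", "t2", "t2")),
                  x = 1:4)

orig_names <- c("s", "t")
keys <- sprintf("F%03d_", seq_along(orig_names))   # "F001_", "F002_"

# Work on a copy whose factor names are the safe placeholder keys.
safe <- toy[orig_names]
names(safe) <- keys
mm <- model.matrix(~ 0 + ., data = safe)

# Keys have a fixed width, so the prefix of each column name identifies
# the factor and the remainder is the level name, whatever it contains.
width    <- nchar(keys[1])
cn       <- colnames(mm)
restored <- paste0(orig_names[match(substr(cn, 1, width), keys)],
                   substring(cn, width + 1))
colnames(mm) <- restored
print(colnames(mm))
```

Because the key is split off by position and never decoded, the characters in the level names cannot corrupt it.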

richarddmorey commented 5 years ago

The user can see them, because they can request the matrix itself (it is used for various purposes).

jonathon-love commented 5 years ago

oh righto. i'll come up with a fix then.