paul-buerkner / brms

brms R package for Bayesian generalized multivariate non-linear multilevel models using Stan
https://paul-buerkner.github.io/brms/
GNU General Public License v2.0

loo and kfold parallelization error when used with custom model families #1296

Closed ajnafa closed 2 years ago

ajnafa commented 2 years ago

I'm trying to perform k-fold cross validation on a beta binomial model fit using a brms custom model family.

While both loo and kfold work fine sequentially, specifying the cores argument or attempting to parallelize via future results in an error on both Windows 10 and Ubuntu saying the required Stan function could not be found.

I've uploaded a reproducible example of the issue here

paul-buerkner commented 2 years ago

Yes, this is a known issue for which I don't have a good solution at the moment, but I will keep thinking about what can be done.

ajnafa commented 2 years ago

While I wasn't able to fix the issue with Stan functions not being compatible with parallel computation, I have managed to come up with a working solution in the meantime by replacing the Stan functions with the rbbinom and dbbinom functions from the extraDistr package. Until there's a more permanent, natively supported solution, anyone else who runs into this issue can find the code that supports parallel computation here: https://github.com/ajnafa/threat-from-within/blob/main/scripts/families/beta_binomial.R
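For readers who want the gist without opening the linked script, here is a minimal sketch of what such a replacement could look like. It is not the code from the linked file. It assumes the family is named beta_binomial_custom (inferred from the Stan function name used later in this thread), that the distribution uses the mean/precision parameterization alpha = mu * phi and beta = (1 - mu) * phi, and that the number of trials was passed through vint(), so that it shows up as prep$data$vint1. Adjust all three to match your own model.

```r
# Sketch only: pure-R post-processing hooks for a custom beta-binomial family.
# brms looks these up by name as log_lik_<family> and posterior_predict_<family>.
# extraDistr is called with `::` inside each function so the hooks do not
# depend on any other object defined in the calling session.

log_lik_beta_binomial_custom <- function(i, prep) {
  mu <- brms::get_dpar(prep, "mu", i = i)    # one value per posterior draw
  phi <- brms::get_dpar(prep, "phi", i = i)
  trials <- prep$data$vint1[i]               # assumption: trials passed via vint()
  y <- prep$data$Y[i]
  extraDistr::dbbinom(y, size = trials,
                      alpha = mu * phi, beta = (1 - mu) * phi, log = TRUE)
}

posterior_predict_beta_binomial_custom <- function(i, prep, ...) {
  mu <- brms::get_dpar(prep, "mu", i = i)
  phi <- brms::get_dpar(prep, "phi", i = i)
  trials <- prep$data$vint1[i]
  # one simulated outcome per posterior draw
  extraDistr::rbbinom(length(mu), size = trials,
                      alpha = mu * phi, beta = (1 - mu) * phi)
}

# Sanity check of the assumed parameterization, independent of brms:
# the probabilities should sum to 1 and the mean should equal trials * mu.
mu <- 0.3; phi <- 5; trials <- 10
p <- extraDistr::dbbinom(0:trials, size = trials,
                         alpha = mu * phi, beta = (1 - mu) * phi)
stopifnot(isTRUE(all.equal(sum(p), 1)))
stopifnot(isTRUE(all.equal(sum(0:trials * p), trials * mu)))
```

Because these functions are ordinary R code, they no longer depend on compiled Stan functions being exported to the worker processes, which is presumably why this route works in parallel.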

paul-buerkner commented 2 years ago

Unfortunately, I cannot reproduce the error on my Mac. There, it seems to work fine. Can anybody reproduce this error, and does anyone have an idea how to fix it?

Edit: Indeed, with future and kfold it fails. Perhaps there is a fix for that.

paul-buerkner commented 2 years ago

Ok. So I have now seen the problem happen when using future, but I think this comes down to telling future where to search for all the required objects. I don't have an immediate fix for that, but it should at least be fixable.

paul-buerkner commented 2 years ago

I have added additional control over the execution of the futures via the argument future_args. For example, if future doesn't know it needs a specific object or function, we can inform future about it explicitly via the globals argument. Using your example, this looks as follows:

example_kfoldloo <- kfold(fit1, K = 5, chains = 1,
                          future_args = list(globals = "beta_binomial_custom_lpmf"))

Evaluation may still fail in some parallel cases and on some operating systems, especially when using exported Stan functions, but there is nothing I can do from the brms side to fix this. Let's see how this looks in new releases of rstan.
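To illustrate the globals mechanism in isolation, here is a small sketch that uses only the future package, with no brms or Stan involved. The function custom_lpmf is a hypothetical stand-in, not the exposed Stan function from this issue. It is referenced by name as a string, which is one plausible way for future's automatic detection of globals to miss an object; whether that is exactly what happens inside kfold is an assumption.

```r
library(future)

# Run futures in separate background R sessions, as in the parallel case.
plan(multisession, workers = 2)

# Hypothetical stand-in for a function that exists only in the main session.
custom_lpmf <- function(y, size, prob) {
  dbinom(y, size = size, prob = prob, log = TRUE)
}

# The function is referred to by a string, so static code inspection of the
# expression sees no symbol named custom_lpmf. Listing it in `globals`
# tells future to export it to the worker explicitly.
f <- future(
  do.call("custom_lpmf", list(3, 10, 0.5)),
  globals = "custom_lpmf"
)
res <- value(f)
print(res)

# Shut down the background workers again.
plan(sequential)
```

The future_args = list(globals = ...) argument shown above passes the same kind of information along when kfold creates its futures.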

Closing this issue as there is nothing more I can do from the brms side right now, I think.