richarddmorey / BayesFactor

BayesFactor R package for Bayesian data analysis with common statistical models.
https://richarddmorey.github.io/BayesFactor/

generalTestBF() doesn't seem to always call callbacks #50

Open jonathon-love opened 9 years ago

jonathon-love commented 9 years ago

At a guess you don't call callbacks if there's only one independent variable?

If I provide an independent variable with lots of levels (~1000), then the analysis runs to finish without calling the callback once.

Is it possible to call it in this scenario?

Trying to accommodate users who might make a mistake.

with thanks

richarddmorey commented 9 years ago

In that case, the integration is done with a native R function (integrate()), so I can't pass a callback to that function. Do you just want me to call it before and after the integrate() call?
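A minimal sketch of that before-and-after approach, assuming a hypothetical wrapper named `integrate_with_progress()` and a callback that takes a single progress value between 0 and 1000. The names and signature are illustrative and are not BayesFactor's actual internals:

```r
# Hypothetical wrapper: report progress just before and just after a
# blocking stats::integrate() call. Not BayesFactor's actual code.
integrate_with_progress <- function(f, lower, upper,
                                    callback = function(progress) invisible(NULL),
                                    ...) {
  callback(0)                                       # about to start integrating
  result <- stats::integrate(f, lower, upper, ...)  # blocks until finished
  callback(1000)                                    # integration finished
  result
}

# Usage: record every progress value the wrapper reports.
seen <- numeric(0)
res <- integrate_with_progress(dnorm, -Inf, Inf,
                               callback = function(progress) seen <<- c(seen, progress))
```

Note that nothing is reported while `integrate()` itself is running, so this only helps if the call is a modest share of the total run time.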

jonathon-love commented 9 years ago

awesome! cheers

richarddmorey commented 9 years ago

Please test that and let me know if that solves your issue.

jonathon-love commented 9 years ago

so i guess it depends on what proportion of the total run time of generalTestBF() is spent inside this integrate function.

if it is closer to 50%, then this will likely work really well, but if it is closer to 0% or 100%, then it may not make any difference.

ballpark, what would you expect?

richarddmorey commented 9 years ago

It should be model/data dependent, but it should never spend very much time in that function. Do you have a case where it takes a long time?

jonathon-love commented 9 years ago

you've still got ssgo-fred.csv, yeah?

data <- read.csv("ssgo-fred.csv")
data$rt2 <- as.factor(data$rt2)
generalTestBF(rt ~ rt2, data)

richarddmorey commented 9 years ago

hmm, that's a different problem from the one I was thinking of. I'll re-open it.

richarddmorey commented 9 years ago

Well, it is a slow integrate() call. I'm not sure what to do about it besides speed up the underlying code (planned), but that doesn't solve the main problem of updating progress when integrate() is working; do you have any suggestions?

jonathon-love commented 9 years ago

what if i can provide you with an equivalent integrate() which takes a callback?

richarddmorey commented 9 years ago

Sounds like a fine plan.
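As a rough illustration of what a callback-aware `integrate()` could look like, here is a pure-R stand-in that wraps the integrand so the callback fires every time `stats::integrate()` requests a batch of points. This is only a sketch: the name `integrate_cb()` and the callback signature are assumptions, and it is not the native implementation discussed in this thread.

```r
# Illustrative only: a pure-R stand-in for a callback-aware integrate().
# The integrand is wrapped so the callback fires on every batch of points
# that stats::integrate() evaluates.
integrate_cb <- function(f, lower, upper,
                         callback = function(n_calls) invisible(NULL),
                         ...) {
  n_calls <- 0L
  wrapped <- function(x, ...) {
    n_calls <<- n_calls + 1L
    callback(n_calls)   # heartbeat; the total number of batches is not known in advance
    f(x, ...)
  }
  stats::integrate(wrapped, lower, upper, ...)
}

# Usage: remember how many batches were evaluated.
beats <- 0L
res <- integrate_cb(function(x) exp(-x^2 / 2), -Inf, Inf,
                    callback = function(n_calls) beats <<- n_calls)
```

Because the number of evaluations is not known up front, the callback here acts as a heartbeat rather than a true progress fraction.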

jonathon-love commented 9 years ago

right-o, i'll try and have something to you within the fortnight.

i can either

  1. provide you with a chunk of C code you can incorporate into BayesFactor
  2. i can export it from the JASP package

In option 2, you'd check if the JASP package is available, and if it is, call JASP:::integrate(...) instead

i think i'd prefer option 2.

what's your preference?

richarddmorey commented 9 years ago

Given that this is sure to be useful to other people writing JASP plugins, it seems (2) would be best. You might also consider versions of optim() and nlm() with callback support; I use those and they have the same issue (it just so happens that integrate() takes the most time, but those can also be slow).

jonathon-love commented 9 years ago

yup, been thinking the same thing.

jonathon-love commented 9 years ago

you're calling these functions from R, yeah? you're not calling the native implementations from C?

jonathon-love commented 9 years ago

also, what arguments do you pass into the callbacks? you pass progress information, anything else? and was it a value between 0 and 1000 for progress?

jonathon-love commented 9 years ago

ok, i've created a package called withcallbacks:

https://static.jasp-stats.org/misc/withcallbacks_1.0.tar.gz

if you can detect if this package is available, and call the integrate function provided by it, then i can start profiling where time is spent inside of it (and add callbacks to appropriate places)
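The detection step could be sketched roughly as follows. The helper name `pick_integrate()` is hypothetical, and the assumption that the optional package provides a drop-in `integrate` with the same arguments as `stats::integrate()` is not confirmed anywhere in this thread:

```r
# Pick an integrate() implementation at run time. The helper name and the
# assumption that the optional package offers a drop-in `integrate` are
# hypothetical.
pick_integrate <- function(pkg = "withcallbacks") {
  if (requireNamespace(pkg, quietly = TRUE) &&
      exists("integrate", envir = asNamespace(pkg), inherits = FALSE)) {
    get("integrate", envir = asNamespace(pkg), inherits = FALSE)
  } else {
    stats::integrate   # fall back to base R's implementation
  }
}

# Usage: callers do not need to know which implementation was chosen.
integrate_fn <- pick_integrate()
fallback_fn <- pick_integrate("aPackageThatIsNotInstalled12345")
```

Using `requireNamespace()` keeps the dependency optional, so the fallback path is taken silently when the package is not installed.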

jonathon-love commented 9 years ago

so it appears that there are two main code paths in integrate(), one for finite limits of integration, and one for non-finite limits of integration. does BayesFactor use one or the other exclusively?
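For reference, the two paths can be exercised from R as below. Per the `integrate()` documentation, finite limits go through the QUADPACK routine `dqags` and infinite limits through `dqagi`. The integrand is just an example and is not taken from BayesFactor:

```r
# Finite limits: handled by the QUADPACK routine dqags.
finite <- stats::integrate(dnorm, -1.96, 1.96)

# Infinite limits: handled by dqagi, which maps the infinite range onto a
# finite interval before integrating.
infinite <- stats::integrate(dnorm, -Inf, Inf)
```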

richarddmorey commented 9 years ago

> you're calling these functions from R, yeah?

yes.

> also, what arguments do you pass into the callbacks? you pass progress information, anything else? and was it a value between 0 and 1000 for progress?

Only progress; nothing else; and yes, it is a value between 0 and 1000.

> does BayesFactor use one or the other exclusively?

No, it uses both. However, in the time-consuming functions (i.e., not t tests) the limits are infinite.

jonathon-love commented 9 years ago

> if you can detect if this package is available, and call the integrate function provided by it, then i can start profiling where time is spent inside of it (and add callbacks to appropriate places)

actually, i can take care of this too, and send in a pull-request if you like.

richarddmorey commented 9 years ago

Sure.

richarddmorey commented 9 years ago

Did you ever look at this?

jonathon-love commented 9 years ago

yeah, it's on my todo list. to my knowledge, there's currently only one situation where it's an issue. it might be when you specify a single factor, which has many levels.

it depends on a long-running integrate() or optim() call. i've made the modified functions, i just need to get around to integrating them and testing them.