Open hyunjimoon opened 3 years ago
I just tried it myself and both versions finished in less than 20 seconds...
Are you using multiprocessing? I've used the default
library(future)
plan(multisession)
Maybe some of the underlying cmdstan fits hanged?
A quick check: aren't you using some old version of cmdstanr
(my version claims that file_path
is not a supported argument and the docs don't mention any such argument and instead use stan_file
). So maybe there is some problem there?
I've retried and this time it takes around 2 minutes for S = 100; wow how can this be finished in twenty seconds..?
I am using SBC.min_chunk_size
= 5; can increasing this to 10 help? (or is 5 empirically the best, from your experience? Also, can I think of parallel processing you mentioned in code doc and multisession as the same thing? (FYI, from here it was mentioned that future_apply
will be deprecated.)
cmdstan fits hanged?
Could some function that helps stuck fit easily spotted 1) noticed and 2) skipped (either manually or automatically) be possible? Maybe this will create a bias as a difficult fit comes from a specific posterior region?
bind_result
cannot do this.A quick check:
I've edited the code.
how can this be finished in twenty seconds..?
I have a quite powerful 20 core PC at work 😁
I am using SBC.min_chunk_size = 5; can increasing this to 10 help?
As discussed at the docs for the chunk_size
parameter of compute_results
:
We recommend setting this high enough that a single batch takes at least several seconds
On my PC even a single fit of two_normal
takes a few seconds, so I didn't increase it. Increasing this is really only useful when your fits are super fast AND you are not fitting a lot of them (the default chunk size is n_datasets / n_cores
so when n_datasets
is large your chunks are already big enough).
Could some function that helps stuck fit easily spotted 1) noticed and 2) skipped (either manually or automatically) be possible? Maybe this will create a bias as a difficult fit comes from a specific posterior region?
That would be technically quite difficult unless cmdstanr
or rstan
provide functionality for this. If you had this, I would expect skipping such fits to be quite similar to removing fits that had divergences (which is OK), because the probability of fit being stuck should be just a function of data.
I've edited the code.
But make sure you have a recent version of cmdstanr
(I think the latest is 0.4.0.9000) and Stan (2.27)- there definitely were some bugs in earlier versions of cmdstanr
that could cause cmdstanr
to hang, despite the underlying CmdStan process terminated succesfully.
FYI, from here it was mentioned that future_apply will be deprecated.
This concerns the future_lapply
function in the future
package. We are using future.apply::future_lapply()
as recommended on that page.
Upgraded from cmd_stan
2.21 to 2.27. It feels slightly better, but I felt that it tries to avoid heavy computation. Is this usual?
For instance, I was able to run this sigmoid example with mean 5 predictors previously, but updated version doesn't support it. So I had to change this to mean 3.
For future reference, to check cmdstanr
version, use sessionInfo()
.
updated version doesn't support it
What do you mean by "doesn't support" - do you get an error? Or does it take too long?
S = 10 ends within 1 minute, but with 100 it doesn't seem to finish (for over an hour) in
result_tn_100 <- compute_results(datasets_tn_100, backend_tn)
line.two_normal.stan is: