florianhartig / BayesianTools

General-Purpose MCMC and SMC Samplers and Tools for Bayesian Statistics
https://cran.r-project.org/web/packages/BayesianTools/index.html

Fitting large chains into physical memory => remove coda chain slot in BayesianOutput #227

Closed · twest820 closed this 2 years ago

twest820 commented 2 years ago

Here's an example memory breakdown for a minimal DEzs MCMC configuration with chains of length 100k, showing the mcmcOut object using 1.2 GB of memory:

library(BayesianTools)
library(tibble)

mcmcOut = runMCMC(bayesianSetup = mcmcSetup, # 57 parameters
                  sampler = "DEzs",
                  settings = list(iterations = 300 * 1000, nrChains = 3))
# mcmcOut[[i]]$sampler, settings, setup, and X are small
tibble(totalMB = object.size(mcmcOut)[1] / 1E6, # 3 populations with 3 chains each
       populationMB = object.size(mcmcOut[[1]])[1] / 1E6, # mostly chain, codaChain, and Z
       chainMB = object.size(mcmcOut[[1]]$chain[[1]])[1] / 1E6, # 9 of these
       codaChainMB = object.size(mcmcOut[[1]]$codaChain[[1]])[1] / 1E6, # 9 of these
       zMB = object.size(mcmcOut[[1]]$Z)[1] / 1E6) # 3 of these
# A tibble: 1 x 5
  totalMB populationMB chainMB codaChainMB   zMB
    <dbl>        <dbl>   <dbl>       <dbl> <dbl>
1   1213.         404.    45.6        45.6  130.

For the range of nrChains recommended with DEzs this implies memory requirements of 12-20 GB for each 1M iterations in the chains, consistent with memory use I'm measuring on other DEzs runs having chains up to length 2M. That suggests 300-500 GB for unthinned r3PG calibrations reaching chain lengths of 25M iterations.
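For reference, the per-chain numbers above are consistent with back-of-envelope arithmetic (a sketch; `chain_bytes` is an illustrative helper, not a BayesianTools function):

```r
# A chain stores one double (8 bytes) per parameter per iteration,
# ignoring object overhead, so the measured chainMB above is expected.
n_params <- 57
chain_bytes <- function(iterations) iterations * n_params * 8

chain_bytes(100e3) / 1e6  # 45.6 MB, matching chainMB in the tibble above
chain_bytes(25e6) / 1e9   # 11.4 GB per unthinned chain at 25M iterations
```

The population total also checks out as roughly 3 × chain + 3 × codaChain + Z ≈ 404 MB, which is where the triplication discussed below comes from.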

Putting aside the use of sensitivity analysis to reduce the number of parameters, I think this raises three considerations within the BayesianTools package:

  1. chain is subject to thinning but contains all iterations by default. At 25M iterations in this 57-parameter example, each unthinned chain instance would approach 12 GB; with 9-12 instances that's 110-150 GB. From code review, it looks to me like a possible workaround for DEzs callers that don't require blocking is to run a modified version of mcmcDEzs.R with the code around pChain removed. I haven't checked other samplers, though.
  2. runMCMC() is hard-coded to populate codaChain for DE and DEzs. A workaround is presumably to copy the function definition from mcmcRun.R in the package source and delete the codaChain = coda::mcmc(out$Draws) bit.
  3. Z can be thinned as discussed in #179. However, a full Z is the same size as chain, meaning 85-90% thinning is needed to fit it on a typical machine with 16 GB of DRAM. High-memory cloud compute instances and placing independent chains on different machines provide partial workarounds. Those still leave the problem of bringing all the chains together to calculate diagnostics, though that's a bit easier to window off disk.
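As a stopgap for point 2, the duplicate coda copies could also be dropped after the fact (a hypothetical sketch, not part of the BayesianTools API; the stand-in list below just mimics the slot layout of a multi-chain output):

```r
# Hypothetical post-hoc workaround: once runMCMC() has returned, drop the
# codaChain slot from each population to roughly halve the chain-related
# footprint. Downstream coda-based methods reading that slot will no
# longer work. A stand-in object mimics the slot layout here.
mcmcOut <- list(list(chain = matrix(0, 5, 3), codaChain = matrix(0, 5, 3)))
for (i in seq_along(mcmcOut)) {
  mcmcOut[[i]]$codaChain <- NULL  # free the duplicate coda copy
}
invisible(gc())  # encourage R to release the freed memory
```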

This leaves me with three questions, I think.

  1. Are there better approaches to reducing the more or less triplicate memory consumption across chain, codaChain, and Z?
  2. Is there some practical upper bound on thin? Using something like thin = 10 isn't ideal from an efficiency standpoint (90% of potential samples get dropped), but I'm not seeing obvious problems with, say, ergodicity.
  3. Is there some alternate, better way of working with large chains than what's considered here?
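On question 3, one partial approach I can imagine (a sketch under assumptions, not tested against the package) is to run the independent chains one at a time with nrChains = 1, persisting each to disk and freeing it before starting the next, so only one chain is ever resident:

```r
# Hypothetical sketch: run independent chains sequentially, writing each
# to disk as it completes so memory use stays at one chain's worth.
for (run in 1:3) {
  out <- runMCMC(bayesianSetup = mcmcSetup,  # as in the example above
                 sampler = "DEzs",
                 settings = list(iterations = 300 * 1000, nrChains = 1))
  saveRDS(out, sprintf("chain_%d.rds", run))  # persist, then free
  rm(out)
  invisible(gc())
}
# Reload later only when computing diagnostics:
# chains <- lapply(1:3, function(run) readRDS(sprintf("chain_%d.rds", run)))
```

This still runs into the cross-chain diagnostics problem noted under point 3 above, but at least bounds peak memory during sampling.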
florianhartig commented 2 years ago

Yes, good points ...

=> Conclusion: rename this issue to "remove coda chain"; this can be implemented immediately.