Open lana28 opened 2 years ago
Hi lana28
Function 'estimateAccount' does multi-threading when the 'parallel' argument is set to TRUE, which is the default. 'estimateAccount' will run each MCMC chain on a separate core, up to the number of cores available. For instance, if you set nChain equal to 5, and there are at least 5 cores available, then 'estimateAccount' will calculate each chain in parallel. The relevant code in 'estimateAccount' starts with 'if (parallel) {'.
Unfortunately, multi-threading is less helpful than might be expected. It allows you to accumulate a posterior sample more quickly once you reach convergence, but does not allow convergence to happen more quickly.
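To illustrate the two points above, here is a minimal sketch in R. It is not the code from 'estimateAccount': the function 'run_chain', its arguments, and the toy target distribution are all made up for the example, and it only assumes base R's 'parallel' package. It runs one toy chain per worker, capped at the number of cores available, and then discards the burn-in from every chain.

```r
library(parallel)

# Toy random-walk Metropolis sampler targeting a standard normal.
# Hypothetical stand-in for one MCMC chain; NOT the sampler in 'estimateAccount'.
run_chain <- function(seed, n_iter = 2000, start = 10) {
  set.seed(seed)
  draws <- numeric(n_iter)
  current <- start  # deliberately far from the target, so burn-in is needed
  for (i in seq_len(n_iter)) {
    proposal <- current + rnorm(1)
    log_ratio <- dnorm(proposal, log = TRUE) - dnorm(current, log = TRUE)
    if (log(runif(1)) < log_ratio) current <- proposal
    draws[i] <- current
  }
  draws
}

n_chain <- 4
n_cores <- detectCores()
if (is.na(n_cores)) n_cores <- 1L

# One worker per chain, up to the number of cores available.
n_workers <- min(n_chain, n_cores)
cl <- makeCluster(n_workers)
chains <- parLapply(cl, seq_len(n_chain), run_chain)
stopCluster(cl)

# Every chain has to get through its own burn-in; adding chains does not
# shorten that phase, it only increases the number of draws kept afterwards.
n_burnin <- 500
posterior <- unlist(lapply(chains, function(x) x[-seq_len(n_burnin)]))
length(posterior)  # n_chain * (n_iter - n_burnin) retained draws
```

In this sketch, doubling 'n_chain' doubles the number of retained draws for roughly the same wall-clock time, provided there are enough cores, but each chain still spends its first 'n_burnin' iterations moving towards the target.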
Adopting more informative prior distributions for the system and data models can help convergence. In fact, it is probably the only way to achieve it.
'estimateAccount' is much more experimental than 'estimateModel' and 'estimateCounts', and we haven't had much luck getting it to work on large problems. It seems that the model is too flexible, and despite our best efforts to optimise it, the estimation strategy is not efficient. We have not tried running it on a supercomputer though. I would be very interested to hear how that works.
I have left Stats NZ, and among other things am currently working as a consultant with the Office for National Statistics in the UK to develop a more scalable approach to estimating accounts. The new approach uses 'particle filters' rather than MCMC, and although it is much less flexible than 'estimateAccount', in work so far it has been much faster.
John
Hi
My team has been working on county-level projections, including internal migration, using estimateAccount, but it takes more than a few days to converge. So we decided to use supercomputers or clusters provided by a university. To do so, we need to specify whether the code to be run supports multithreading or parallelization.
I am not really familiar with this area. Is there any way that the MCMC simulation can be run with multithreading? Also, I want to confirm whether the package supports parallelization or multithreading, and whether using such methods affects the results of the simulation.
Thank you,