ICB-DCM / pyABC

distributed, likelihood-free inference
https://pyabc.rtfd.io
BSD 3-Clause "New" or "Revised" License

Saving particles upon acceptance; allow resuming from partially finished generations #573

Closed by cheekolegend 2 years ago

cheekolegend commented 2 years ago

Feature description: Allow storing particles upon acceptance, and allow resuming a run from a partially finished generation (e.g., resume a run where the second generation already has some number of accepted particles).

Motivation/Application: My simulations require several hours to complete, and they are run on an HPC cluster where jobs can only run for a limited amount of time. When the time limit is up, I lose many hours of data if a generation did not complete. Storing particles in the database upon acceptance would save a lot of time.

EmadAlamoudi commented 2 years ago

Dear @cheekolegend, thanks for the suggestion. However, doing so might introduce bias in the accepted particles: the population would be skewed toward parameter vectors whose simulations run quickly.

What actually happens is that we wait until all $N$ particles are finished, and out of the resulting $\tilde N\geq N$ accepted particles, only the $N$ that were started earliest are taken as the population of accepted particles for that generation. This ensures that the acceptance and admittance of a particle is in accordance with the target distribution, because whether a particle is included is independent of later events and thus of its run-time.
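To illustrate the selection rule described above, here is a minimal sketch in plain Python. It is not pyABC's actual implementation: the `Candidate` class, its fields, and both helper functions are hypothetical names made up for this example. It contrasts keeping the $N$ earliest-started accepted particles with the naive "first to finish" rule, which favours short-running simulations.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record of one simulation (not a pyABC class)."""
    start_index: int   # order in which the simulation was started
    start_time: float  # wall-clock start (arbitrary units)
    runtime: float     # wall-clock duration (arbitrary units)
    accepted: bool     # whether the particle passed the acceptance check


def select_earliest_started(candidates, n):
    """Keep the n accepted candidates that were started earliest."""
    accepted = [c for c in candidates if c.accepted]
    if len(accepted) < n:
        raise ValueError("fewer than n accepted candidates")
    return sorted(accepted, key=lambda c: c.start_index)[:n]


def select_first_finished(candidates, n):
    """Biased alternative: keep the n accepted candidates that finish first."""
    accepted = [c for c in candidates if c.accepted]
    if len(accepted) < n:
        raise ValueError("fewer than n accepted candidates")
    return sorted(accepted, key=lambda c: c.start_time + c.runtime)[:n]


# Toy data: candidates 0 and 2 start first but run for a long time.
candidates = [
    Candidate(0, start_time=0.0, runtime=9.0, accepted=True),
    Candidate(1, start_time=0.0, runtime=1.0, accepted=False),
    Candidate(2, start_time=0.0, runtime=8.0, accepted=True),
    Candidate(3, start_time=1.0, runtime=1.0, accepted=True),
    Candidate(4, start_time=2.0, runtime=1.0, accepted=True),
]

unbiased = [c.start_index for c in select_earliest_started(candidates, 2)]
biased = [c.start_index for c in select_first_finished(candidates, 2)]
print(unbiased)  # [0, 2]
print(biased)    # [3, 4]
```

In this toy example, the start-order rule keeps the two long-running particles, whereas the finish-order rule would replace them with the short-running ones. Storing particles as soon as they are accepted and stopping at a timeout behaves like the second rule.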

cheekolegend commented 2 years ago

Hi @EmadAlamoudi, thank you very much for clarifying.

yannikschaelte commented 2 years ago

Hi @cheekolegend, let me add that we aim to implement a better checkpointing system at some point, which would allow runs that were stopped midway due to HPC timeouts to be recovered to a large degree. However, right now I cannot say when this will be implemented.