gphocs-dev / G-PhoCS

G-PhoCS is a software package for inferring ancestral population sizes, population divergence times, and migration rates from individual genome sequences.

Is checkpointing available? #88

Open alansill opened 4 months ago

alansill commented 4 months ago

We have an account holder whose jobs are running out of wall-clock time, which is generously set to 48 hours on our cluster, even when using a 128-core node with the multi-threading option turned on. Since the code does not appear to use MPI, and I assume it would not parallelize across multiple nodes, the next best option would be to use checkpoint-and-restore so that a subsequent job can pick up where the first one stopped when it ran out of wall-clock time.

Is this supported in this code? If not, are we correct that the code supports OpenMP threading but not Open MPI or any other MPI implementation? Are there any tips for reducing the run time for a given input?

igronau commented 4 months ago

Sorry, but we don't have support for "checkpoint-and-restore". It's been on my TODOs for a while, but it doesn't look like I'm going to get the time to implement this. As a result, a run that needs more than the 48-hour limit currently cannot be resumed in a follow-up job.
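
For readers unfamiliar with the term, here is a rough, generic sketch of what checkpoint-and-restore means for a sampler-style loop. It is not G-PhoCS code and does not reflect how G-PhoCS works internally; it is a toy Metropolis chain in Python, and every name in it (`run_chain`, `save_checkpoint`, `load_checkpoint`, the checkpoint file path) is hypothetical. The idea is only that the loop periodically writes its full state, including the random number generator state, to disk, and a later job reloads that state instead of starting over.

```python
import math
import os
import pickle
import random
import tempfile


def save_checkpoint(path, state):
    """Write the state atomically: a job killed mid-write leaves the old file intact."""
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    with os.fdopen(fd, "wb") as fh:
        pickle.dump(state, fh)
    os.replace(tmp_path, path)  # atomic rename over the previous checkpoint


def load_checkpoint(path):
    """Return the saved state, or None if no checkpoint exists yet."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as fh:
        return pickle.load(fh)


def run_chain(path, total_iters, max_iters_this_job, every=100, seed=1):
    """Toy Metropolis chain targeting a standard normal (illustration only).

    max_iters_this_job stands in for the wall-clock limit of one cluster job.
    Returns (current sample, iterations completed so far).
    """
    rng = random.Random(seed)
    state = load_checkpoint(path)
    if state is None:
        # Fresh start: no earlier job left a checkpoint behind.
        state = {"iter": 0, "x": 0.0, "rng": rng.getstate()}
    rng.setstate(state["rng"])  # resume the exact random stream
    x, i = state["x"], state["iter"]

    stop = min(total_iters, i + max_iters_this_job)
    while i < stop:
        proposal = x + rng.gauss(0.0, 1.0)
        # Metropolis acceptance probability for a N(0, 1) target.
        accept_prob = math.exp(min(0.0, 0.5 * (x * x - proposal * proposal)))
        if rng.random() < accept_prob:
            x = proposal
        i += 1
        if i % every == 0 or i == stop:
            save_checkpoint(path, {"iter": i, "x": x, "rng": rng.getstate()})
    return x, i
```

In this sketch, calling `run_chain("chain.pkl", 1000, 400)` in one job and `run_chain("chain.pkl", 1000, 1000)` in the next continues from iteration 400 rather than from zero. Saving the generator state is what makes the resumed chain identical to an uninterrupted one; a real implementation would also need to persist every other piece of sampler state and any partially written output.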