bethven closed this 6 years ago
EntangledMPI does support uncoordinated checkpointing, but the library is under heavy development and I don't think checkpointing is stable right now. Sorry about that.
In the meantime, you could look at other well-established frameworks, such as Condor's checkpoint/restart.
Hello, thank you very much for your quick response. Can Condor do uncoordinated checkpointing? I need to take uncoordinated checkpoints of an MPI program (it could be one of the NAS benchmarks), but I do not know which tools to use. I have worked with the DMTCP library to do coordinated checkpointing, but now I need to do uncoordinated checkpointing with any tool, and I am a bit confused about the steps I must take. Thank you very much.
Condor does support uncoordinated checkpointing, although I haven't used it in any of my work. You can also try the SRS library, which is a user-level checkpointing library: you define checkpoints inside your code, and checkpointing happens at those points.
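To give a rough idea of what "defining checkpoints inside your code" can look like, here is a minimal sketch in plain Python. It is not the SRS, Condor, or EntangledMPI API: the names `save_checkpoint`, `load_checkpoint`, and `run` are hypothetical, and `rank` is just an integer that in a real MPI job would come from the communicator. Each process writes its own file whenever it decides to, with no barrier or agreement with the other ranks, which is the uncoordinated part.

```python
import os
import pickle
import tempfile


def checkpoint_path(ckpt_dir, rank):
    # One file per rank, so no process waits on or overwrites another.
    return os.path.join(ckpt_dir, f"rank{rank}.ckpt")


def save_checkpoint(ckpt_dir, rank, step, state):
    # Write to a temporary file, then rename atomically, so a crash
    # mid-write never leaves a truncated checkpoint behind.
    os.makedirs(ckpt_dir, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=ckpt_dir)
    with os.fdopen(fd, "wb") as f:
        pickle.dump({"step": step, "state": state}, f)
    os.replace(tmp, checkpoint_path(ckpt_dir, rank))


def load_checkpoint(ckpt_dir, rank):
    # Return (step, state), or (0, None) when this rank has no checkpoint.
    try:
        with open(checkpoint_path(ckpt_dir, rank), "rb") as f:
            data = pickle.load(f)
    except FileNotFoundError:
        return 0, None
    return data["step"], data["state"]


def run(ckpt_dir, rank, total_steps, every):
    # Hypothetical workload: sum the step indices, checkpointing locally
    # every `every` steps and resuming from the last checkpoint if present.
    step, state = load_checkpoint(ckpt_dir, rank)
    total = 0 if state is None else state
    while step < total_steps:
        total += step
        step += 1
        if step % every == 0:
            save_checkpoint(ckpt_dir, rank, step, total)
    return total
```

Note that this only covers local process state. It ignores messages in flight between ranks, so a real uncoordinated protocol usually also needs something like message logging to avoid cascading rollbacks on restart.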
Hi, I would like to know whether upperwal/EntangledMPI can take uncoordinated checkpoints of MPI applications. I need to do this and I do not know which tool to use. Thank you very much.