-
BICI does not have built-in checkpointing. On CSD3, the walltime limit for a job in the normal partition is 1.5 days. The documentation, https://docs.hpc.cam.ac.uk/hpc/user-guide/long.html, says tha…
-
Hi there,
How would you feel about integrating dmtcp (https://conference.scipy.org/scipy2013/presentation_detail.php?id=201) with nipype?
I'm currently missing it for the following. When using mrtri…
-
CRIU: https://criu.org/Main_Page
Needs kernel option: CONFIG_CHECKPOINT_RESTORE
DMTCP: http://dmtcp.sourceforge.net/
CryoPID: https://github.com/maaziz/cryopid
CryoPID is probably dead. Th…
-
I'm trying to compile dmtcp in a Linux container running on an Apple silicon M1 processor.
The compilation stops with an error (see below).
The container runs on top of the emulator for x86_64 (appl…
-
I have 2 similar programs launched by dmtcp. Their JTRACE logs are almost the same before
"
DMTCP will now try to remap this area in read/write mode as
private (zero pages…
-
The following program:
```
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main() {
  while (1) {
    if (getenv("DMTCP_COORD_PORT"))
      printf("DMTCP_COORD_PORT: %s\n", getenv("DMTCP_COORD_PORT"));
    fflush(stdout);
    sleep(1);
  }
}
`…
-
I have been trying to checkpoint and restart an MPI application on two nodes (8 processes total). I had resolved different errors due to wrong paths, versions, and flags until I get a segfault.
Here …
-
Hello,
Is there support for [UCX](https://github.com/openucx/ucx) in DMTCP? Right now, when I try to launch an MPI program compiled against a recent OpenMPI with UCX, I get the following error message:
…
-
```
$ dmtcp_launch --no-coordinator ls
terminate called after throwing an instance of 'std::logic_error'
what(): basic_string::_M_construct null not valid
[1] 28654 abort (core dumped) dmtc…
-
I've run `make check`, and it works fine!
```
Verifying there is enough disk space ...
== Tests ==
dmtcp1 ckpt:PASSED rstr:PASSED; ckpt:PASSED rstr:PASSED
dmtcp2 ckpt:PASSED rstr:PASS…