Open avivMahulya opened 1 year ago
@JainTwinkle, do you have any advice here?
When I try to create a checkpoint of a small Python script (without CUDA), I get the following error:
[40000] NOTE at writeckpt.cpp:263 in mtcp_writememoryareas; REASON='before calling to skip' (void )area.addr = 0x400000 (void )area.endAddr = 0x8ba000 area.size = 4956160
Segmentation fault
With DMTCP version 2.6, I was able to checkpoint and restore this Python script successfully.
I get an error when I try to create a checkpoint of my application, which uses CUDA 11.4 and TensorRT 8.4. My platform is an NVIDIA Jetson Xavier NX (ARM®v8.2 64-bit) running Ubuntu 20.04.4 LTS.
I got the following error in the dmtcp_launch terminal:
[41000] ERROR at fileconnlist.cpp:396 in prepareShmList; REASON='JASSERT(Util::strEndsWith(area.name, DELETED_FILE_SUFFIX)) failed' area.name = /dmabuf:
The full log is attached: Checkpoint error dmtcp.txt
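To help narrow down the `prepareShmList` failure, a small script along these lines could list the shared mappings of the process whose names are neither existing files nor marked as deleted, which is roughly the situation `/dmabuf:` appears to be in. This is only a sketch: the value of `DELETED_FILE_SUFFIX` is assumed to be `" (deleted)"`, the helper names are hypothetical, and the exact conditions DMTCP checks before the JASSERT may differ.

```python
import os

# Assumption: mirrors DMTCP's DELETED_FILE_SUFFIX; verify against the DMTCP source.
DELETED_FILE_SUFFIX = " (deleted)"


def parse_maps_line(line):
    """Return (perms, name) for one /proc/<pid>/maps line, or None if malformed."""
    # Fields: address range, perms, offset, device, inode, optional pathname.
    fields = line.split(None, 5)
    if len(fields) < 5:
        return None
    name = fields[5].strip() if len(fields) > 5 else ""
    return fields[1], name


def suspicious_shared_mappings(maps_text):
    """List names of shared mappings that are not real files and not marked deleted."""
    found = []
    for line in maps_text.splitlines():
        parsed = parse_maps_line(line)
        if parsed is None:
            continue
        perms, name = parsed
        is_shared = len(perms) == 4 and perms[3] == "s"
        # Skip private mappings, anonymous mappings, and pseudo-regions like [heap].
        if not is_shared or not name or name.startswith("["):
            continue
        # Skip names that carry the deleted suffix or that exist on disk.
        if name.endswith(DELETED_FILE_SUFFIX) or os.path.exists(name):
            continue
        if name not in found:
            found.append(name)
    return found
```

It can be run against the application by passing the contents of `/proc/<pid>/maps` (read as text) to `suspicious_shared_mappings` and printing the result.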