Open jgphpc opened 8 years ago
Not sure. I have never seen this in my own jobs.
cudaMemcpyHostToDevice Async error :: ret code:
an illegal memory access was encountered
cudaMemcpyHostToDevice Async error :: ret code:
an illegal memory access was encountered
srun: error: task 937 launch failed: Error configuring interconnect
Fri Sep 2 08:31:17 2016: [PE_767]:inet_connect:inet_connect: connect failed after 301 attempts
Fri Sep 2 08:31:17 2016: [PE_767]:_pmi_inet_setup:inet_connect failed
Fri Sep 2 08:31:17 2016: [PE_767]:_pmi_init:_pmi_inet_setup (full) returned -1
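In case it helps when going through the other logs, here is a rough sketch for pulling the failing srun task IDs and PE ranks out of output like the above. The helper name `failing_ranks` and both regular expressions are my own assumptions based only on the lines quoted here; this is not part of PyFR, Slurm, or the Cray PMI tooling, and other log formats would need different patterns.

```python
import re

# Hypothetical helper, written against the log excerpt in this issue only.
# Assumed patterns: "srun: error: task <N> launch failed" and "[PE_<N>]".
TASK_RE = re.compile(r"srun: error: task (\d+) launch failed")
PE_RE = re.compile(r"\[PE_(\d+)\]")


def failing_ranks(log_text):
    """Return (sorted unique srun task IDs, sorted unique PE ranks)."""
    tasks = sorted({int(m) for m in TASK_RE.findall(log_text)})
    pes = sorted({int(m) for m in PE_RE.findall(log_text)})
    return tasks, pes


# Sample text copied from the excerpt above.
sample = """\
srun: error: task 937 launch failed: Error configuring interconnect
Fri Sep 2 08:31:17 2016: [PE_767]:inet_connect:inet_connect: connect failed after 301 attempts
Fri Sep 2 08:31:17 2016: [PE_767]:_pmi_inet_setup:inet_connect failed
Fri Sep 2 08:31:17 2016: [PE_767]:_pmi_init:_pmi_inet_setup (full) returned -1
"""

if __name__ == "__main__":
    print(failing_ranks(sample))
```

Mapping the reported task or PE numbers back to node IDs still has to be done from the job's node list; the sketch only collects the numbers.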
All logs are in /project/csstaff/inputs/pyfr/d/
job451657 (V14) / 24 Aug.
job451678 (V2) / 24 Aug.
=> nodes (3471 and 5137) removed from queue.
job451689 (V3) / 24 Aug.
=> nodes (3471 and 5137) removed from queue.
job451660 (V12) / 24 Aug.
=> nodes (3471 and 5137) removed from queue.
job451657 (V14) / 24 Aug.
=> nodes (3471 and 5137) removed from queue.
job451671 (V16) / 24 Aug.
=> nodes (3471 and 5137) removed from queue.