Closed (sebhtml closed this issue 10 years ago)
dependency: #426
biosal-405-256-nodes-13.e2775124:
[NID 00636] 2014-07-22 14:15:45 Apid 4747407: initiated application termination
[NID 00636] 2014-07-22 14:15:48 Apid 4747407: OOM killer terminated this process.
Beagle) qsub biosal-405-256-nodes-14.pbs
2775697.sdb
Previous job (https://anl.app.box.com/files/0/f/2209403957/1/f_19075724265):
Expected:
Beagle) sha1sum coverage_distribution.txt-canonical
01a293db48518190038eaddbaed8a47ca0323fc7 coverage_distribution.txt-canonical
Actual:
Beagle) sha1sum coverage_distribution.txt-canonical
01a293db48518190038eaddbaed8a47ca0323fc7 coverage_distribution.txt-canonical
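The same expected-versus-actual check can be scripted instead of compared by eye. This is a minimal sketch using Python's standard hashlib; the helper names (sha1_of_file, matches_expected) are made up for illustration and are not part of biosal.

```python
import hashlib

# Digest reported above for coverage_distribution.txt-canonical.
EXPECTED = "01a293db48518190038eaddbaed8a47ca0323fc7"


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """True when the file's SHA-1 equals the expected hex digest."""
    return sha1_of_file(path) == expected_hex.strip().lower()
```

For example, matches_expected("coverage_distribution.txt-canonical", EXPECTED) should return True for the run above, since both digests are identical.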
But the job crashed with a segmentation fault at the end:
Segmentation fault
bsal_tracer_print_stack_backtrace TRACE IS NOT AVAILABLE.
rdi 0x0 0
With GNU toolchain (to get a backtrace):
Beagle) qsub biosal-405-256-nodes-15.pbs
2775909.sdb
addr2line is broken on the Cray...
Beagle) objdump -d argonnite > biosal-405-256-nodes-15.s
Let's do it with the disassemble-and-get-stack script instead:
Beagle) biosal/scripts/disassemble-and-get-stack.py -e argonnite < biosal-405-256-nodes-15.stack
biosal-405-256-nodes-15.o2775909 walltime=00:15:15
Stack backtrace has 11 frames
addr2line is broken on Beagle:
Beagle) addr2line -e argonnite < biosal-405-256-nodes-15.stack
??:0
??:0
sigaction.c:0
??:0
??:0
??:0
??:0
??:0
/lustre/beagle/CompBIO/biosal-THOR/biosal/applications/argonnite_kmer_counter/main.c:14
/usr/src/packages/BUILD/glibc-2.11.3/csu/libc-start.c:226
/usr/src/packages/BUILD/glibc-2.11.3/csu/../sysdeps/x86_64/elf/start.S:116
_pmiu_daemon(SIGCHLD): [NID 00310] [c6-0c2s4n0] [Wed Jul 23 15:08:04 2014] PE RANK 255 exit signal Segmentation fault
[NID 00310] 2014-07-23 15:08:04 Apid 4749597: initiated application termination
Beagle) qsub biosal-405-256-nodes-16.pbs
2776076.sdb
Error, node/0 received signal SIGSEGV
bsal_tracer_print_stack_backtrace
Stack backtrace has 10 frames
4027bd: 4c 89 ef mov %r13,%rdi
4027c0: e8 1b 6d 01 00 callq 4194e0
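When addr2line returns ??:0, a backtrace address can still be mapped to its enclosing function by searching the objdump -d listing for the nearest preceding symbol header. The sketch below illustrates that idea only; it is not the biosal disassemble-and-get-stack script, whose internals are not shown here. The function names are hypothetical, and it assumes the usual objdump header form "ADDRESS <name>:".

```python
import bisect
import re

# Matches objdump symbol headers such as "00000000004027a0 <some_function>:".
SYMBOL_HEADER = re.compile(r"^([0-9a-f]+) <([^>]+)>:\s*$")


def load_symbols(disassembly_lines):
    """Collect (start_address, name) pairs from objdump -d output, sorted by address."""
    symbols = []
    for line in disassembly_lines:
        match = SYMBOL_HEADER.match(line)
        if match:
            symbols.append((int(match.group(1), 16), match.group(2)))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return (name, offset) of the last symbol starting at or before address.

    Returns None when the address precedes every symbol. Symbol sizes are not
    known here, so an address past the end of the last function still maps to it.
    """
    starts = [start for start, _ in symbols]
    index = bisect.bisect_right(starts, address) - 1
    if index < 0:
        return None
    start, name = symbols[index]
    return name, address - start


def resolve_stack(symbols, stack_lines):
    """Resolve one hexadecimal address per line, as in a .stack file."""
    return [resolve(symbols, int(line.strip(), 16)) for line in stack_lines if line.strip()]
```

A possible use is to call load_symbols on the lines of biosal-405-256-nodes-15.s and then resolve_stack on the lines of biosal-405-256-nodes-15.stack, assuming the stack file holds one hexadecimal address per line.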
Beagle) qsub biosal-405-256-nodes-17.pbs
2776121.sdb
Beagle) qsub biosal-405-256-nodes-18.pbs
2776124.sdb
Beagle) qsub biosal-405-256-nodes-20.pbs
2777421.sdb
Beagle) showq | grep sebh
2777421 sebhtml Running 6144 00:59:56 Fri Jul 25 03:37:28
Without counters:
Beagle) qsub biosal-405-256-nodes-21.pbs
2777422.sdb
action points:
check iterations 20 and 21
biosal-405-256-nodes-20.o2777421 (with counters) walltime=00:21:34
biosal-405-256-nodes-21.o2777422 (without counters) walltime=00:21:48
01a293db48518190038eaddbaed8a47ca0323fc7 coverage_distribution.txt-canonical
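To put the two walltimes side by side, the HH:MM:SS values can be converted to seconds. This is a small sketch; walltime_to_seconds is a hypothetical helper, and the only inputs are the two walltimes reported above for iterations 20 and 21.

```python
def walltime_to_seconds(text):
    """Convert an HH:MM:SS walltime string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


with_counters = walltime_to_seconds("00:21:34")     # iteration 20
without_counters = walltime_to_seconds("00:21:48")  # iteration 21

difference = without_counters - with_counters
relative = difference / with_counters

print(f"difference: {difference} s ({relative:.1%} of the run with counters)")
```

The run without counters took 14 seconds longer, about 1.1%. With a single run of each configuration, that gap may be ordinary run-to-run variation rather than an effect of the counters.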
Beagle) grep efficiency biosal-405-256-nodes-21.stdout |tail
/lustre/beagle/CompBIO/biosal-THOR
biosal-405-256-nodes-13 is queued (2775124)