Closed sebhtml closed 10 years ago
JobName issue-525-beagle-199x24-2
Machine Beagle
AllocationStatus Beagle) show_alloc Note: Allocation numbers below updated every 15 minutes.
243150 CI-CCR000040,CI-DEB000002
Path Beagle) pwd /lustre/beagle/CompBIO/biosal-THOR/
Commit Beagle) git log | head -n1 commit d6d1f090c2b6c3af34abd0f4c2438d31f866f085
Toolchain PrgEnv-cray/4.2.24
Script Beagle) cat issue-525-beagle-199x24-2.pbs
cd $PBS_O_WORKDIR
aprun -n 199 -N 1 -d 24 \
    spate -threads-per-node 24 -print-load \
    -k 43 Iowa_Continuous_Corn/*.fastq -o issue-525-beagle-199x24-2 > issue-525-beagle-199x24-2.stdout
Submission Beagle) qsub issue-525-beagle-199x24-2.pbs 2811564.sdb
MachineUtilization Beagle) showq | grep sebh 2811564 sebhtml Running 4776 00:59:36 Wed Aug 13 20:14:41
ComputationLoad Beagle) showq | grep "in use" 142 active jobs 10927 of 17520 processors in use by local jobs (62.37%)
RunningTime
=>> PBS: job killed: walltime 3623 exceeded limit 3600
Beagle) grep TIMER issue-525-beagle-199x24-2.stdout
TIMER [Load input / Count input data] 3 minutes, 23.703995 seconds
TIMER [Load input / Distribute input data] 4 minutes, 9.897171 seconds
TIMER [Load input] 7 minutes, 33.601166 seconds
TIMER [Build assembly graph / Distribute vertices] 27 minutes, 18.723267 seconds
Checksum Beagle) sha1sum issue-525-beagle-199x24-2/coverage_distribution.txt-canonical 01a293db48518190038eaddbaed8a47ca0323fc7 issue-525-beagle-199x24-2/coverage_distribution.txt-canonical
GoodComments BadComments The load is way too low during the arc phase:
[thorium] node/163 METRICS AliveActorCount: 93 ActiveRequestCount: 9208 HeapByteCount: 25995137024
[thorium] node/50 EPOCH LOAD 3615 s 3.44/23 (0.15) 0.08 0.08 0.07 0.07 0.07 0.08 0.07 0.07 0.07 0.08 0.07 0.07 0.08 0.07 0.08 0.06 0.91 0.98 0.06 0.08 0.07 0.09 0.09
[thorium] node/50 EPOCH WAKE_UP_COUNT 3615 s 35 25 33 19 40 20 31 15 26 37 33 17 23 30 10 30 18 4 23 11 29 20 17
The load is also quite low in the vertices phase:
[thorium] node/57 EPOCH LOAD 900 s 2.97/23 (0.13) 0.10 0.16 0.13 0.13 0.10 0.13 0.13 0.12 0.13 0.13 0.13 0.13 0.15 0.13 0.13 0.13 0.12 0.14 0.13 0.12 0.13 0.14 0.12
[thorium] node/57 EPOCH WAKE_UP_COUNT 900 s 46 47 45 44 19 46 45 49 46 47 47 47 42 43 53 45 45 46 48 34 43 43 45
NeutralComments
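The EPOCH LOAD lines in this report appear to give the summed load over the worker count, then the mean in parentheses, then one value per worker (for node/50: 3.44/23, about 0.15). As a minimal sketch under that assumption, the mean can be recomputed from a log line with a small shell helper. The name mean_load is hypothetical and is not part of thorium or spate.

```sh
# mean_load: read thorium "EPOCH LOAD" lines on stdin and print the mean
# per-worker load for each one. Assumes the "<total>/<workers>" field layout
# seen in the logs above; this helper is hypothetical, not part of thorium.
mean_load() {
    awk '/EPOCH LOAD/ {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^[0-9.]+\/[0-9]+$/) {
                split($i, parts, "/")
                printf "%.2f\n", parts[1] / parts[2]
                break
            }
        }
    }'
}

# Example with the start of the node/50 line from the arc phase; prints 0.15.
echo "[thorium] node/50 EPOCH LOAD 3615 s 3.44/23 (0.15) 0.08 0.08" | mean_load
```

Piping a whole stdout file through the helper (for example after grep "EPOCH LOAD") gives one mean per epoch, which makes it easier to see when the load stays far below 1.0.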
JobName issue-525-beagle-256x24-3
Machine beagle
AllocationStatus Beagle) show_alloc Note: Allocation numbers below updated every 15 minutes.
242949 CI-CCR000040,CI-DEB000002
Path Beagle) pwd /lustre/beagle/CompBIO/biosal-THOR
Commit Beagle) git log | head -n1 commit d6d1f090c2b6c3af34abd0f4c2438d31f866f085
Toolchain PrgEnv-cray/4.2.24
Script Beagle) cat issue-525-beagle-256x24-3.pbs
cd $PBS_O_WORKDIR
aprun -n 256 -N 1 -d 24 \
    spate -threads-per-node 24 -print-load \
    -k 43 Iowa_Continuous_Corn/*.fastq -o issue-525-beagle-256x24-3 > issue-525-beagle-256x24-3.stdout
Submission Beagle) qsub issue-525-beagle-256x24-3.pbs 2811624.sdb
MachineUtilization
Beagle) showq | grep sebh
2811624 sebhtml Running 6144 1:58:08 Thu Aug 14 09:29:20
Beagle) showq | grep "in use"
116 active jobs 11215 of 17520 processors in use by local jobs (64.01%)
ComputationLoad RunningTime
Beagle) grep left issue-525-beagle-256x24-3.stdout | grep -v 0.00 | tail
sequence store 210921887 has 26677/393216 (0.07) entries left to produce
sequence store 577974178 has 6798/393216 (0.02) entries left to produce
sequence store 1870131609 has 26677/393216 (0.07) entries left to produce
sequence store 1845699736 has 6798/393216 (0.02) entries left to produce
sequence store 1947000219 has 6798/393216 (0.02) entries left to produce
sequence store 1021795355 has 6708/393216 (0.02) entries left to produce
sequence store 1407589028 has 6798/393216 (0.02) entries left to produce
sequence store 1579136159 has 6798/393216 (0.02) entries left to produce
sequence store 210921887 has 6798/393216 (0.02) entries left to produce
sequence store 1870131609 has 6798/393216 (0.02) entries left to produce
Checksum GoodComments BadComments NeutralComments
This one ran in 20 minutes: #486 on beagle 256x24
Must do #556.
JobName issue-525-beagle-256x24-4
Machine Beagle
AllocationStatus 242433 CI-CCR000040,CI-DEB000002
Path /lustre/beagle/CompBIO/biosal-THOR/biosal
Commit commit 12a5fb3cf0446870a3e7e3056fe9e2adb4607e03
Toolchain PrgEnv-cray/4.2.24
Script Beagle) cat issue-525-beagle-256x24-4.pbs
cd $PBS_O_WORKDIR
aprun -n 256 -N 1 -d 24 \
    spate -threads-per-node 24 -print-load \
    -k 43 Iowa_Continuous_Corn/*.fastq -o issue-525-beagle-256x24-4 > issue-525-beagle-256x24-4.stdout
Submission Beagle) qsub issue-525-beagle-256x24-4.pbs 2811827.sdb
MachineUtilization
Beagle) showq | grep sebh
2811827 sebhtml Running 6144 1:59:53 Thu Aug 14 17:28:34
Beagle) showq | grep "in use"
121 active jobs 11527 of 17520 processors in use by local jobs (65.79%)
ComputationLoad RunningTime Checksum GoodComments BadComments Ran out of memory,
see ticket #557
NeutralComments
JobName issue-525-beagle-256x24-5
Machine Beagle at CI
AllocationStatus 242335 CI-CCR000040,CI-DEB000002
Path Beagle) pwd /lustre/beagle/CompBIO/biosal-THOR
Commit Beagle) git log | head -n1 commit 417b0c21f0670250750a9fd441c534b9caf22ebf
Toolchain PrgEnv-cray/4.2.24
Script Beagle) cat issue-525-beagle-256x24-5.pbs
cd $PBS_O_WORKDIR
aprun -n 256 -N 1 -d 24 \
    spate -threads-per-node 24 -print-load \
    -k 43 Iowa_Continuous_Corn/*.fastq -o issue-525-beagle-256x24-5 > issue-525-beagle-256x24-5.stdout
Submission Beagle) qsub issue-525-beagle-256x24-5.pbs 2812117.sdb
MachineUtilization
Beagle) showq | grep sebh
2812117 sebhtml Running 6144 1:59:54 Thu Aug 14 23:57:03
Beagle) showq | grep "in use"
33 active jobs 9463 of 17520 processors in use by local jobs (54.01%)
ComputationLoad RunningTime Checksum GoodComments BadComments NeutralComments
There is a leak in the code that builds arcs.
Possibilities:
without call to bsal_assembly_graph_store_add_arc: issue-525-beagle-256x24-6.pbs result: ... found issue here https://github.com/GeneAssembly/biosal/issues/557
JobName issue-525-beagle-256x24-6
Machine BEAGLE
AllocationStatus 242122 CI-CCR000040,CI-DEB000002
Path /lustre/beagle/CompBIO/biosal-THOR
Commit Beagle) git log | head -n1 commit b07a8a4bd1981c07a9c985c8f7624b710d2a8255
Toolchain PrgEnv-cray/4.2.24
Script Beagle) cat issue-525-beagle-256x24-6.pbs
cd $PBS_O_WORKDIR
aprun -n 256 -N 1 -d 24 \
    spate -threads-per-node 24 -print-load \
    -k 43 Iowa_Continuous_Corn/*.fastq -o issue-525-beagle-256x24-6 > issue-525-beagle-256x24-6.stdout
Submission Beagle) qsub issue-525-beagle-256x24-6.pbs 2812713.sdb
MachineUtilization
Beagle) showq | grep sebh
2812713 sebhtml Running 6144 1:59:17 Fri Aug 15 10:04:41
Beagle) showq | grep "in use"
38 active jobs 9463 of 17472 processors in use by local jobs (54.16%)
ComputationLoad
RunningTime
Beagle) grep TIMER issue-525-beagle-256x24-6.stdout
TIMER [Load input / Count input data] 3 minutes, 22.840439 seconds
TIMER [Load input / Distribute input data] 4 minutes, 0.798309 seconds
TIMER [Load input] 7 minutes, 23.638733 seconds
TIMER [Build assembly graph / Distribute vertices] 11 minutes, 57.111938 seconds
TIMER [Build assembly graph / Distribute arcs] 105 minutes, 35.994141 seconds
TIMER [Build assembly graph] 105 minutes, 35.994141 seconds
TIMER [Run actor computation] 113 minutes, 0.601074 seconds
Checksum Beagle) sha1sum issue-525-beagle-256x24-6/coverage_distribution.txt-canonical 01a293db48518190038eaddbaed8a47ca0323fc7 issue-525-beagle-256x24-6/coverage_distribution.txt-canonical
GoodComments
fixed memory regulation
BadComments
too slow (~1 hour for 3 billion reads)
NeutralComments
JobName issue-525-beagle-256x24-7
Machine beagle
AllocationStatus 242122 CI-CCR000040,CI-DEB000002
Path Beagle) pwd /lustre/beagle/CompBIO/biosal-THOR
Commit Beagle) git log | head -n1 commit b4e024151edd28d03c84abafd2dffb03341f6ff9
Toolchain Script Beagle) cat issue-525-beagle-256x24-7.pbs
cd $PBS_O_WORKDIR
aprun -n 256 -N 1 -d 24 \
    spate -threads-per-node 24 -print-load \
    -k 43 Iowa_Continuous_Corn/*.fastq -o issue-525-beagle-256x24-7 > issue-525-beagle-256x24-7.stdout
Submission Beagle) qsub issue-525-beagle-256x24-7.pbs 2812769.sdb
MachineUtilization
Beagle) showq | grep sebh
2812713 sebhtml Running 6144 1:01:02 Fri Aug 15 10:04:41
2812769 sebhtml Running 6144 1:59:52 Fri Aug 15 11:03:31
Beagle) showq | grep "in use"
32 active jobs 15439 of 17472 processors in use by local jobs (88.36%)
ComputationLoad
RunningTime
Beagle) grep TIMER issue-525-beagle-256x24-7.stdout
TIMER [Load input / Count input data] 3 minutes, 20.511063 seconds
TIMER [Load input / Distribute input data] 4 minutes, 3.490372 seconds
TIMER [Load input] 7 minutes, 24.001434 seconds
TIMER [Build assembly graph / Distribute vertices] 10 minutes, 15.888245 seconds
TIMER [Build assembly graph / Distribute arcs] 104 minutes, 34.131836 seconds
TIMER [Build assembly graph] 104 minutes, 34.131836 seconds
TIMER [Run actor computation] 111 minutes, 59.416992 seconds
Checksum Beagle) sha1sum issue-525-beagle-256x24-7/coverage_distribution.txt-canonical 01a293db48518190038eaddbaed8a47ca0323fc7 issue-525-beagle-256x24-7/coverage_distribution.txt-canonical
GoodComments BadComments NeutralComments
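Runs 6 and 7 report the same SHA-1 digest for coverage_distribution.txt-canonical (01a293db48518190038eaddbaed8a47ca0323fc7), which suggests the Checksum field is there to confirm that code changes did not alter the result. A minimal sketch of automating that comparison follows. The function name same_checksum is hypothetical; it relies only on sha1sum, which the log already uses, and awk.

```sh
# same_checksum FILE...
# Succeeds (exit 0) when all given files share one SHA-1 digest.
# Hypothetical helper, not part of biosal.
same_checksum() {
    sha1sum "$@" | awk '
        { seen[$1] = 1 }                      # collect distinct digests
        END {
            n = 0
            for (d in seen) n++
            exit (n == 1 ? 0 : 1)             # exactly one digest means a match
        }'
}

# Usage on Beagle (paths taken from the reports above):
#   same_checksum issue-525-beagle-256x24-6/coverage_distribution.txt-canonical \
#                 issue-525-beagle-256x24-7/coverage_distribution.txt-canonical \
#       && echo "checksums match"
```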
JobName issue-525-beagle-256x24-8
Machine Beagle
AllocationStatus 241160 CI-CCR000040,CI-DEB000002
Path /lustre/beagle/CompBIO/biosal-THOR
Commit b488c7d172ef8ca5c6e275a6c5247778052e9cb0
Toolchain PrgEnv-cray/4.2.24
Script Beagle) cat issue-525-beagle-256x24-8.pbs
cd $PBS_O_WORKDIR
aprun -n 256 -N 1 -d 24 \
    spate -threads-per-node 24 -print-load \
    -k 43 Iowa_Continuous_Corn/*.fastq -o issue-525-beagle-256x24-8 > issue-525-beagle-256x24-8.stdout
Submission Beagle) qsub issue-525-beagle-256x24-8.pbs 2812910.sdb
MachineUtilization
Beagle) showq | grep sebh
2812910 sebhtml Running 6144 1:59:58 Fri Aug 15 14:38:22
Beagle) showq | grep "in use"
83 active jobs 10543 of 17520 processors in use by local jobs (60.18%)
ComputationLoad RunningTime
Checksum GoodComments BadComments NeutralComments
issue-525-beagle-256x24-9 -> also too long
Beagle) grep TIMER issue-525-beagle-256x24-9.stdout
TIMER [Load input / Count input data] 3 minutes, 43.321915 seconds
TIMER [Load input / Distribute input data] 4 minutes, 15.897690 seconds
TIMER [Load input] 7 minutes, 59.219604 seconds
window/1418120707 generated 22151168 kmers from 39
TIMER [Build assembly graph / Distribute vertices] 10 minutes, 4.586487 seconds
TIMER [Build assembly graph / Distribute arcs] 104 minutes, 7.067383 seconds
TIMER [Build assembly graph] 104 minutes, 7.067383 seconds
TIMER [Run actor computation] 112 minutes, 6.612305 seconds
Find the issue; it is in one of these components:
Rule out number 1:
Beagle) qsub issue-525-beagle-256x24-10.pbs 2813342.sdb
Yes, the issue is in 1.
Beagle) grep TIMER issue-525-beagle-256x24-10.stdout
TIMER [Load input / Count input data] 3 minutes, 26.230225 seconds
TIMER [Load input / Distribute input data] 3 minutes, 52.524109 seconds
TIMER [Load input] 7 minutes, 18.754333 seconds
TIMER [Build assembly graph / Distribute vertices] 10 minutes, 16.120605 seconds
TIMER [Build assembly graph / Distribute arcs] 21 minutes, 52.142334 seconds
TIMER [Build assembly graph] 21 minutes, 52.142334 seconds
TIMER [Run actor computation] 29 minutes, 12.050781 seconds
diff --git a/genomics/assembly/assembly_graph_store.c b/genomics/assembly/assembly_graph_store.c
index 832b645..bf60dfc 100644
--- a/genomics/assembly/assembly_graph_store.c
+++ b/genomics/assembly/assembly_graph_store.c
@@ -549,6 +549,13 @@ void bsal_assembly_graph_store_push_arc_block(struct bsal_actor *self, struct bs
struct bsal_vector *input_arcs;
char *sequence;
+ /*
+ * Don't do anything to rule out that this is the problem.
+ */
+ bsal_actor_send_reply_empty(self, BSAL_ASSEMBLY_PUSH_ARC_BLOCK_REPLY);
+
+ return;
+
concrete_self = (struct bsal_assembly_graph_store *)bsal_actor_concrete_actor(self);
ephemeral_memory = bsal_actor_get_ephemeral_memory(self);
Iterate over arcs, but don't add them:
Beagle) qsub issue-525-beagle-256x24-11.pbs 2813365.sdb
Beagle) grep TIMER issue-525-beagle-256x24-11.stdout
TIMER [Load input / Count input data] 4 minutes, 2.911057 seconds
TIMER [Load input / Distribute input data] 4 minutes, 11.640274 seconds
TIMER [Load input] 8 minutes, 14.551331 seconds
window/1940585475 generated 22151168 kmers from
TIMER [Build assembly graph / Distribute vertices] 10 minutes, 14.406494 seconds
TIMER [Build assembly graph / Distribute arcs] 37 minutes, 55.684570 seconds
TIMER [Build assembly graph] 37 minutes, 55.684570 seconds
TIMER [Run actor computation] 46 minutes, 11.080322 seconds
Redundancy check must be enabled:
Beagle) qsub issue-525-beagle-256x24-12.pbs 2813366.sdb
nop.
Hypothesis: the processing of arcs is more time consuming and therefore more actors are needed.
increase max. to 2.
Beagle) qsub issue-525-beagle-256x24-13.pbs 2813375.sdb
this was a bad idea
dynamic sequence store production size for arc blocks
Beagle) qsub issue-525-beagle-256x24-14.pbs 2813810.sdb
too slow
Beagle) qsub issue-525-beagle-256x24-15.pbs 2813812.sdb
with 2048 bytes for vertices, 512 bytes for arcs.
result: too slow
Beagle) qsub issue-525-beagle-256x24-16.pbs
Return without doing anything when receiving an arc block in the graph store: 28 minutes
Beagle) qsub issue-525-beagle-256x24-17.pbs 2814220.sdb
Break the loop in arc block reception to check whether the problem is with unpacking the arc block: 38 minutes
Beagle) qsub issue-525-beagle-256x24-18.pbs 2814610.sdb
improved codec utilization:
Beagle) qsub issue-525-beagle-256x24-24.pbs 2815562.sdb
result here
Beagle) grep TIME issue-525-beagle-256x24-24.stdout
TIMER [Load input / Count input data] 3 minutes, 23.112717 seconds
TIMER [Load input / Distribute input data] 3 minutes, 59.013504 seconds
TIMER [Load input] 7 minutes, 22.126221 seconds
TIMER [Build assembly graph / Distribute vertices] 7 minutes, 29.017761 seconds
TIMER [Build assembly graph / Distribute arcs] 36 minutes, 19.769043 seconds
TIMER [Build assembly graph] 43 minutes, 48.786621 seconds
TIMER [Run actor computation] 51 minutes, 11.234863 seconds
Beagle) qsub issue-525-beagle-256x24-25.pbs 2815563.sdb
return without doing anything in bsal_assembly_graph_store_add_arc
Beagle) grep TIMER issue-525-beagle-256x24-25.stdout
TIMER [Load input / Count input data] 3 minutes, 22.676834 seconds
TIMER [Load input / Distribute input data] 4 minutes, 1.729416 seconds
TIMER [Load input] 7 minutes, 24.406250 seconds
TIMER [Build assembly graph / Distribute vertices] 7 minutes, 25.194885 seconds
TIMER [Build assembly graph / Distribute arcs] 23 minutes, 9.005249 seconds
TIMER [Build assembly graph] 30 minutes, 34.200195 seconds
TIMER [Run actor computation] 37 minutes, 58.943359 seconds
Beagle) grep TIMER issue-525-beagle-256x24-26.stdout
TIMER [Load input / Count input data] 3 minutes, 20.935852 seconds
TIMER [Load input / Distribute input data] 3 minutes, 51.126938 seconds
TIMER [Load input] 7 minutes, 12.062805 seconds
TIMER [Build assembly graph / Distribute vertices] 7 minutes, 25.092133 seconds
TIMER [Build assembly graph / Distribute arcs] 31 minutes, 28.607666 seconds
TIMER [Build assembly graph] 38 minutes, 53.699707 seconds
TIMER [Run actor computation] 46 minutes, 7.054932 seconds
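The TIMER outputs above mix minutes and fractional seconds, which makes the runs awkward to compare side by side. The following is a minimal sketch that converts each TIMER line to seconds, assuming the "TIMER [phase] M minutes, S seconds" layout shown in these logs. The name timer_seconds is hypothetical and is not part of spate.

```sh
# timer_seconds: read spate TIMER lines on stdin and print
# "<seconds><TAB><phase>" for each. Hypothetical helper; text that precedes
# "TIMER" on the same line (interleaved output) is ignored.
timer_seconds() {
    sed -n 's/^.*TIMER \[\(.*\)] \([0-9]*\) minutes, \([0-9.]*\) seconds$/\2 \3 \1/p' |
    awk '{
        secs = $1 * 60 + $2          # minutes and seconds to total seconds
        $1 = ""; $2 = ""             # drop the numeric fields, keep the phase
        sub(/^ +/, "")
        printf "%.1f\t%s\n", secs, $0
    }'
}

# Example with the arc timing from run 10; prints 1312.1 and the phase name.
printf 'TIMER [Build assembly graph / Distribute arcs] 21 minutes, 52.142334 seconds\n' |
    timer_seconds
```

Applied to run 6, the same phase (105 minutes, 35.994141 seconds) comes out at about 6336.0 seconds, so skipping the arc block handling in run 10 cut that phase by roughly a factor of 4.8.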
JobName spate-strong-scaling-256x24-1
Machine beagle
AllocationStatus 238012 CI-CCR000040,CI-DEB000002
Path /lustre/beagle/CompBIO/biosal-THOR
Commit Beagle) git log|head -n1 commit ca4cc975821f687f24d4514e07dfaa6212aae78b
Toolchain PrgEnv-cray/4.2.24
Script Beagle) cat spate-strong-scaling-256x24-1.pbs
cd $PBS_O_WORKDIR
aprun -n 256 -N 1 -d 24 \
    spate -threads-per-node 24 -print-load \
    -k 43 Iowa_Continuous_Corn/*.fastq -o spate-strong-scaling-256x24-1 > spate-strong-scaling-256x24-1.stdout
Submission Beagle) qsub spate-strong-scaling-256x24-1.pbs 2815570.sdb
MachineUtilization
Beagle) showq | grep 5570
2815570 sebhtml Running 6144 00:59:36 Sat Aug 16 17:38:43
Beagle) showq | grep "in use"
31 active jobs 13687 of 17496 processors in use by local jobs (78.23%)
ComputationLoad
Beagle) grep COMPUTATION spate-strong-scaling-256x24-1.stdout | grep LOAD | grep -v " s " | tail
[thorium] node/245 COMPUTATION LOAD 0.32
[thorium] node/205 COMPUTATION LOAD 0.25
[thorium] node/91 COMPUTATION LOAD 0.29
[thorium] node/248 COMPUTATION LOAD 0.33
[thorium] node/204 COMPUTATION LOAD 0.25
[thorium] node/90 COMPUTATION LOAD 0.28
[thorium] node/246 COMPUTATION LOAD 0.32
[thorium] node/94 COMPUTATION LOAD 0.29
[thorium] node/78 COMPUTATION LOAD 0.29
[thorium] node/0 COMPUTATION LOAD 0.29
RunningTime
Beagle) grep TIMER spate-strong-scaling-256x24-1.stdout
TIMER [Load input / Count input data] 9 minutes, 18.389282 seconds
TIMER [Load input / Distribute input data] 4 minutes, 16.262482 seconds
TIMER [Load input] 13 minutes, 34.651733 seconds
TIMER [Build assembly graph / Distribute vertices] 7 minutes, 30.769745 seconds
TIMER [Build assembly graph / Distribute arcs] 29 minutes, 53.003174 seconds
TIMER [Build assembly graph] 37 minutes, 23.772949 seconds
TIMER [Run actor computation] 50 minutes, 58.820557 seconds
MemoryUtilization
Beagle) grep ByteCount spate-strong-scaling-256x24-1.stdout | tail
[thorium] node/104 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20819595264
[thorium] node/96 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20695945216
[thorium] node/7 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20808585216
[thorium] node/191 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20567068672
[thorium] node/154 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20590317568
[thorium] node/155 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20695060480
[thorium] node/156 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20576014336
[thorium] node/145 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20597690368
[thorium] node/150 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20564381696
[thorium] node/151 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20681871360
Checksum Beagle) sha1sum spate-strong-scaling-256x24-1/coverage_distribution.txt-canonical 01a293db48518190038eaddbaed8a47ca0323fc7 spate-strong-scaling-256x24-1/coverage_distribution.txt-canonical
GoodComments BadComments
Distribute arcs goes from 19 min to 29 min when the node count goes from 199 to 256.
NeutralComments
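The HeapByteCount values in the MemoryUtilization output of this run are raw byte counts (for example 20819595264 on node/104). As a rough sketch for reading them, the helper below reports the largest value seen in GiB. The name max_heap_gib is hypothetical, and it assumes that each "HeapByteCount:" label is followed by its value as the next whitespace-separated field, as in the lines above.

```sh
# max_heap_gib: read thorium METRICS lines on stdin and print the largest
# HeapByteCount in GiB. Hypothetical helper; also works when several
# METRICS entries share one line.
max_heap_gib() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "HeapByteCount:" && $(i + 1) + 0 > max)
                max = $(i + 1) + 0
    }
    END { printf "%.2f\n", max / (1024 * 1024 * 1024) }'
}

# Example with the node/104 line from this run; prints 19.39.
echo "[thorium] node/104 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20819595264" |
    max_heap_gib
```

On Beagle the input would come from grep ByteCount on the job's stdout file, as in the MemoryUtilization field.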
JobName spate-strong-scaling-256x24-2
Machine login5.beagle.ci.uchicago.edu
AllocationStatus 237435 CI-CCR000040,CI-DEB000002
Path /lustre/beagle/CompBIO/biosal-THOR
Commit Beagle) git log|head -n1 commit 2d23c1d80e8ef24ac783e9aa93f2c3d2b711fc09
Toolchain PrgEnv-cray/4.2.24
Script Beagle) cat spate-strong-scaling-256x24-2.pbs
cd $PBS_O_WORKDIR
aprun -n 256 -N 1 -d 24 \
    spate -threads-per-node 24 -print-load \
    -k 43 Iowa_Continuous_Corn/*.fastq -o spate-strong-scaling-256x24-2 > spate-strong-scaling-256x24-2.stdout
Submission Beagle) qsub spate-strong-scaling-256x24-2.pbs 2816154.sdb
MachineUtilization
Beagle) showq | grep sebh
2816154 sebhtml Running 6144 00:59:46 Sat Aug 16 19:51:11
Beagle) showq | grep "in use"
26 active jobs 8983 of 17496 processors in use by local jobs (51.34%)
ComputationLoad
Beagle) grep COMPUTATION spate-strong-scaling-256x24-2.stdout | grep LOAD | grep -v " s " | tail
[thorium] node/242 COMPUTATION LOAD 0.50
[thorium] node/245 COMPUTATION LOAD 0.52
[thorium] node/243 COMPUTATION LOAD 0.50
[thorium] node/244 COMPUTATION LOAD 0.51
[thorium] node/184 COMPUTATION LOAD 0.53
[thorium] node/183 COMPUTATION LOAD 0.53
[thorium] node/182 COMPUTATION LOAD 0.53
[thorium] node/185 COMPUTATION LOAD 0.54
[thorium] node/15 COMPUTATION LOAD 0.52
[thorium] node/16 COMPUTATION LOAD 0.52
RunningTime
Beagle) grep TIMER spate-strong-scaling-256x24-2.stdout
TIMER [Load input / Count input data] 3 minutes, 26.491074 seconds
TIMER [Load input / Distribute input data] 4 minutes, 5.586334 seconds
TIMER [Load input] 7 minutes, 32.077423 seconds
TIMER [Build assembly graph / Distribute vertices] 7 minutes, 26.598663 seconds
TIMER [Build assembly graph / Distribute arcs] 18 minutes, 18.034180 seconds
TIMER [Build assembly graph] 25 minutes, 44.632812 seconds
TIMER [Run actor computation] 33 minutes, 17.844482 seconds
MemoryUtilization
Beagle) grep ByteCount spate-strong-scaling-256x24-2.stdout | tail
[thorium] node/225 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20666863616
[thorium] node/168 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20714115072
[thorium] node/27 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20873531392
[thorium] node/233 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20650512384
[thorium] node/29 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20665520128
[thorium] node/239 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20568936448
[thorium] node/30 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20840583168
[thorium] node/36 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20818513920
[thorium] node/38 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20781617152
[thorium] node/37 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20781453312
Checksum Beagle) sha1sum spate-strong-scaling-256x24-2/coverage_distribution.txt-canonical 01a293db48518190038eaddbaed8a47ca0323fc7 spate-strong-scaling-256x24-2/coverage_distribution.txt-canonical
GoodComments This is awesome.
BadComments NeutralComments
JobName issue-525-beagle-199x24-1
Machine beagle @ CI
AllocationStatus 243350 CI-CCR000040,CI-DEB000002
( node hours ) https://wiki.ci.uchicago.edu/Beagle/BeagleFAQ
Path /lustre/beagle/CompBIO/biosal-THOR
Commit Beagle) git log | head -n1 commit 3c9ec278e622d7f2a2600c5653b3e64121fa53ce
Toolchain PrgEnv-cray/4.2.24
Script Beagle) cat issue-525-beagle-199x24-1.pbs
#!/bin/bash
#PBS -N issue-525-beagle-199x24-1
#PBS -A CI-DEB000002
#PBS -l walltime=1:00:00
#PBS -l mppwidth=4776
cd $PBS_O_WORKDIR
# 199 * 24 = 4776
aprun -n 199 -N 1 -d 24 \
    spate -threads-per-node 24 -print-load \
    -k 43 Iowa_Continuous_Corn/*.fastq -o issue-525-beagle-199x24-1 > issue-525-beagle-199x24-1.stdout
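The script above requests mppwidth as nodes times threads per node (199 * 24 = 4776). The same rule gives 256 * 24 = 6144 for the 256-node runs, which matches the processor count that showq reports for them. A minimal sketch of generating such a script for another job size follows. The function name make_pbs is hypothetical, and the account, walltime, and spate options are simply copied from the script above rather than recommended values.

```sh
# make_pbs NAME NODES THREADS
# Print a PBS script like the one above, with mppwidth = NODES * THREADS.
# Hypothetical helper; account and walltime are copied from the log.
make_pbs() {
    name=$1 nodes=$2 threads=$3
    cat <<EOF
#!/bin/bash
#PBS -N $name
#PBS -A CI-DEB000002
#PBS -l walltime=1:00:00
#PBS -l mppwidth=$((nodes * threads))
cd \$PBS_O_WORKDIR
aprun -n $nodes -N 1 -d $threads \\
    spate -threads-per-node $threads -print-load \\
    -k 43 Iowa_Continuous_Corn/*.fastq -o $name > $name.stdout
EOF
}

# Example: regenerate the 199x24 job script on stdout.
make_pbs issue-525-beagle-199x24-1 199 24
```

Redirecting the output to a .pbs file and passing it to qsub would follow the same submission step recorded in each report.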
Submission Beagle) qsub issue-525-beagle-199x24-1.pbs 2800460.sdb
MachineUtilization
Beagle) showq | grep sebht
2800460 sebhtml Running 4776 00:59:34 Thu Aug 7 17:04:08
Beagle) showq | grep "in use"
222 active jobs 13928 of 17520 processors in use by local jobs (79.50%)
ComputationLoad
RunningTime
Beagle) head -n1 issue-525-beagle-199x24-1.e2800460
=>> PBS: job killed: walltime 3625 exceeded limit 3600
Beagle) grep TIMER issue-525-beagle-199x24-1.stdout
TIMER [Counting entries] 13 minutes, 35.216858 seconds
TIMER [Distributing entries] 5 minutes, 13.419922 seconds
TIMER [Counting entries and distributing entries] 18 minutes, 48.636719 seconds
Checksum
Beagle) sha1sum issue-525-beagle-199x24-1/coverage_distribution.txt-canonical 01a293db48518190038eaddbaed8a47ca0323fc7 issue-525-beagle-199x24-1/coverage_distribution.txt-canonical
GoodComments BadComments NeutralComments