GeneAssembly / biosal

biosal is a distributed BIOlogical Sequence Actor Library. THIS IS A MIRROR.
BSD 2-Clause "Simplified" License

build graph on Beagle 256x24 to test #525

Closed sebhtml closed 10 years ago

sebhtml commented 10 years ago

JobName issue-525-beagle-199x24-1

Machine beagle @ CI

AllocationStatus 243350 CI-CCR000040,CI-DEB000002

( node hours ) https://wiki.ci.uchicago.edu/Beagle/BeagleFAQ

Path /lustre/beagle/CompBIO/biosal-THOR

Commit
Beagle) git log | head -n1
commit 3c9ec278e622d7f2a2600c5653b3e64121fa53ce

Toolchain PrgEnv-cray/4.2.24

Script
Beagle) cat issue-525-beagle-199x24-1.pbs
#!/bin/bash
#PBS -N issue-525-beagle-199x24-1
#PBS -A CI-DEB000002
#PBS -l walltime=1:00:00
#PBS -l mppwidth=4776

cd $PBS_O_WORKDIR

# 199 * 24 = 4776
aprun -n 199 -N 1 -d 24 \
    spate -threads-per-node 24 -print-load \
    -k 43 Iowa_Continuous_Corn/*.fastq -o issue-525-beagle-199x24-1 > issue-525-beagle-199x24-1.stdout
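The mppwidth value requested in the script is the node count times the cores per node (199 * 24 = 4776). A minimal sketch of that arithmetic as a shell helper follows; the function name `mppwidth` is hypothetical and is not part of PBS, aprun, or biosal.

```shell
# Hypothetical helper: total core count to request when running one
# process per node with a fixed number of threads per node.
#   $1 = number of nodes, $2 = cores (threads) per node
mppwidth() {
    echo $(( $1 * $2 ))
}

mppwidth 199 24   # geometry of the 199x24 jobs
mppwidth 256 24   # geometry of the 256x24 jobs
```

The two calls print 4776 and 6144, which match the `mppwidth` values used by the 199x24 and 256x24 scripts in this thread.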

Submission
Beagle) qsub issue-525-beagle-199x24-1.pbs
2800460.sdb

MachineUtilization
Beagle) showq | grep sebht
2800460 sebhtml Running 4776 00:59:34 Thu Aug 7 17:04:08
Beagle) showq | grep "in use"
222 active jobs 13928 of 17520 processors in use by local jobs (79.50%)

ComputationLoad

RunningTime
Beagle) head -n1 issue-525-beagle-199x24-1.e2800460
=>> PBS: job killed: walltime 3625 exceeded limit 3600

Beagle) grep TIMER issue-525-beagle-199x24-1.stdout
TIMER [Counting entries] 13 minutes, 35.216858 seconds
TIMER [Distributing entries] 5 minutes, 13.419922 seconds
TIMER [Counting entries and distributing entries] 18 minutes, 48.636719 seconds
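To compare phases across the runs in this thread, it can help to convert the TIMER entries to seconds. This is a minimal sketch, assuming one entry per line in the form `TIMER [label] M minutes, S seconds` as printed by the grep above; `timer_seconds` is a hypothetical helper and not part of spate.

```shell
# Hypothetical helper: read `grep TIMER` output on stdin and print each
# phase label followed by a tab and its duration rounded to whole seconds.
timer_seconds() {
    awk '/^TIMER \[/ {
        # Extract the text between the square brackets.
        label = $0
        sub(/^TIMER \[/, "", label)
        sub(/\].*$/, "", label)

        # Keep what follows the closing bracket: "M minutes, S seconds".
        rest = $0
        sub(/^.*\] /, "", rest)
        split(rest, f, " ")   # f[1] = minutes, f[3] = seconds

        printf "%s\t%.0f\n", label, f[1] * 60 + f[3]
    }'
}

# Demo on one of the lines quoted in this thread.
printf 'TIMER [Counting entries] 13 minutes, 35.216858 seconds\n' | timer_seconds
```

In practice this would be used as `grep TIMER issue-525-beagle-199x24-1.stdout | timer_seconds`. Lines that do not start with `TIMER [` are skipped, so the interleaved `window/... generated ... kmers` fragments seen in some later outputs would be dropped rather than parsed.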

Checksum

Beagle) sha1sum issue-525-beagle-199x24-1/coverage_distribution.txt-canonical
01a293db48518190038eaddbaed8a47ca0323fc7 issue-525-beagle-199x24-1/coverage_distribution.txt-canonical

GoodComments BadComments NeutralComments

sebhtml commented 10 years ago

JobName issue-525-beagle-199x24-2

Machine Beagle

AllocationStatus
Beagle) show_alloc
Note: Allocation numbers below updated every 15 minutes.

243150 CI-CCR000040,CI-DEB000002

Path
Beagle) pwd
/lustre/beagle/CompBIO/biosal-THOR/

Commit
Beagle) git log | head -n1
commit d6d1f090c2b6c3af34abd0f4c2438d31f866f085

Toolchain PrgEnv-cray/4.2.24

Script
Beagle) cat issue-525-beagle-199x24-2.pbs
#!/bin/bash
#PBS -N issue-525-beagle-199x24-2
#PBS -A CI-DEB000002
#PBS -l walltime=1:00:00
#PBS -l mppwidth=4776

cd $PBS_O_WORKDIR

# 199 * 24 = 4776
aprun -n 199 -N 1 -d 24 \
    spate -threads-per-node 24 -print-load \
    -k 43 Iowa_Continuous_Corn/*.fastq -o issue-525-beagle-199x24-2 > issue-525-beagle-199x24-2.stdout

Submission
Beagle) qsub issue-525-beagle-199x24-2.pbs
2811564.sdb

MachineUtilization
Beagle) showq | grep sebh
2811564 sebhtml Running 4776 00:59:36 Wed Aug 13 20:14:41

ComputationLoad
Beagle) showq | grep "in use"
142 active jobs 10927 of 17520 processors in use by local jobs (62.37%)

RunningTime
=>> PBS: job killed: walltime 3623 exceeded limit 3600
Beagle) grep TIMER issue-525-beagle-199x24-2.stdout
TIMER [Load input / Count input data] 3 minutes, 23.703995 seconds
TIMER [Load input / Distribute input data] 4 minutes, 9.897171 seconds
TIMER [Load input] 7 minutes, 33.601166 seconds
TIMER [Build assembly graph / Distribute vertices] 27 minutes, 18.723267 seconds

Checksum
Beagle) sha1sum issue-525-beagle-199x24-2/coverage_distribution.txt-canonical
01a293db48518190038eaddbaed8a47ca0323fc7 issue-525-beagle-199x24-2/coverage_distribution.txt-canonical

GoodComments

BadComments
The load is far too low during the arc phase:

[thorium] node/163 METRICS AliveActorCount: 93 ActiveRequestCount: 9208 HeapByteCount: 25995137024
[thorium] node/50 EPOCH LOAD 3615 s 3.44/23 (0.15) 0.08 0.08 0.07 0.07 0.07 0.08 0.07 0.07 0.07 0.08 0.07 0.07 0.08 0.07 0.08 0.06 0.91 0.98 0.06 0.08 0.07 0.09 0.09
[thorium] node/50 EPOCH WAKE_UP_COUNT 3615 s 35 25 33 19 40 20 31 15 26 37 33 17 23 30 10 30 18 4 23 11 29 20 17

The load is also poor during the vertex phase:

[thorium] node/57 EPOCH LOAD 900 s 2.97/23 (0.13) 0.10 0.16 0.13 0.13 0.10 0.13 0.13 0.12 0.13 0.13 0.13 0.13 0.15 0.13 0.13 0.13 0.12 0.14 0.13 0.12 0.13 0.14 0.12
[thorium] node/57 EPOCH WAKE_UP_COUNT 900 s 46 47 45 44 19 46 45 49 46 47 47 47 42 43 53 45 45 46 48 34 43 43 45
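In the EPOCH LOAD lines, the 23 values after the parenthesized average appear to be per-worker loads (for node/50 they sum to roughly the reported 3.44). Under that assumption, a small filter can show how many workers are actually busy. This is only a sketch: `busy_workers` is a hypothetical helper, and the field layout is inferred from the lines quoted here rather than taken from Thorium documentation.

```shell
# Hypothetical helper: for each "[thorium] node/N EPOCH LOAD ..." line on
# stdin, print the node and how many workers are at or above a threshold.
# Assumed layout: fields 1-8 are the prefix, epoch time, sum/count, and
# (average); fields 9..NF are per-worker loads.
#   $1 = load threshold, for example 0.5
busy_workers() {
    awk -v threshold="$1" '$3 == "EPOCH" && $4 == "LOAD" {
        busy = 0
        for (i = 9; i <= NF; i++)
            if ($i + 0 >= threshold + 0) busy++
        printf "%s %d/%d\n", $2, busy, NF - 8
    }'
}
```

A possible use is `grep "EPOCH LOAD" issue-525-beagle-199x24-2.stdout | busy_workers 0.5`. For the node/50 line above, only the two workers at 0.91 and 0.98 pass a 0.5 threshold, which is consistent with most workers sitting idle during the arc phase.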

NeutralComments

sebhtml commented 10 years ago

JobName issue-525-beagle-256x24-3

Machine beagle

AllocationStatus
Beagle) show_alloc
Note: Allocation numbers below updated every 15 minutes.

242949 CI-CCR000040,CI-DEB000002

Path
Beagle) pwd
/lustre/beagle/CompBIO/biosal-THOR

Commit
Beagle) git log | head -n1
commit d6d1f090c2b6c3af34abd0f4c2438d31f866f085

Toolchain PrgEnv-cray/4.2.24

Script
Beagle) cat issue-525-beagle-256x24-3.pbs
#!/bin/bash
#PBS -N issue-525-beagle-256x24-3
#PBS -A CI-DEB000002
#PBS -l walltime=2:00:00
#PBS -l mppwidth=6144

cd $PBS_O_WORKDIR

aprun -n 256 -N 1 -d 24 \
    spate -threads-per-node 24 -print-load \
    -k 43 Iowa_Continuous_Corn/*.fastq -o issue-525-beagle-256x24-3 > issue-525-beagle-256x24-3.stdout

Submission
Beagle) qsub issue-525-beagle-256x24-3.pbs
2811624.sdb

MachineUtilization
Beagle) showq | grep sebh
2811624 sebhtml Running 6144 1:58:08 Thu Aug 14 09:29:20
Beagle) showq | grep "in use"
116 active jobs 11215 of 17520 processors in use by local jobs (64.01%)

ComputationLoad RunningTime

Beagle) grep left issue-525-beagle-256x24-3.stdout | grep -v 0.00|tail
sequence store 210921887 has 26677/393216 (0.07) entries left to produce
sequence store 577974178 has 6798/393216 (0.02) entries left to produce
sequence store 1870131609 has 26677/393216 (0.07) entries left to produce
sequence store 1845699736 has 6798/393216 (0.02) entries left to produce
sequence store 1947000219 has 6798/393216 (0.02) entries left to produce
sequence store 1021795355 has 6708/393216 (0.02) entries left to produce
sequence store 1407589028 has 6798/393216 (0.02) entries left to produce
sequence store 1579136159 has 6798/393216 (0.02) entries left to produce
sequence store 210921887 has 6798/393216 (0.02) entries left to produce
sequence store 1870131609 has 6798/393216 (0.02) entries left to produce

Checksum GoodComments BadComments NeutralComments

sebhtml commented 10 years ago

This one ran in 20 minutes: #486 on beagle 256x24

sebhtml commented 10 years ago

Must do #556.

sebhtml commented 10 years ago

JobName issue-525-beagle-256x24-4

Machine Beagle

AllocationStatus 242433 CI-CCR000040,CI-DEB000002

Path /lustre/beagle/CompBIO/biosal-THOR/biosal

Commit commit 12a5fb3cf0446870a3e7e3056fe9e2adb4607e03

Toolchain PrgEnv-cray/4.2.24

Script
Beagle) cat issue-525-beagle-256x24-4.pbs
#!/bin/bash
#PBS -N issue-525-beagle-256x24-4
#PBS -A CI-DEB000002
#PBS -l walltime=2:00:00
#PBS -l mppwidth=6144

cd $PBS_O_WORKDIR

aprun -n 256 -N 1 -d 24 \
    spate -threads-per-node 24 -print-load \
    -k 43 Iowa_Continuous_Corn/*.fastq -o issue-525-beagle-256x24-4 > issue-525-beagle-256x24-4.stdout

Submission
Beagle) qsub issue-525-beagle-256x24-4.pbs
2811827.sdb

MachineUtilization
Beagle) showq | grep sebh
2811827 sebhtml Running 6144 1:59:53 Thu Aug 14 17:28:34
Beagle) showq | grep "in use"
121 active jobs 11527 of 17520 processors in use by local jobs (65.79%)

ComputationLoad RunningTime Checksum GoodComments

BadComments
Ran out of memory, see ticket #557

NeutralComments

sebhtml commented 10 years ago

JobName issue-525-beagle-256x24-5

Machine Beagle at CI

AllocationStatus 242335 CI-CCR000040,CI-DEB000002

Path
Beagle) pwd
/lustre/beagle/CompBIO/biosal-THOR

Commit
Beagle) git log | head -n1
commit 417b0c21f0670250750a9fd441c534b9caf22ebf

Toolchain PrgEnv-cray/4.2.24

Script
Beagle) cat issue-525-beagle-256x24-5.pbs
#!/bin/bash
#PBS -N issue-525-beagle-256x24-5
#PBS -A CI-DEB000002
#PBS -l walltime=2:00:00
#PBS -l mppwidth=6144

cd $PBS_O_WORKDIR

aprun -n 256 -N 1 -d 24 \
    spate -threads-per-node 24 -print-load \
    -k 43 Iowa_Continuous_Corn/*.fastq -o issue-525-beagle-256x24-5 > issue-525-beagle-256x24-5.stdout

Submission
Beagle) qsub issue-525-beagle-256x24-5.pbs
2812117.sdb

MachineUtilization
Beagle) showq | grep sebh
2812117 sebhtml Running 6144 1:59:54 Thu Aug 14 23:57:03
Beagle) showq | grep "in use"
33 active jobs 9463 of 17520 processors in use by local jobs (54.01%)

ComputationLoad RunningTime Checksum GoodComments BadComments NeutralComments

sebhtml commented 10 years ago

There is a leak in the code that builds arcs.

Possibilities:

sebhtml commented 10 years ago

Without the call to bsal_assembly_graph_store_add_arc: issue-525-beagle-256x24-6.pbs

Result: ... found the issue, see https://github.com/GeneAssembly/biosal/issues/557

sebhtml commented 10 years ago

JobName issue-525-beagle-256x24-6

Machine BEAGLE

AllocationStatus 242122 CI-CCR000040,CI-DEB000002

Path /lustre/beagle/CompBIO/biosal-THOR

Commit
Beagle) git log | head -n1
commit b07a8a4bd1981c07a9c985c8f7624b710d2a8255

Toolchain PrgEnv-cray/4.2.24

Script
Beagle) cat issue-525-beagle-256x24-6.pbs
#!/bin/bash
#PBS -N issue-525-beagle-256x24-6
#PBS -A CI-DEB000002
#PBS -l walltime=2:00:00
#PBS -l mppwidth=6144

cd $PBS_O_WORKDIR

aprun -n 256 -N 1 -d 24 \
    spate -threads-per-node 24 -print-load \
    -k 43 Iowa_Continuous_Corn/*.fastq -o issue-525-beagle-256x24-6 > issue-525-beagle-256x24-6.stdout

Submission
Beagle) qsub issue-525-beagle-256x24-6.pbs
2812713.sdb

MachineUtilization
Beagle) showq | grep sebh
2812713 sebhtml Running 6144 1:59:17 Fri Aug 15 10:04:41
Beagle) showq | grep "in use"
38 active jobs 9463 of 17472 processors in use by local jobs (54.16%)

ComputationLoad

RunningTime
Beagle) grep TIMER issue-525-beagle-256x24-6.stdout
TIMER [Load input / Count input data] 3 minutes, 22.840439 seconds
TIMER [Load input / Distribute input data] 4 minutes, 0.798309 seconds
TIMER [Load input] 7 minutes, 23.638733 seconds
TIMER [Build assembly graph / Distribute vertices] 11 minutes, 57.111938 seconds
TIMER [Build assembly graph / Distribute arcs] 105 minutes, 35.994141 seconds
TIMER [Build assembly graph] 105 minutes, 35.994141 seconds
TIMER [Run actor computation] 113 minutes, 0.601074 seconds

Checksum
Beagle) sha1sum issue-525-beagle-256x24-6/coverage_distribution.txt-canonical
01a293db48518190038eaddbaed8a47ca0323fc7 issue-525-beagle-256x24-6/coverage_distribution.txt-canonical

GoodComments

fixed memory regulation

BadComments

too slow (~ 1 hour for 3 billion reads)

NeutralComments

sebhtml commented 10 years ago

JobName issue-525-beagle-256x24-7

Machine beagle

AllocationStatus 242122 CI-CCR000040,CI-DEB000002

Path
Beagle) pwd
/lustre/beagle/CompBIO/biosal-THOR

Commit
Beagle) git log | head -n1
commit b4e024151edd28d03c84abafd2dffb03341f6ff9

Toolchain

Script
Beagle) cat issue-525-beagle-256x24-7.pbs
#!/bin/bash
#PBS -N issue-525-beagle-256x24-7
#PBS -A CI-DEB000002
#PBS -l walltime=2:00:00
#PBS -l mppwidth=6144

cd $PBS_O_WORKDIR

aprun -n 256 -N 1 -d 24 \
    spate -threads-per-node 24 -print-load \
    -k 43 Iowa_Continuous_Corn/*.fastq -o issue-525-beagle-256x24-7 > issue-525-beagle-256x24-7.stdout

Submission
Beagle) qsub issue-525-beagle-256x24-7.pbs
2812769.sdb

MachineUtilization
Beagle) showq | grep sebh
2812713 sebhtml Running 6144 1:01:02 Fri Aug 15 10:04:41
2812769 sebhtml Running 6144 1:59:52 Fri Aug 15 11:03:31
Beagle) showq | grep "in use"
32 active jobs 15439 of 17472 processors in use by local jobs (88.36%)

ComputationLoad

RunningTime
Beagle) grep TIMER issue-525-beagle-256x24-7.stdout
TIMER [Load input / Count input data] 3 minutes, 20.511063 seconds
TIMER [Load input / Distribute input data] 4 minutes, 3.490372 seconds
TIMER [Load input] 7 minutes, 24.001434 seconds
TIMER [Build assembly graph / Distribute vertices] 10 minutes, 15.888245 seconds
TIMER [Build assembly graph / Distribute arcs] 104 minutes, 34.131836 seconds
TIMER [Build assembly graph] 104 minutes, 34.131836 seconds
TIMER [Run actor computation] 111 minutes, 59.416992 seconds

Checksum
Beagle) sha1sum issue-525-beagle-256x24-7/coverage_distribution.txt-canonical
01a293db48518190038eaddbaed8a47ca0323fc7 issue-525-beagle-256x24-7/coverage_distribution.txt-canonical

GoodComments BadComments NeutralComments

sebhtml commented 10 years ago

JobName issue-525-beagle-256x24-8

Machine Beagle

AllocationStatus 241160 CI-CCR000040,CI-DEB000002

Path /lustre/beagle/CompBIO/biosal-THOR

Commit b488c7d172ef8ca5c6e275a6c5247778052e9cb0

Toolchain PrgEnv-cray/4.2.24

Script
Beagle) cat issue-525-beagle-256x24-8.pbs
#!/bin/bash
#PBS -N issue-525-beagle-256x24-8
#PBS -A CI-DEB000002
#PBS -l walltime=2:00:00
#PBS -l mppwidth=6144

cd $PBS_O_WORKDIR

aprun -n 256 -N 1 -d 24 \
    spate -threads-per-node 24 -print-load \
    -k 43 Iowa_Continuous_Corn/*.fastq -o issue-525-beagle-256x24-8 > issue-525-beagle-256x24-8.stdout

Submission
Beagle) qsub issue-525-beagle-256x24-8.pbs
2812910.sdb

MachineUtilization
Beagle) showq | grep sebh
2812910 sebhtml Running 6144 1:59:58 Fri Aug 15 14:38:22
Beagle) showq | grep "in use"
83 active jobs 10543 of 17520 processors in use by local jobs (60.18%)

ComputationLoad RunningTime

Checksum GoodComments BadComments NeutralComments

sebhtml commented 10 years ago

issue-525-beagle-256x24-9 -> also too long

Beagle) grep TIMER issue-525-beagle-256x24-9.stdout
TIMER [Load input / Count input data] 3 minutes, 43.321915 seconds
TIMER [Load input / Distribute input data] 4 minutes, 15.897690 seconds
TIMER [Load input] 7 minutes, 59.219604 seconds
window/1418120707 generated 22151168 kmers from 39TIMER [Build assembly graph / Distribute vertices] 10 minutes, 4.586487 seconds
TIMER [Build assembly graph / Distribute arcs] 104 minutes, 7.067383 seconds
TIMER [Build assembly graph] 104 minutes, 7.067383 seconds
TIMER [Run actor computation] 112 minutes, 6.612305 seconds

sebhtml commented 10 years ago

Find the issue. It is in one of these components:

  1. handler that receives active message for arc block in graph store
  2. the arc classifier block classification code
  3. the arc kernel

Rule out number 1:

Beagle) qsub issue-525-beagle-256x24-10.pbs
2813342.sdb

Yes, the issue is in 1.

Beagle) grep TIMER issue-525-beagle-256x24-10.stdout
TIMER [Load input / Count input data] 3 minutes, 26.230225 seconds
TIMER [Load input / Distribute input data] 3 minutes, 52.524109 seconds
TIMER [Load input] 7 minutes, 18.754333 seconds
TIMER [Build assembly graph / Distribute vertices] 10 minutes, 16.120605 seconds
TIMER [Build assembly graph / Distribute arcs] 21 minutes, 52.142334 seconds
TIMER [Build assembly graph] 21 minutes, 52.142334 seconds
TIMER [Run actor computation] 29 minutes, 12.050781 seconds

diff --git a/genomics/assembly/assembly_graph_store.c b/genomics/assembly/assembly_graph_store.c
index 832b645..bf60dfc 100644
--- a/genomics/assembly/assembly_graph_store.c
+++ b/genomics/assembly/assembly_graph_store.c
@@ -549,6 +549,13 @@ void bsal_assembly_graph_store_push_arc_block(struct bsal_actor *self, struct bs
     struct bsal_vector *input_arcs;
     char *sequence;

+    /*
+     * Don't do anything to rule out that this is the problem.
+     */
+    bsal_actor_send_reply_empty(self, BSAL_ASSEMBLY_PUSH_ARC_BLOCK_REPLY);
+
+    return;
+
     concrete_self = (struct bsal_assembly_graph_store *)bsal_actor_concrete_actor(self);
     ephemeral_memory = bsal_actor_get_ephemeral_memory(self);
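As a rough reading of this experiment, the arc phase time with the handler short-circuited can be subtracted from the arc phase time of a normal run. The sketch below uses whole seconds truncated from the TIMER lines for jobs 9 and 10 in this thread; the variable names are illustrative only, and the two jobs were built from different commits, so the difference is only an approximation.

```shell
# "Build assembly graph / Distribute arcs" durations, in whole seconds,
# truncated from the TIMER lines quoted in this thread.
full_handler=$(( 104 * 60 + 7 ))    # issue-525-beagle-256x24-9 (normal handler)
noop_handler=$(( 21 * 60 + 52 ))    # issue-525-beagle-256x24-10 (handler replies and returns)

# Time that disappears when the handler does no work.
handler_cost=$(( full_handler - noop_handler ))
echo "handler cost: ${handler_cost} s (~$(( handler_cost / 60 )) min)"
```

Under these assumptions, roughly 82 of the 104 minutes go away when the arc block handler does nothing, which is consistent with the conclusion that the problem is in component 1.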
sebhtml commented 10 years ago

Iterate over arcs, but don't add them:

Beagle) qsub issue-525-beagle-256x24-11.pbs
2813365.sdb

Beagle) grep TIMER issue-525-beagle-256x24-11.stdout
TIMER [Load input / Count input data] 4 minutes, 2.911057 seconds
TIMER [Load input / Distribute input data] 4 minutes, 11.640274 seconds
TIMER [Load input] 8 minutes, 14.551331 seconds
window/1940585475 generated 22151168 kmers from TIMER [Build assembly graph / Distribute vertices] 10 minutes, 14.406494 seconds
TIMER [Build assembly graph / Distribute arcs] 37 minutes, 55.684570 seconds
TIMER [Build assembly graph] 37 minutes, 55.684570 seconds
TIMER [Run actor computation] 46 minutes, 11.080322 seconds

sebhtml commented 10 years ago

Redundancy check must be enabled:

Beagle) qsub issue-525-beagle-256x24-12.pbs
2813366.sdb

nop.

Hypothesis: the processing of arcs is more time consuming and therefore more actors are needed.

sebhtml commented 10 years ago

increase max. to 2.

Beagle) qsub issue-525-beagle-256x24-13.pbs
2813375.sdb

this was a bad idea

sebhtml commented 10 years ago

dynamic sequence store production size for arc blocks

Beagle) qsub issue-525-beagle-256x24-14.pbs
2813810.sdb

too slow

sebhtml commented 10 years ago

Beagle) qsub issue-525-beagle-256x24-15.pbs
2813812.sdb

with 2048 bytes for vertices, 512 bytes for arcs.

result: too slow

sebhtml commented 10 years ago

Beagle) qsub issue-525-beagle-256x24-16.pbs

Return without doing anything when receiving an arc block in the graph store.

Result: 28 minutes

sebhtml commented 10 years ago

Beagle) qsub issue-525-beagle-256x24-17.pbs
2814220.sdb

Break out of the loop in arc block reception to check whether the problem is with unpacking the arc block.

Result: 38 minutes

sebhtml commented 10 years ago

Beagle) qsub issue-525-beagle-256x24-18.pbs
2814610.sdb

sebhtml commented 10 years ago

improved codec utilization:

Beagle) qsub issue-525-beagle-256x24-24.pbs
2815562.sdb

result here

Beagle) grep TIME issue-525-beagle-256x24-24.stdout
TIMER [Load input / Count input data] 3 minutes, 23.112717 seconds
TIMER [Load input / Distribute input data] 3 minutes, 59.013504 seconds
TIMER [Load input] 7 minutes, 22.126221 seconds
TIMER [Build assembly graph / Distribute vertices] 7 minutes, 29.017761 seconds
TIMER [Build assembly graph / Distribute arcs] 36 minutes, 19.769043 seconds
TIMER [Build assembly graph] 43 minutes, 48.786621 seconds
TIMER [Run actor computation] 51 minutes, 11.234863 seconds

sebhtml commented 10 years ago

Beagle) qsub issue-525-beagle-256x24-25.pbs
2815563.sdb

return without doing anything in bsal_assembly_graph_store_add_arc

Beagle) grep TIMER issue-525-beagle-256x24-25.stdout
TIMER [Load input / Count input data] 3 minutes, 22.676834 seconds
TIMER [Load input / Distribute input data] 4 minutes, 1.729416 seconds
TIMER [Load input] 7 minutes, 24.406250 seconds
TIMER [Build assembly graph / Distribute vertices] 7 minutes, 25.194885 seconds
TIMER [Build assembly graph / Distribute arcs] 23 minutes, 9.005249 seconds
TIMER [Build assembly graph] 30 minutes, 34.200195 seconds
TIMER [Run actor computation] 37 minutes, 58.943359 seconds

Beagle) grep TIMER issue-525-beagle-256x24-26.stdout
TIMER [Load input / Count input data] 3 minutes, 20.935852 seconds
TIMER [Load input / Distribute input data] 3 minutes, 51.126938 seconds
TIMER [Load input] 7 minutes, 12.062805 seconds
TIMER [Build assembly graph / Distribute vertices] 7 minutes, 25.092133 seconds
TIMER [Build assembly graph / Distribute arcs] 31 minutes, 28.607666 seconds
TIMER [Build assembly graph] 38 minutes, 53.699707 seconds
TIMER [Run actor computation] 46 minutes, 7.054932 seconds

sebhtml commented 10 years ago

JobName spate-strong-scaling-256x24-1

Machine beagle

AllocationStatus 238012 CI-CCR000040,CI-DEB000002

Path /lustre/beagle/CompBIO/biosal-THOR

Commit
Beagle) git log|head -n1
commit ca4cc975821f687f24d4514e07dfaa6212aae78b

Toolchain PrgEnv-cray/4.2.24

Script
Beagle) cat spate-strong-scaling-256x24-1.pbs
#!/bin/bash
#PBS -N spate-strong-scaling-256x24-1
#PBS -A CI-DEB000002
#PBS -l walltime=1:00:00
#PBS -l mppwidth=6144

cd $PBS_O_WORKDIR

aprun -n 256 -N 1 -d 24 \
    spate -threads-per-node 24 -print-load \
    -k 43 Iowa_Continuous_Corn/*.fastq -o spate-strong-scaling-256x24-1 > spate-strong-scaling-256x24-1.stdout

Submission
Beagle) qsub spate-strong-scaling-256x24-1.pbs
2815570.sdb

MachineUtilization
Beagle) showq | grep 5570
2815570 sebhtml Running 6144 00:59:36 Sat Aug 16 17:38:43
Beagle) showq | grep "in use"
31 active jobs 13687 of 17496 processors in use by local jobs (78.23%)

ComputationLoad
Beagle) grep COMPUTATION spate-strong-scaling-256x24-1.stdout |grep LOAD | grep -v " s " | tail
[thorium] node/245 COMPUTATION LOAD 0.32
[thorium] node/205 COMPUTATION LOAD 0.25
[thorium] node/91 COMPUTATION LOAD 0.29
[thorium] node/248 COMPUTATION LOAD 0.33
[thorium] node/204 COMPUTATION LOAD 0.25
[thorium] node/90 COMPUTATION LOAD 0.28
[thorium] node/246 COMPUTATION LOAD 0.32
[thorium] node/94 COMPUTATION LOAD 0.29
[thorium] node/78 COMPUTATION LOAD 0.29
[thorium] node/0 COMPUTATION LOAD 0.29

RunningTime
Beagle) grep TIMER spate-strong-scaling-256x24-1.stdout
TIMER [Load input / Count input data] 9 minutes, 18.389282 seconds
TIMER [Load input / Distribute input data] 4 minutes, 16.262482 seconds
TIMER [Load input] 13 minutes, 34.651733 seconds
TIMER [Build assembly graph / Distribute vertices] 7 minutes, 30.769745 seconds
TIMER [Build assembly graph / Distribute arcs] 29 minutes, 53.003174 seconds
TIMER [Build assembly graph] 37 minutes, 23.772949 seconds
TIMER [Run actor computation] 50 minutes, 58.820557 seconds

MemoryUtilization
Beagle) grep ByteCount spate-strong-scaling-256x24-1.stdout|tail
[thorium] node/104 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20819595264
[thorium] node/96 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20695945216
[thorium] node/7 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20808585216
[thorium] node/191 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20567068672
[thorium] node/154 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20590317568
[thorium] node/155 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20695060480
[thorium] node/156 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20576014336
[thorium] node/145 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20597690368
[thorium] node/150 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20564381696
[thorium] node/151 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20681871360

Checksum
Beagle) sha1sum spate-strong-scaling-256x24-1/coverage_distribution.txt-canonical
01a293db48518190038eaddbaed8a47ca0323fc7 spate-strong-scaling-256x24-1/coverage_distribution.txt-canonical

GoodComments

BadComments

Distribute arcs goes from 19 min to 29 min when the node count goes from 199 to 256.
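For context, perfect strong scaling would make the arc phase shorter, not longer, as nodes are added. A minimal sketch of the ideal figure, assuming linear speedup (t2 = t1 * n1 / n2); `ideal_time` is a hypothetical helper and the 19 minute baseline is the figure quoted in this comment.

```shell
# Hypothetical helper: ideal duration after scaling from n1 to n2 nodes,
# assuming perfectly linear speedup.
#   $1 = measured time at n1 nodes, $2 = n1, $3 = n2
ideal_time() {
    awk -v t1="$1" -v n1="$2" -v n2="$3" 'BEGIN { printf "%.1f\n", t1 * n1 / n2 }'
}

ideal_time 19 199 256
```

With these inputs the ideal arc phase at 256 nodes would be about 14.8 minutes, so the observed 29 minutes is roughly twice the ideal.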

NeutralComments

sebhtml commented 10 years ago

JobName spate-strong-scaling-256x24-2

Machine login5.beagle.ci.uchicago.edu

AllocationStatus 237435 CI-CCR000040,CI-DEB000002

Path /lustre/beagle/CompBIO/biosal-THOR

Commit
Beagle) git log|head -n1
commit 2d23c1d80e8ef24ac783e9aa93f2c3d2b711fc09

Toolchain PrgEnv-cray/4.2.24

Script
Beagle) cat spate-strong-scaling-256x24-2.pbs
#!/bin/bash
#PBS -N spate-strong-scaling-256x24-2
#PBS -A CI-DEB000002
#PBS -l walltime=1:00:00
#PBS -l mppwidth=6144

cd $PBS_O_WORKDIR

aprun -n 256 -N 1 -d 24 \
    spate -threads-per-node 24 -print-load \
    -k 43 Iowa_Continuous_Corn/*.fastq -o spate-strong-scaling-256x24-2 > spate-strong-scaling-256x24-2.stdout

Submission
Beagle) qsub spate-strong-scaling-256x24-2.pbs
2816154.sdb

MachineUtilization
Beagle) showq | grep sebh
2816154 sebhtml Running 6144 00:59:46 Sat Aug 16 19:51:11
Beagle) showq | grep "in use"
26 active jobs 8983 of 17496 processors in use by local jobs (51.34%)

ComputationLoad
Beagle) grep COMPUTATION spate-strong-scaling-256x24-2.stdout | grep LOAD | grep -v " s " | tail
[thorium] node/242 COMPUTATION LOAD 0.50
[thorium] node/245 COMPUTATION LOAD 0.52
[thorium] node/243 COMPUTATION LOAD 0.50
[thorium] node/244 COMPUTATION LOAD 0.51
[thorium] node/184 COMPUTATION LOAD 0.53
[thorium] node/183 COMPUTATION LOAD 0.53
[thorium] node/182 COMPUTATION LOAD 0.53
[thorium] node/185 COMPUTATION LOAD 0.54
[thorium] node/15 COMPUTATION LOAD 0.52
[thorium] node/16 COMPUTATION LOAD 0.52

RunningTime
Beagle) grep TIMER spate-strong-scaling-256x24-2.stdout
TIMER [Load input / Count input data] 3 minutes, 26.491074 seconds
TIMER [Load input / Distribute input data] 4 minutes, 5.586334 seconds
TIMER [Load input] 7 minutes, 32.077423 seconds
TIMER [Build assembly graph / Distribute vertices] 7 minutes, 26.598663 seconds
TIMER [Build assembly graph / Distribute arcs] 18 minutes, 18.034180 seconds
TIMER [Build assembly graph] 25 minutes, 44.632812 seconds
TIMER [Run actor computation] 33 minutes, 17.844482 seconds

MemoryUtilization
Beagle) grep ByteCount spate-strong-scaling-256x24-2.stdout | tail
[thorium] node/225 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20666863616
[thorium] node/168 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20714115072
[thorium] node/27 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20873531392
[thorium] node/233 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20650512384
[thorium] node/29 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20665520128
[thorium] node/239 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20568936448
[thorium] node/30 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20840583168
[thorium] node/36 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20818513920
[thorium] node/38 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20781617152
[thorium] node/37 METRICS AliveActorCount: 0 ActiveRequestCount: 0 HeapByteCount: 20781453312

Checksum
Beagle) sha1sum spate-strong-scaling-256x24-2/coverage_distribution.txt-canonical
01a293db48518190038eaddbaed8a47ca0323fc7 spate-strong-scaling-256x24-2/coverage_distribution.txt-canonical

GoodComments This is awesome.

BadComments NeutralComments