marbl / canu

A single molecule sequence assembler for genomes large and small.
http://canu.readthedocs.io/

canu stops after 1-overlapper step #380

Closed. amit4mchiba closed this issue 7 years ago.

amit4mchiba commented 7 years ago

Hi,

I do not know why canu stops after the 1-overlapper step without reporting any error or message that would help me understand what happened. Please help. This is the command I used:

#!/bin/sh
#$ -S /bin/sh
#$ -l month -l phi
#$ -N canu5-phi-low03
#$ -pe def_slot 5-100
#$ -cwd

/home/amit-rai8chiba/canu-master/canu/Linux-amd64/bin/canu -p G_13thFeb-phi-low03 -d ./G_13thFeb-phi-low03 \
genomeSize=400m \
-pacbio-raw /home/amit-rai8chiba/G_pcdata/data/G_all_cat.fastq.gz \
useGrid=0 \
corMhapSensitivity=low \
rawErrorRate=0.3 \
corMinCoverage=0 \
corOutCoverage=200 \
correctedErrorRate=0.05 \
minReadLength=1000 \
minOverlapLength=300 \
utgGraphDeviation=50 \
minThreads=22 \

Here is the canu log file:

-- Detected Java(TM) Runtime Environment '1.8.0_45' (from 'java').
-- Detected gnuplot version '4.4 patchlevel 4' (from 'gnuplot') and image format 'png'.
--
-- Generating assembly 'G_13thFeb-phi-low03' in '/lustre3/home/amit-rai8chiba/G_pcdata/G-assembly_13thFeb-phi-low03'
--
-- Detected 20 CPUs and 63 gigabytes of memory.
-- Detected Sun Grid Engine in '/home/geadmin2/UGER/uger'.
-- Grid engine disabled per useGrid=false option.
--
-- Run   2 jobs concurrently using   31 GB and  10 CPUs for stage 'meryl'.
-- Run   2 jobs concurrently using   13 GB and  10 CPUs for stage 'mhap (cor)'.
-- Run   4 jobs concurrently using    8 GB and   5 CPUs for stage 'overlapper (obt)'.
-- Run   4 jobs concurrently using    8 GB and   5 CPUs for stage 'overlapper (utg)'.
-- Run   5 jobs concurrently using   12 GB and   4 CPUs for stage 'falcon_sense'.
-- Run  20 jobs concurrently using    3 GB and   1 CPU  for stage 'ovStore bucketizer'.
-- Run  20 jobs concurrently using   16 GB and   1 CPU  for stage 'ovStore sorting'.
-- Run   4 jobs concurrently using    6 GB and   5 CPUs for stage 'read error detection'.
-- Run  20 jobs concurrently using    2 GB and   1 CPU  for stage 'overlap error adjustment'.
-- Run   3 jobs concurrently using   21 GB and   6 CPUs for stage 'bogart'.
-- Run   3 jobs concurrently using   21 GB and   6 CPUs for stage 'consensus'.
--
-- Parameters:
--
--  genomeSize        400000000
--
--  Overlap Generation Limits:
--    corOvlErrorRate 0.3000 ( 30.00%)
--    obtOvlErrorRate 0.0500 (  5.00%)
--    utgOvlErrorRate 0.0500 (  5.00%)
--
--  Overlap Processing Limits:
--    corErrorRate    0.3000 ( 30.00%)
--    obtErrorRate    0.0500 (  5.00%)
--    utgErrorRate    0.0500 (  5.00%)
--    cnsErrorRate    0.0500 (  5.00%)
--
-- This is canu parallel iteration #1, out of a maximum of 2 attempts.
--
--
-- BEGIN CORRECTION
--
----------------------------------------
-- Starting command on Wed Feb 22 11:07:09 2017 with 993120.284 GB free disk space

    cd correction
    /lustre3/home/amit-rai8chiba/canu-master/canu/Linux-amd64/bin/gatekeeperCreate \
      -minlength 1000 \
      -o ./G_13thFeb-phi-low03.gkpStore.BUILDING \
      ./G_13thFeb-phi-low03.gkpStore.gkp \
    > ./G_13thFeb-phi-low03.gkpStore.BUILDING.err 2>&1

-- Finished on Wed Feb 22 11:12:03 2017 (294 seconds) with 993125.251 GB free disk space
----------------------------------------
--
-- In gatekeeper store 'correction/G_13thFeb-phi-low03.gkpStore':
--   Found 2062248 reads.
--   Found 15683778047 bases (39.2 times coverage).
--
--   Read length histogram (one '*' equals 4100.88 reads):
--        0    999      0 
--     1000   1999 287062 **********************************************************************
--     2000   2999 207405 **************************************************
--     3000   3999 164426 ****************************************
--     4000   4999 147288 ***********************************
--     5000   5999 139420 *********************************
--     6000   6999 131337 ********************************
--     7000   7999 123189 ******************************
--     8000   8999 115443 ****************************
--     9000   9999 110403 **************************
--    10000  10999 109642 **************************
--    11000  11999 107641 **************************
--    12000  12999  97229 ***********************
--    13000  13999  78660 *******************
--    14000  14999  60026 **************
--    15000  15999  45153 ***********
--    16000  16999  33803 ********
--    17000  17999  25163 ******
--    18000  18999  19058 ****
--    19000  19999  13988 ***
--    20000  20999  10740 **
--    21000  21999   8257 **
--    22000  22999   6381 *
--    23000  23999   4897 *
--    24000  24999   3735 
--    25000  25999   2863 
--    26000  26999   2232 
--    27000  27999   1654 
--    28000  28999   1237 
--    29000  29999    983 
--    30000  30999    709 
--    31000  31999    589 
--    32000  32999    465 
--    33000  33999    344 
--    34000  34999    257 
--    35000  35999    176 
--    36000  36999    111 
--    37000  37999     95 
--    38000  38999     63 
--    39000  39999     48 
--    40000  40999     20 
--    41000  41999     28 
--    42000  42999      9 
--    43000  43999      7 
--    44000  44999      5 
--    45000  45999      3 
--    46000  46999      1 
--    47000  47999      1 
--    48000  48999      0 
--    49000  49999      1 
--    50000  50999      0 
--    51000  51999      1 
-- Meryl attempt 1 begins.
----------------------------------------
-- Starting concurrent execution on Wed Feb 22 11:12:32 2017 with 993124.284 GB free disk space (1 processes; 2 concurrently)

    cd correction/0-mercounts
    ./meryl.sh 1 > ./meryl.000001.out 2>&1

-- Finished on Wed Feb 22 11:33:48 2017 (1276 seconds) with 993059.653 GB free disk space
----------------------------------------
-- Meryl finished successfully.
----------------------------------------
-- Starting command on Wed Feb 22 11:33:48 2017 with 993059.653 GB free disk space

    cd correction/0-mercounts
    /lustre3/home/amit-rai8chiba/canu-master/canu/Linux-amd64/bin/meryl \
      -Dh \
      -s ./G_13thFeb-phi-low03.ms16 \
    > ./G_13thFeb-phi-low03.ms16.histogram \
    2> ./G_13thFeb-phi-low03.ms16.histogram.info

-- Finished on Wed Feb 22 11:33:48 2017 (lickety-split) with 993059.653 GB free disk space
----------------------------------------
-- For mhap overlapping, set repeat k-mer threshold to 156528.
--
-- Found 15652844327 16-mers; 1879107127 distinct and 366090235 unique.  Largest count 4714635.
--
-- OVERLAPPER (mhap) (correction)
--
--
-- PARAMETERS: hashes=256, minMatches=3, threshold=0.8
--
-- Given 13 GB, can fit 39000 reads per block.
-- For 54 blocks, set stride to 13 blocks.
-- Logging partitioning to 'correction/1-overlapper/partitioning.log'.
-- Configured 53 mhap precompute jobs.
-- Configured 131 mhap overlap jobs.
-- mhap precompute attempt 1 begins with 0 finished, and 53 to compute.
----------------------------------------
-- Starting concurrent execution on Wed Feb 22 11:35:00 2017 with 993087.142 GB free disk space (53 processes; 2 concurrently)

    cd correction/1-overlapper
    ./precompute.sh 1 > ./precompute.000001.out 2>&1
    ./precompute.sh 2 > ./precompute.000002.out 2>&1
...
    ./precompute.sh 52 > ./precompute.000052.out 2>&1
    ./precompute.sh 53 > ./precompute.000053.out 2>&1

-- Finished on Wed Feb 22 15:09:08 2017 (12848 seconds) with 992901.693 GB free disk space
----------------------------------------
-- All 53 mhap precompute jobs finished successfully.
-- mhap attempt 1 begins with 0 finished, and 131 to compute.
----------------------------------------
-- Starting concurrent execution on Wed Feb 22 15:09:09 2017 with 992901.693 GB free disk space (131 processes; 2 concurrently)

    cd correction/1-overlapper
    ./mhap.sh 1 > ./mhap.000001.out 2>&1
    ./mhap.sh 2 > ./mhap.000002.out 2>&1
...
    ./mhap.sh 130 > ./mhap.000130.out 2>&1
    ./mhap.sh 131 > ./mhap.000131.out 2>&1

-- Finished on Wed Feb 22 18:26:02 2017 (11813 seconds) with 992510.78 GB free disk space
----------------------------------------
-- Found 131 mhap overlap output files.
----------------------------------------
-- Starting command on Wed Feb 22 18:26:03 2017 with 992510.78 GB free disk space

    cd correction
    ./G_13thFeb-phi-low03.ovlStore.BUILDING/scripts/0-config.sh \
    > ./G_13thFeb-phi-low03.ovlStore.BUILDING/config.err 2>&1

-- Finished on Wed Feb 22 18:26:06 2017 (3 seconds) with 992510.028 GB free disk space
----------------------------------------
-- overlap store bucketizer attempt 1 begins with 0 finished, and 131 to compute.
----------------------------------------
-- Starting concurrent execution on Wed Feb 22 18:26:06 2017 with 992510.028 GB free disk space (131 processes; 20 concurrently)

    cd correction/G_13thFeb-phi-low03.ovlStore.BUILDING
    ./scripts/1-bucketize.sh 1 > ./scripts/1-bucketize.000001.out 2>&1
    ./scripts/1-bucketize.sh 2 > ./scripts/1-bucketize.000002.out 2>&1
...
    ./scripts/1-bucketize.sh 130 > ./scripts/1-bucketize.000130.out 2>&1
    ./scripts/1-bucketize.sh 131 > ./scripts/1-bucketize.000131.out 2>&1

-- Finished on Wed Feb 22 18:27:06 2017 (60 seconds) with 992438.907 GB free disk space
----------------------------------------
-- Overlap store bucketizer finished.
-- overlap store sorter attempt 1 begins with 0 finished, and 26 to compute.
----------------------------------------
-- Starting concurrent execution on Wed Feb 22 18:27:06 2017 with 992438.907 GB free disk space (26 processes; 20 concurrently)

    cd correction/G_13thFeb-phi-low03.ovlStore.BUILDING
    ./scripts/2-sort.sh 1 > ./scripts/2-sort.000001.out 2>&1
    ./scripts/2-sort.sh 2 > ./scripts/2-sort.000002.out 2>&1
    ./scripts/2-sort.sh 3 > ./scripts/2-sort.000003.out 2>&1
    ./scripts/2-sort.sh 4 > ./scripts/2-sort.000004.out 2>&1
    ./scripts/2-sort.sh 5 > ./scripts/2-sort.000005.out 2>&1
    ./scripts/2-sort.sh 6 > ./scripts/2-sort.000006.out 2>&1
    ./scripts/2-sort.sh 7 > ./scripts/2-sort.000007.out 2>&1
    ./scripts/2-sort.sh 8 > ./scripts/2-sort.000008.out 2>&1
    ./scripts/2-sort.sh 9 > ./scripts/2-sort.000009.out 2>&1
    ./scripts/2-sort.sh 10 > ./scripts/2-sort.000010.out 2>&1
    ./scripts/2-sort.sh 11 > ./scripts/2-sort.000011.out 2>&1
    ./scripts/2-sort.sh 12 > ./scripts/2-sort.000012.out 2>&1
    ./scripts/2-sort.sh 13 > ./scripts/2-sort.000013.out 2>&1
    ./scripts/2-sort.sh 14 > ./scripts/2-sort.000014.out 2>&1
    ./scripts/2-sort.sh 15 > ./scripts/2-sort.000015.out 2>&1
    ./scripts/2-sort.sh 16 > ./scripts/2-sort.000016.out 2>&1
    ./scripts/2-sort.sh 17 > ./scripts/2-sort.000017.out 2>&1
    ./scripts/2-sort.sh 18 > ./scripts/2-sort.000018.out 2>&1
    ./scripts/2-sort.sh 19 > ./scripts/2-sort.000019.out 2>&1
    ./scripts/2-sort.sh 20 > ./scripts/2-sort.000020.out 2>&1
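
The run stops here with no further output. One way to see what happened to the sort jobs is to check the per-job logs they wrote and the kernel log on the node that ran them. A minimal sketch, using the paths from the log above (the dmesg pattern is an assumption about typical OOM-killer messages, not something taken from this run):

    cd correction/G_13thFeb-phi-low03.ovlStore.BUILDING
    # Check the tail of each sorter's log for errors or abrupt truncation.
    tail -n 5 scripts/2-sort.*.out
    # On the node that ran the jobs, check whether the kernel OOM killer fired.
    dmesg | grep -i -E "out of memory|killed process"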

brianwalenz commented 7 years ago

That's it? It looks like it might have been killed by your grid. Notice it's running 20 of those 'sort' jobs at the same time, and that's all that is shown. Each job will be using around 16 GB of memory, so 320 GB in total, but it's on a node with only 63 GB. Try setting ovsConcurrency=3 (ovs meaning 'overlap store sorting').
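
Concretely: at around 16 GB per sort job on a 63 GB node, at most floor(63 / 16) = 3 sorters fit in memory at once. A minimal restart sketch, which is just the original command with ovsConcurrency=3 appended; canu resumes a partial run when pointed at the same -d directory:

    /home/amit-rai8chiba/canu-master/canu/Linux-amd64/bin/canu -p G_13thFeb-phi-low03 -d ./G_13thFeb-phi-low03 \
      genomeSize=400m \
      -pacbio-raw /home/amit-rai8chiba/G_pcdata/data/G_all_cat.fastq.gz \
      useGrid=0 \
      corMhapSensitivity=low \
      rawErrorRate=0.3 \
      corMinCoverage=0 \
      corOutCoverage=200 \
      correctedErrorRate=0.05 \
      minReadLength=1000 \
      minOverlapLength=300 \
      utgGraphDeviation=50 \
      minThreads=22 \
      ovsConcurrency=3   # cap overlap store sorting at 3 concurrent ~16 GB jobs

Other stages have analogous Concurrency options for local (useGrid=0) runs, so the same adjustment applies wherever a stage's jobs-times-memory product exceeds what the node has.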