marbl / canu

A single molecule sequence assembler for genomes large and small.
http://canu.readthedocs.io/

Canu stops while running configured consensus jobs #386

maolingfengZJU closed this issue 7 years ago

maolingfengZJU commented 7 years ago

Canu ran fine until the third stage, "Assemble". My command is

    canu -p all_pacbio -d all_pacbio genomeSize=1.17g -pacbio-raw all_pacbio.fastq >> 1.log

and my log file is

-- Detected Java(TM) Runtime Environment '1.8.0_112' (from '/workcenters/workcenter1/softwares/jdk1.8.0_112/bin/java').
-- Detected 64 CPUs and 997 gigabytes of memory.
-- No grid engine detected, grid disabled.
--
-- Allowed to run   4 jobs concurrently, and use up to  16 compute threads and  249 GB memory for stage 'bogart (unitigger)'.
-- Allowed to run   4 jobs concurrently, and use up to  16 compute threads and   32 GB memory for stage 'mhap (overlapper)'.
-- Allowed to run   4 jobs concurrently, and use up to  16 compute threads and   32 GB memory for stage 'mhap (overlapper)'.
-- Allowed to run   4 jobs concurrently, and use up to  16 compute threads and   32 GB memory for stage 'mhap (overlapper)'.
-- Allowed to run   8 jobs concurrently, and use up to   8 compute threads and    8 GB memory for stage 'read error detection (overlap error adjustment)'.
-- Allowed to run  64 jobs concurrently, and use up to   1 compute thread  and    2 GB memory for stage 'overlap error adjustment'.
-- Allowed to run   8 jobs concurrently, and use up to   8 compute threads and  124 GB memory for stage 'utgcns (consensus'.
-- Allowed to run  64 jobs concurrently, and use up to   1 compute thread  and    4 GB memory for stage 'overlap store parallel bucketizer'.
-- Allowed to run  64 jobs concurrently, and use up to   1 compute thread  and   32 GB memory for stage 'overlap store parallel sorting'.
-- Allowed to run  64 jobs concurrently, and use up to   1 compute thread  and    8 GB memory for stage 'overlapper'.
-- Allowed to run   8 jobs concurrently, and use up to   8 compute threads and   12 GB memory for stage 'overlapper'.
-- Allowed to run   8 jobs concurrently, and use up to   8 compute threads and   12 GB memory for stage 'overlapper'.
-- Allowed to run   2 jobs concurrently, and use up to  32 compute threads and  256 GB memory for stage 'meryl (k-mer counting)'.
-- Allowed to run  16 jobs concurrently, and use up to   4 compute threads and   32 GB memory for stage 'falcon_sense (read correction)'.
-- Allowed to run   4 jobs concurrently, and use up to  16 compute threads and   32 GB memory for stage 'minimap (overlapper)'.
-- Allowed to run   4 jobs concurrently, and use up to  16 compute threads and   32 GB memory for stage 'minimap (overlapper)'.
-- Allowed to run   4 jobs concurrently, and use up to  16 compute threads and   32 GB memory for stage 'minimap (overlapper)'.
--
-- This is canu parallel iteration #1, out of a maximum of 2 attempts.
--
-- Final error rates before starting pipeline:
--
--   genomeSize          -- 1170000000
--   errorRate           -- 0.025
--
--   corOvlErrorRate     -- 0.075
--   obtOvlErrorRate     -- 0.075
--   utgOvlErrorRate     -- 0.075
--
--   obtErrorRate        -- 0.075
--
--   cnsErrorRate        -- 0.075
--
--
-- BEGIN CORRECTION
--
--
-- Corrected reads saved in '/home2/mlf/Brassica/Canu/all_pacbio/all_pacbio.correctedReads.fasta'.
--
--
-- BEGIN TRIMMING
--
--
-- Trimmed reads saved in '/home2/mlf/Brassica/Canu/all_pacbio/all_pacbio.trimmedReads.fasta.gz'
--
--
-- BEGIN ASSEMBLY
--
-- Configured 253 consensus jobs.
-- Consensus attempt 1 begins with 16 finished, and 237 to compute.
----------------------------------------
-- Starting concurrent execution on Fri Feb 24 13:39:45 2017 with 6900.4 GB free disk space (237 processes; 8 concurrently)

    /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/consensus.sh 14 > /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/consensus.000014.out 2>&1
    /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/consensus.sh 16 > /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/consensus.000016.out 2>&1
    /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/consensus.sh 18 > /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/consensus.000018.out 2>&1
    /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/consensus.sh 19 > /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/consensus.000019.out 2>&1
    /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/consensus.sh 21 > /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/consensus.000021.out 2>&1
    /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/consensus.sh 22 > /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/consensus.000022.out 2>&1
    /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/consensus.sh 23 > /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/consensus.000023.out 2>&1
    /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/consensus.sh 24 > /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/consensus.000024.out 2>&1
brianwalenz commented 7 years ago

What version of canu is this?

Look in the various consensus.*.out files for errors.

The log you show isn't indicating any error, just that 8 jobs are running. Errors would result in all 237 jobs being listed, then a second attempt by canu to run the jobs, then canu failing and noting that some number of the jobs failed to complete.
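As a rough way to check the consensus.*.out files in bulk, the sketch below lists every log whose last few lines lack a success message. It assumes that a finished job ends its log with a line containing "Consensus finished successfully" (the wording reported further down in this thread, not taken from the Canu documentation). The function name `unfinished_logs` and the `tail_lines` parameter are made up for illustration.

```python
from pathlib import Path

# Assumption: a finished job writes this text near the end of its log.
# The wording comes from this thread, not from Canu's documentation.
SUCCESS_MARKER = "Consensus finished successfully"


def unfinished_logs(consensus_dir, tail_lines=20):
    """Return names of consensus.*.out files whose tail lacks the success marker."""
    missing = []
    for log in sorted(Path(consensus_dir).glob("consensus.*.out")):
        lines = log.read_text(errors="replace").splitlines()
        # Only inspect the last few lines, where the final status is expected.
        if not any(SUCCESS_MARKER in line for line in lines[-tail_lines:]):
            missing.append(log.name)
    return missing
```

Calling `unfinished_logs("/home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus")` would then return the logs of jobs that are still running, were interrupted, or failed. A missing marker only says the job has not reported success, so the listed files still need to be read by hand.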

maolingfengZJU commented 7 years ago

Thank you for replying so quickly. The Canu version is 1.3. I found no errors or warnings in the various consensus.*.out files; once the corresponding *.cns file has been generated, each log ends with "Consensus finished successfully. Bye."

Best wishes, Lingfeng MAO

skoren commented 7 years ago

Are the consensus jobs still running on your system (for example, do they show up in top)? Does Canu keep printing consensus.sh commands? It should run 8 at a time until they all finish.

Nothing in your logs indicates an error. Perhaps some issue on the system caused some of the jobs to fail, and Canu is retrying them.

maolingfengZJU commented 7 years ago

They do not show up in top. According to ps aux|grep "utgcns", all the utgcns jobs are in the sleep state, and none of them is running normally.

root     18851  0.0  0.0 715716 217316 pts/1   Sl   Feb24   0:13 /workcenters/workcenter1/softwares/genome_assembly/canu/canu-1.3/Linux-amd64/bin/utgcns -G /home2/mlf/Brassica/Canu/all_pacbio/unitigging/all_pacbio.gkpStore -T /home2/mlf/Brassica/Canu/all_pacbio/unitigging/all_pacbio.tigStore 1 0014 -O /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/0014.cns.WORKING -maxcoverage 40 -e 0.075 -pbdagcon -threads 8
root     18852  0.0  0.0 695816 190048 pts/1   Sl   Feb24   0:01 /workcenters/workcenter1/softwares/genome_assembly/canu/canu-1.3/Linux-amd64/bin/utgcns -G /home2/mlf/Brassica/Canu/all_pacbio/unitigging/all_pacbio.gkpStore -T /home2/mlf/Brassica/Canu/all_pacbio/unitigging/all_pacbio.tigStore 1 0011 -O /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/0011.cns.WORKING -maxcoverage 40 -e 0.075 -pbdagcon -threads 8
root     18855  0.0  0.0 722448 211584 pts/1   Sl   Feb24   0:19 /workcenters/workcenter1/softwares/genome_assembly/canu/canu-1.3/Linux-amd64/bin/utgcns -G /home2/mlf/Brassica/Canu/all_pacbio/unitigging/all_pacbio.gkpStore -T /home2/mlf/Brassica/Canu/all_pacbio/unitigging/all_pacbio.tigStore 1 0018 -O /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/0018.cns.WORKING -maxcoverage 40 -e 0.075 -pbdagcon -threads 8
root     18856  0.0  0.0 695580 196792 pts/1   Sl   Feb24   0:02 /workcenters/workcenter1/softwares/genome_assembly/canu/canu-1.3/Linux-amd64/bin/utgcns -G /home2/mlf/Brassica/Canu/all_pacbio/unitigging/all_pacbio.gkpStore -T /home2/mlf/Brassica/Canu/all_pacbio/unitigging/all_pacbio.tigStore 1 0016 -O /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/0016.cns.WORKING -maxcoverage 40 -e 0.075 -pbdagcon -threads 8
root     18857  0.2  0.0 718824 216092 pts/1   Sl   Feb24   1:43 /workcenters/workcenter1/softwares/genome_assembly/canu/canu-1.3/Linux-amd64/bin/utgcns -G /home2/mlf/Brassica/Canu/all_pacbio/unitigging/all_pacbio.gkpStore -T /home2/mlf/Brassica/Canu/all_pacbio/unitigging/all_pacbio.tigStore 1 0019 -O /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/0019.cns.WORKING -maxcoverage 40 -e 0.075 -pbdagcon -threads 8
root     18858  0.2  0.0 750952 258772 pts/1   Sl   Feb24   1:31 /workcenters/workcenter1/softwares/genome_assembly/canu/canu-1.3/Linux-amd64/bin/utgcns -G /home2/mlf/Brassica/Canu/all_pacbio/unitigging/all_pacbio.gkpStore -T /home2/mlf/Brassica/Canu/all_pacbio/unitigging/all_pacbio.tigStore 1 0022 -O /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/0022.cns.WORKING -maxcoverage 40 -e 0.075 -pbdagcon -threads 8
root     18859  0.2  0.0 717832 212520 pts/1   Sl   Feb24   1:43 /workcenters/workcenter1/softwares/genome_assembly/canu/canu-1.3/Linux-amd64/bin/utgcns -G /home2/mlf/Brassica/Canu/all_pacbio/unitigging/all_pacbio.gkpStore -T /home2/mlf/Brassica/Canu/all_pacbio/unitigging/all_pacbio.tigStore 1 0021 -O /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/0021.cns.WORKING -maxcoverage 40 -e 0.075 -pbdagcon -threads 8
root     18860  0.1  0.0 720520 210860 pts/1   Sl   Feb24   0:50 /workcenters/workcenter1/softwares/genome_assembly/canu/canu-1.3/Linux-amd64/bin/utgcns -G /home2/mlf/Brassica/Canu/all_pacbio/unitigging/all_pacbio.gkpStore -T /home2/mlf/Brassica/Canu/all_pacbio/unitigging/all_pacbio.tigStore 1 0023 -O /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/0023.cns.WORKING -maxcoverage 40 -e 0.075 -pbdagcon -threads 8
root     20524  0.1  0.0 724132 221432 pts/1   Sl   Feb24   0:46 /workcenters/workcenter1/softwares/genome_assembly/canu/canu-1.3/Linux-amd64/bin/utgcns -G /home2/mlf/Brassica/Canu/all_pacbio/unitigging/all_pacbio.gkpStore -T /home2/mlf/Brassica/Canu/all_pacbio/unitigging/all_pacbio.tigStore 1 0021 -O /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/0021.cns.WORKING -maxcoverage 40 -e 0.075 -pbdagcon -threads 8
root     20528  0.2  0.0 721220 214148 pts/1   Sl   Feb24   1:31 /workcenters/workcenter1/softwares/genome_assembly/canu/canu-1.3/Linux-amd64/bin/utgcns -G /home2/mlf/Brassica/Canu/all_pacbio/unitigging/all_pacbio.gkpStore -T /home2/mlf/Brassica/Canu/all_pacbio/unitigging/all_pacbio.tigStore 1 0022 -O /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/0022.cns.WORKING -maxcoverage 40 -e 0.075 -pbdagcon -threads 8
root     20533  0.1  0.0 712844 209484 pts/1   Sl   Feb24   1:00 /workcenters/workcenter1/softwares/genome_assembly/canu/canu-1.3/Linux-amd64/bin/utgcns -G /home2/mlf/Brassica/Canu/all_pacbio/unitigging/all_pacbio.gkpStore -T /home2/mlf/Brassica/Canu/all_pacbio/unitigging/all_pacbio.tigStore 1 0023 -O /home2/mlf/Brassica/Canu/all_pacbio/unitigging/5-consensus/0023.cns.WORKING -maxcoverage 40 -e 0.075 -pbdagcon -threads 8
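To read a listing like this more quickly, the sketch below tallies the STAT column and groups PIDs by the file given to `-O`. It assumes the standard 11-column `ps aux` layout (USER, PID, %CPU, %MEM, VSZ, RSS, TTY, STAT, START, TIME, COMMAND); the function name `summarize_utgcns` is made up for illustration. In the STAT column, `S` means interruptible sleep and `l` means the process is multi-threaded, while `D` would indicate uninterruptible sleep, which is usually I/O.

```python
import re
from collections import Counter, defaultdict


def summarize_utgcns(ps_lines):
    """Tally STAT codes of utgcns processes and find output files shared by several PIDs."""
    states = Counter()
    pids_by_output = defaultdict(list)
    for line in ps_lines:
        # Split into at most 11 fields so the whole command stays in the last one.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        command = fields[10]
        # Keep only processes whose executable is utgcns (this skips the grep itself).
        if not command.split()[0].endswith("utgcns"):
            continue
        states[fields[7]] += 1  # STAT column, e.g. "Sl"
        match = re.search(r"-O (\S+)", command)
        if match:
            pids_by_output[match.group(1)].append(int(fields[1]))
    duplicates = {out: pids for out, pids in pids_by_output.items() if len(pids) > 1}
    return states, duplicates
```

Applied to the listing above, every process is counted under `Sl`, and the output files for jobs 0021, 0022 and 0023 each appear under two PIDs. The thread does not establish why those jobs appear twice, so the grouping only makes the repetition easy to spot.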
skoren commented 7 years ago

Sleeping indicates the jobs are waiting for a request to complete, perhaps I/O. Is the memory on your system full or do you have free memory available? Is the I/O saturated? You could try reducing the number of jobs with cnsConcurrency=2.
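One way to answer the memory question on Linux is to read /proc/meminfo. The sketch below parses that file and reports what fraction of memory the kernel considers available. It assumes a kernel recent enough to expose the `MemAvailable` field; the function names are made up for illustration, and it says nothing about whether I/O is saturated.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into a dict mapping field name to its number (kB for most fields)."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def available_fraction(info):
    """Fraction of total memory the kernel estimates is available for new work."""
    # Assumption: MemAvailable is present (it is absent on very old kernels).
    return info["MemAvailable"] / info["MemTotal"]
```

Usage would look like `available_fraction(parse_meminfo(open("/proc/meminfo").read()))`. If the value is close to zero, rerunning the original canu command with cnsConcurrency=2 added, as suggested above, lowers the number of consensus jobs running at once.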