marbl / canu

A single molecule sequence assembler for genomes large and small.
http://canu.readthedocs.io/

Canu v1.7 assemble-errate 0.075 crash #985

Closed (llalcll closed 6 years ago)

llalcll commented 6 years ago

I'm assembling step by step and hit a problem after the trimming step, on my second assembly run. The run with error rate 0.039 completed successfully, so I don't know why this command fails. Thank you.

canu-1.7/Linux-amd64/bin/canu -assemble -p S194_assembly_errate-0.075 -d S194_Step_by_step_Assembly genomeSize=2.9m correctedErrorRate=0.075 -pacbio-corrected /home/lxadmin/S194_Step_by_step_Assembly/S194_trim.trimmedReads.fasta.gz
-- Canu 1.7
--
-- CITATIONS
--
-- Koren S, Walenz BP, Berlin K, Miller JR, Phillippy AM.
-- Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation.
-- Genome Res. 2017 May;27(5):722-736.
-- http://doi.org/10.1101/gr.215087.116
--
-- Read and contig alignments during correction, consensus and GFA building use:
--   Šošic M, Šikic M.
--   Edlib: a C/C ++ library for fast, exact sequence alignment using edit distance.
--   Bioinformatics. 2017 May 1;33(9):1394-1395.
--   http://doi.org/10.1093/bioinformatics/btw753
--
-- Overlaps are generated using:
--   Berlin K, et al.
--   Assembling large genomes with single-molecule sequencing and locality-sensitive hashing.
--   Nat Biotechnol. 2015 Jun;33(6):623-30.
--   http://doi.org/10.1038/nbt.3238
--
--   Myers EW, et al.
--   A Whole-Genome Assembly of Drosophila.
--   Science. 2000 Mar 24;287(5461):2196-204.
--   http://doi.org/10.1126/science.287.5461.2196
--
--   Li H.
--   Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences.
--   Bioinformatics. 2016 Jul 15;32(14):2103-10.
--   http://doi.org/10.1093/bioinformatics/btw152
--
-- Corrected read consensus sequences are generated using an algorithm derived from FALCON-sense:
--   Chin CS, et al.
--   Phased diploid genome assembly with single-molecule real-time sequencing.
--   Nat Methods. 2016 Dec;13(12):1050-1054.
--   http://doi.org/10.1038/nmeth.4035
--
-- Contig consensus sequences are generated using an algorithm derived from pbdagcon:
--   Chin CS, et al.
--   Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.
--   Nat Methods. 2013 Jun;10(6):563-9
--   http://doi.org/10.1038/nmeth.2474
--
-- CONFIGURE CANU
--
-- Detected Java(TM) Runtime Environment '1.8.0_171' (from 'java').
-- Detected gnuplot version '5.2 patchlevel 2' (from 'gnuplot') and image format 'png'.
-- Detected 4 CPUs and 8 gigabytes of memory.
-- No grid engine detected, grid disabled.
--
--                            (tag)Concurrency
--                     (tag)Threads          |
--            (tag)Memory         |          |
--        (tag)         |         |          |     total usage     algorithm
--        -------  ------  --------   --------  -----------------  -----------------------------
-- Local: meryl      8 GB    4 CPUs x   1 job      8 GB    4 CPUs  (k-mer counting)
-- Local: cormhap    6 GB    4 CPUs x   1 job      6 GB    4 CPUs  (overlap detection with mhap)
-- Local: obtovl     4 GB    4 CPUs x   1 job      4 GB    4 CPUs  (overlap detection)
-- Local: utgovl     4 GB    4 CPUs x   1 job      4 GB    4 CPUs  (overlap detection)
-- Local: ovb        2 GB    1 CPU  x   4 jobs     8 GB    4 CPUs  (overlap store bucketizer)
-- Local: ovs        8 GB    1 CPU  x   1 job      8 GB    1 CPU   (overlap store sorting)
-- Local: red        4 GB    4 CPUs x   1 job      4 GB    4 CPUs  (read error detection)
-- Local: oea        4 GB    1 CPU  x   2 jobs     8 GB    2 CPUs  (overlap error adjustment)
-- Local: bat        8 GB    4 CPUs x   1 job      8 GB    4 CPUs  (contig construction)
-- Local: gfa        8 GB    4 CPUs x   1 job      8 GB    4 CPUs  (GFA alignment and processing)
--
-- Found PacBio corrected reads in the input files.
--
-- Generating assembly 'S194_assembly_errate-0.075' in '/home/lxadmin/S194_Step_by_step_Assembly'
--
-- Parameters:
--
--  genomeSize        2900000
--
--  Overlap Generation Limits:
--    corOvlErrorRate 0.2400 ( 24.00%)
--    obtOvlErrorRate 0.0750 (  7.50%)
--    utgOvlErrorRate 0.0750 (  7.50%)
--
--  Overlap Processing Limits:
--    corErrorRate    0.3000 ( 30.00%)
--    obtErrorRate    0.0750 (  7.50%)
--    utgErrorRate    0.0750 (  7.50%)
--    cnsErrorRate    0.0750 (  7.50%)
--
--
-- BEGIN ASSEMBLY
--
----------------------------------------
-- Starting command on Mon Jul  2 17:27:54 2018 with 54.372 GB free disk space

    cd .
    /home/lxadmin/canu-1.7/Linux-amd64/bin/gatekeeperCreate \
      -minlength 1000 \
      -o ./S194_assembly_errate-0.075.gkpStore.BUILDING \
      ./S194_assembly_errate-0.075.gkpStore.gkp \
    > ./S194_assembly_errate-0.075.gkpStore.BUILDING.err 2>&1

-- Finished on Mon Jul  2 17:27:56 2018 (2 seconds) with 54.345 GB free disk space
----------------------------------------
--
-- WARNING:  No trimmed reads found for assembly, but untrimmed reads exist.
-- WARNING:  Upgrading untrimmed reads to trimmed reads for assembly.
--
-- In gatekeeper store './S194_assembly_errate-0.075.gkpStore':
--   Found 5433 reads.
--   Found 107128533 bases (36.94 times coverage).
--
--   Read length histogram (one '*' equals 11.17 reads):
--        0    999      0
--     1000   1999    328 *****************************
--     2000   2999     55 ****
--     3000   3999     19 *
--     4000   4999     13 *
--     5000   5999      7
--     6000   6999     11
--     7000   7999      5
--     8000   8999     16 *
--     9000   9999     15 *
--    10000  10999     29 **
--    11000  11999     35 ***
--    12000  12999     26 **
--    13000  13999     47 ****
--    14000  14999     45 ****
--    15000  15999     94 ********
--    16000  16999    181 ****************
--    17000  17999    551 *************************************************
--    18000  18999    782 **********************************************************************
--    19000  19999    562 **************************************************
--    20000  20999    519 **********************************************
--    21000  21999    413 ************************************
--    22000  22999    309 ***************************
--    23000  23999    257 ***********************
--    24000  24999    223 *******************
--    25000  25999    171 ***************
--    26000  26999    136 ************
--    27000  27999    115 **********
--    28000  28999    101 *********
--    29000  29999     74 ******
--    30000  30999     60 *****
--    31000  31999     51 ****
--    32000  32999     51 ****
--    33000  33999     29 **
--    34000  34999     20 *
--    35000  35999     16 *
--    36000  36999     15 *
--    37000  37999      7
--    38000  38999      8
--    39000  39999      7
--    40000  40999      3
--    41000  41999      9
--    42000  42999      7
--    43000  43999      3
--    44000  44999      0
--    45000  45999      2
--    46000  46999      4
--    47000  47999      1
--    48000  48999      0
--    49000  49999      0
--    50000  50999      0
--    51000  51999      0
--    52000  52999      0
--    53000  53999      0
--    54000  54999      1

CRASH:
CRASH: Canu 1.7
CRASH: Please panic, this is abnormal.
ABORT:
CRASH:   failed to read estimated mer threshold from 'unitigging/0-mercounts/S194_assembly_errate-0.075.ms22.estMerThresh.out'.
CRASH:
CRASH: Failed at /home/lxadmin/canu-1.7/Linux-amd64/bin/../lib/site_perl/canu/Meryl.pm line 678.
CRASH:  canu::Meryl::merylProcess("S194_assembly_errate-0.075", "utg") called at canu-1.7/Linux-amd64/bin/canu line 678
CRASH:
CRASH: No log file supplied.
CRASH:

skoren commented 6 years ago

The error is unrelated to the error rate; it looks like you already had meryl output but not the expected threshold file. Are you giving the run a new directory via the -d option? What's in the unitigging/0-mercounts folder and in meryl.*.out?
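
The check asked for here can be scripted. The following is a minimal sketch, not part of Canu: `inspect_mercounts` is a hypothetical helper, and the threshold file name is assumed to follow the pattern in the crash message (`<prefix>.ms22.estMerThresh.out`). It only reports what is present in `unitigging/0-mercounts`; it does not repair anything.

```python
from fnmatch import fnmatch
from pathlib import Path


def inspect_mercounts(assembly_dir, prefix, mer_size=22):
    """Summarise the contents of <assembly_dir>/unitigging/0-mercounts.

    Hypothetical helper. The threshold file name is taken from the crash
    message in this thread; other Canu versions may name it differently.
    """
    mercounts = Path(assembly_dir) / "unitigging" / "0-mercounts"
    threshold = mercounts / f"{prefix}.ms{mer_size}.estMerThresh.out"

    report = {
        "directory_exists": mercounts.is_dir(),
        "threshold_file": str(threshold),
        "threshold_exists": threshold.is_file(),
        # An empty file would also make the threshold unreadable.
        "threshold_empty": threshold.is_file() and threshold.stat().st_size == 0,
        "meryl_logs": [],   # files matching meryl.*.out
        "other_files": [],  # every other entry in the directory
    }
    if not mercounts.is_dir():
        return report

    for entry in sorted(mercounts.iterdir()):
        if fnmatch(entry.name, "meryl.*.out"):
            report["meryl_logs"].append(entry.name)
        else:
            report["other_files"].append(entry.name)
    return report


if __name__ == "__main__":
    # Directory and prefix as given on the command line in the original post.
    result = inspect_mercounts("S194_Step_by_step_Assembly",
                               "S194_assembly_errate-0.075")
    for key, value in result.items():
        print(f"{key}: {value}")
```

If `meryl_logs` or `other_files` is non-empty while `threshold_exists` is false, that matches the situation described above: meryl output left from an earlier run, with no threshold file for the current prefix. The log in the original post shows `-d` pointing at the same directory that holds the trimmed reads, so trying a fresh `-d` directory is a reasonable next step.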

brianwalenz commented 6 years ago

No reply. :-(