marbl / canu

A single molecule sequence assembler for genomes large and small.
http://canu.readthedocs.io/

Illegal division by zero..... Execution.pm line 1441 #2326

Closed JosephHeras closed 4 months ago

JosephHeras commented 4 months ago

I've been running canu on a ~600 Mb fish genome, and now I keep getting "Illegal division by zero". I'm using a Linux server and the canu v2.3 snapshot. Here's my command:

/data/canu/build/bin/canu -p XM_CANU24 -d XM_CANU24_FINAL genomeSize=600m correctedErrorRate=0.035 -pacbio-raw /data/Prickleback_Genomes/R251-A01.1-Reads.fastq useGrid=false -corMinCoverage=4 mhapPipe=false

This is the output:

-- CITATIONS
--
-- For 'standard' assemblies of PacBio or Nanopore reads:
--   Koren S, Walenz BP, Berlin K, Miller JR, Phillippy AM.
--   Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation.
--   Genome Res. 2017 May;27(5):722-736.
--   http://doi.org/10.1101/gr.215087.116
-- 
-- Read and contig alignments during correction and consensus use:
--   Šošic M, Šikic M.
--   Edlib: a C/C ++ library for fast, exact sequence alignment using edit distance.
--   Bioinformatics. 2017 May 1;33(9):1394-1395.
--   http://doi.org/10.1093/bioinformatics/btw753
-- 
-- Overlaps are generated using:
--   Berlin K, et al.
--   Assembling large genomes with single-molecule sequencing and locality-sensitive hashing.
--   Nat Biotechnol. 2015 Jun;33(6):623-30.
--   http://doi.org/10.1038/nbt.3238
-- 
--   Myers EW, et al.
--   A Whole-Genome Assembly of Drosophila.
--   Science. 2000 Mar 24;287(5461):2196-204.
--   http://doi.org/10.1126/science.287.5461.2196
-- 
-- Corrected read consensus sequences are generated using an algorithm derived from FALCON-sense:
--   Chin CS, et al.
--   Phased diploid genome assembly with single-molecule real-time sequencing.
--   Nat Methods. 2016 Dec;13(12):1050-1054.
--   http://doi.org/10.1038/nmeth.4035
-- 
-- Contig consensus sequences are generated using an algorithm derived from pbdagcon:
--   Chin CS, et al.
--   Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.
--   Nat Methods. 2013 Jun;10(6):563-9
--   http://doi.org/10.1038/nmeth.2474
-- 
-- CONFIGURE CANU
--
-- Detected Java(TM) Runtime Environment '11.0.17' (from 'java') without -d64 support.
--
-- WARNING:
-- WARNING:  Failed to run gnuplot using command 'gnuplot'.
-- WARNING:  Plots will be disabled.
-- WARNING:
--
--
-- Detected 128 CPUs and 251 gigabytes of memory on the local machine.
--
-- Local machine mode enabled; grid support not detected or not allowed.
--
--                                (tag)Concurrency
--                         (tag)Threads          |
--                (tag)Memory         |          |
--        (tag)             |         |          |       total usage      algorithm
--        -------  ----------  --------   --------  --------------------  -----------------------------
-- Local: meryl     15.000 GB    8 CPUs x  16 jobs   240.000 GB 128 CPUs  (k-mer counting)
-- Local: hap       12.000 GB   16 CPUs x   8 jobs    96.000 GB 128 CPUs  (read-to-haplotype assignment)
-- Local: cormhap   31.000 GB   16 CPUs x   8 jobs   248.000 GB 128 CPUs  (overlap detection with mhap)
-- Local: obtovl    16.000 GB   16 CPUs x   8 jobs   128.000 GB 128 CPUs  (overlap detection)
-- Local: utgovl    16.000 GB   16 CPUs x   8 jobs   128.000 GB 128 CPUs  (overlap detection)
-- Local: cor        -.--- GB    4 CPUs x   - jobs     -.--- GB   - CPUs  (read correction)
-- Local: ovb        4.000 GB    1 CPU  x  62 jobs   248.000 GB  62 CPUs  (overlap store bucketizer)
-- Local: ovs       16.000 GB    1 CPU  x  15 jobs   240.000 GB  15 CPUs  (overlap store sorting)
-- Local: red       16.000 GB    8 CPUs x  15 jobs   240.000 GB 120 CPUs  (read error detection)
-- Local: oea        8.000 GB    1 CPU  x  31 jobs   248.000 GB  31 CPUs  (overlap error adjustment)
-- Local: bat      251.000 GB   16 CPUs x   1 job    251.000 GB  16 CPUs  (contig construction with bogart)
-- Local: cns        -.--- GB    8 CPUs x   - jobs     -.--- GB   - CPUs  (consensus)
--
-- Found PacBio CLR reads in 'XM_CANU24.seqStore':
--   Libraries:
--     PacBio CLR:            1
--   Reads:
--     Raw:                   107552380702
--
--
-- Generating assembly 'XM_CANU24' in '/data/XM_CANU24_FOLDER/XM_CANU24_FINAL':
--   genomeSize:
--     600000000
--
--   Overlap Generation Limits:
--     corOvlErrorRate 0.2400 ( 24.00%)
--     obtOvlErrorRate 0.0350 (  3.50%)
--     utgOvlErrorRate 0.0350 (  3.50%)
--
--   Overlap Processing Limits:
--     corErrorRate    0.2500 ( 25.00%)
--     obtErrorRate    0.0350 (  3.50%)
--     utgErrorRate    0.0350 (  3.50%)
--     cnsErrorRate    0.0350 (  3.50%)
--
--   Stages to run:
--     correct raw reads.
--     trim corrected reads.
--     assemble corrected and trimmed reads.
--
--
-- BEGIN CORRECTION
--
-- Running jobs.  First attempt out of 2.
Illegal division by zero at /data/canu/build/bin/../lib/site_perl/canu/Execution.pm line 1441.

Any suggestions?

skoren commented 4 months ago

It looks like there is a job that can't figure out its memory request. It doesn't look like this is the first time you ran canu in that folder, which makes it hard to see exactly how far the assembly got. Can you completely remove the XM_CANU24_FINAL folder, run from scratch, and post all the output from the run along with the report file? I'd also suggest using the released 2.2 version.
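
A clean re-run along those lines might look like the sketch below. The canu path, prefix, directory, and options are copied verbatim from the command in the original report; the log file name and the `rerun_from_scratch` helper are hypothetical and only there for illustration. The report file is assumed to be the `.report` file that canu writes inside the assembly directory.

```shell
# Sketch only: values below are copied from the command in the report above.
CANU=/data/canu/build/bin/canu
PREFIX=XM_CANU24
ASM_DIR=XM_CANU24_FINAL
LOG="${PREFIX}.canu.log"   # hypothetical name for the captured console output

rerun_from_scratch() {
  # Delete the previous, partially completed assembly directory.
  rm -rf -- "$ASM_DIR"
  # Re-run with the original options, saving everything canu prints
  # (stdout and stderr) so the full output can be posted.
  "$CANU" -p "$PREFIX" -d "$ASM_DIR" genomeSize=600m correctedErrorRate=0.035 \
    -pacbio-raw /data/Prickleback_Genomes/R251-A01.1-Reads.fastq \
    useGrid=false -corMinCoverage=4 mhapPipe=false 2>&1 | tee "$LOG"
}

# Nothing is deleted until the function is called explicitly:
#   rerun_from_scratch
```

Run it from the directory that contains XM_CANU24_FINAL, since `-d` is given as a relative path. Capturing the output with `tee` keeps the whole log in one file, which is easier to attach than a terminal copy.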

skoren commented 4 months ago

Idle, no reply.