marbl / canu

A single molecule sequence assembler for genomes large and small.
http://canu.readthedocs.io/

Issue with k-mer counting (meryl-count) in assembly phase #1920

Closed: ghost closed this issue 3 years ago

ghost commented 3 years ago

Hello,

I'm having an issue with the meryl-count part of the assembly phase. Here is my canu.out:

Found perl:
   /cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/perl/5.30.2/bin/perl
   This is perl 5, version 30, subversion 2 (v5.30.2) built for x86_64-linux-thread-multi

Found java:
   /cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/bin/java
   openjdk version "13.0.2" 2020-01-14

Found canu:
   /cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/canu/2.1.1/bin/canu
   canu 2.1.1

-- canu 2.1.1
--
-- CITATIONS
--
-- For 'standard' assemblies of PacBio or Nanopore reads:
--   Koren S, Walenz BP, Berlin K, Miller JR, Phillippy AM.
--   Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation.
--   Genome Res. 2017 May;27(5):722-736.
--   http://doi.org/10.1101/gr.215087.116
-- 
-- Read and contig alignments during correction and consensus use:
--   Šošic M, Šikic M.
--   Edlib: a C/C++ library for fast, exact sequence alignment using edit distance.
--   Bioinformatics. 2017 May 1;33(9):1394-1395.
--   http://doi.org/10.1093/bioinformatics/btw753
-- 
-- Overlaps are generated using:
--   Berlin K, et al.
--   Assembling large genomes with single-molecule sequencing and locality-sensitive hashing.
--   Nat Biotechnol. 2015 Jun;33(6):623-30.
--   http://doi.org/10.1038/nbt.3238
-- 
--   Myers EW, et al.
--   A Whole-Genome Assembly of Drosophila.
--   Science. 2000 Mar 24;287(5461):2196-204.
--   http://doi.org/10.1126/science.287.5461.2196
-- 
-- Corrected read consensus sequences are generated using an algorithm derived from FALCON-sense:
--   Chin CS, et al.
--   Phased diploid genome assembly with single-molecule real-time sequencing.
--   Nat Methods. 2016 Dec;13(12):1050-1054.
--   http://doi.org/10.1038/nmeth.4035
-- 
-- Contig consensus sequences are generated using an algorithm derived from pbdagcon:
--   Chin CS, et al.
--   Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.
--   Nat Methods. 2013 Jun;10(6):563-9
--   http://doi.org/10.1038/nmeth.2474
-- 
-- CONFIGURE CANU
--
-- Detected Java(TM) Runtime Environment '13.0.2' (from '/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/bin/java') without -d64 support.
-- Detected gnuplot version '5.2 patchlevel 8   ' (from 'gnuplot') and image format 'png'.
-- Detected 48 CPUs and 188 gigabytes of memory.
-- Limited to 16 CPUs from maxThreads option.
-- Detected Slurm with 'sinfo' binary in /opt/software/slurm/bin/sinfo.
-- Detected Slurm with task IDs up to 9999 allowed.
-- 
-- Found  24 hosts with  32 cores and  502 GB memory under Slurm control.
-- Found 1440 hosts with  48 cores and  187 GB memory under Slurm control.
-- Found   4 hosts with  48 cores and  375 GB memory under Slurm control.
-- Found 192 hosts with  32 cores and  187 GB memory under Slurm control.
-- Found   4 hosts with  32 cores and 3021 GB memory under Slurm control.
-- Found  94 hosts with  32 cores and  250 GB memory under Slurm control.
-- Found 639 hosts with  32 cores and  124 GB memory under Slurm control.
-- Found  24 hosts with  32 cores and 1510 GB memory under Slurm control.
-- Found  32 hosts with  24 cores and  250 GB memory under Slurm control.
-- Found 114 hosts with  24 cores and  124 GB memory under Slurm control.
--
--                         (tag)Threads
--                (tag)Memory         |
--        (tag)             |         |  algorithm
--        -------  ----------  --------  -----------------------------
-- Grid:  meryl     45.000 GB    6 CPUs  (k-mer counting)
-- Grid:  hap       30.000 GB   16 CPUs  (read-to-haplotype assignment)
-- Grid:  cormhap   60.000 GB   16 CPUs  (overlap detection with mhap)
-- Grid:  obtovl    30.000 GB    8 CPUs  (overlap detection)
-- Grid:  utgovl    30.000 GB    8 CPUs  (overlap detection)
-- Grid:  cor       30.000 GB    6 CPUs  (read correction)
-- Grid:  ovb       30.000 GB    6 CPUs  (overlap store bucketizer)
-- Grid:  ovs       32.000 GB    6 CPUs  (overlap store sorting)
-- Grid:  red       31.000 GB    8 CPUs  (read error detection)
-- Grid:  oea       30.000 GB    6 CPUs  (overlap error adjustment)
-- Grid:  bat      256.000 GB   16 CPUs  (contig construction with bogart)
-- Grid:  cns       30.000 GB    8 CPUs  (consensus)
--
-- In 'veletis.seqStore', found PacBio CLR reads:
--   PacBio CLR:               1
--
--   Corrected:                1
--   Corrected and Trimmed:    1
--
-- Generating assembly 'veletis' in '/scratch/arteen/arteen/veletis/01.canu/assemble/out':
--    - assemble corrected and trimmed reads.
--
-- Parameters:
--
--  genomeSize        1960000000
--
--  Overlap Generation Limits:
--    corOvlErrorRate 0.2400 ( 24.00%)
--    obtOvlErrorRate 0.0450 (  4.50%)
--    utgOvlErrorRate 0.0450 (  4.50%)
--
--  Overlap Processing Limits:
--    corErrorRate    0.3000 ( 30.00%)
--    obtErrorRate    0.0450 (  4.50%)
--    utgErrorRate    0.0450 (  4.50%)
--    cnsErrorRate    0.0750 (  7.50%)
--
--
-- BEGIN ASSEMBLY
--
--
-- Kmer counting (meryl-count) jobs failed, tried 2 times, giving up.
--   job veletis.01.meryl FAILED.
--   job veletis.02.meryl FAILED.
--   job veletis.03.meryl FAILED.
--   job veletis.04.meryl FAILED.
--

ABORT:
ABORT: canu 2.1.1
ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
ABORT: Try restarting.  If that doesn't work, ask for help.
ABORT:

and the output from meryl-count.*.01.out:

Found perl:
   /cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/perl/5.30.2/bin/perl
   This is perl 5, version 30, subversion 2 (v5.30.2) built for x86_64-linux-thread-multi

Found java:
   /cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/bin/java
   openjdk version "13.0.2" 2020-01-14

Found canu:
   /cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/canu/2.1.1/bin/canu
   canu 2.1.1

Running job 1 based on SLURM_ARRAY_TASK_ID=1 and offset=0.
merylInput-- segment 1/4 picked reads 1-161231 out of 643833

Found 1 command tree.

Counting 4162 (estimated) million canonical 22-mers from 1 input file:
    canu-seqStore: ../../veletis.seqStore

SIMPLE MODE
-----------

  Not possible.

COMPLEX MODE
------------

prefix     # of   struct   kmers/    segs/      min     data    total
  bits   prefix   memory   prefix   prefix   memory   memory   memory
------  -------  -------  -------  -------  -------  -------  -------
     1     2  P  2803 kB  2081 MM   174 kS   128 kB    21 GB    21 GB
     2     4  P  2744 kB  1040 MM    85 kS   256 kB    21 GB    21 GB
     3     8  P  2692 kB   520 MM    41 kS   512 kB    20 GB    20 GB
     4    16  P  2652 kB   260 MM    20 kS  1024 kB    20 GB    20 GB
     5    32  P  2637 kB   130 MM     9 kS  2048 kB    19 GB    19 GB
     6    64  P  2674 kB    65 MM  4944  S  4096 kB    19 GB    19 GB
     7   128  P  2811 kB    32 MM  2407  S  8192 kB    18 GB    18 GB
     8   256  P  3150 kB    16 MM  1171  S    16 MB    18 GB    18 GB
     9   512  P  3896 kB  8325 kM   570  S    32 MB    17 GB    17 GB
    10  1024  P  5448 kB  4162 kM   277  S    64 MB    17 GB    17 GB
    11  2048  P  8624 kB  2081 kM   135  S   128 MB    16 GB    16 GB
    12  4096  P    14 MB  1040 kM    66  S   256 MB    16 GB    16 GB
    13  8192  P    27 MB   520 kM    32  S   512 MB    16 GB    16 GB  Best Value!
    14    16 kP    52 MB   260 kM    16  S  1024 MB    16 GB    16 GB
    15    32 kP   103 MB   130 kM     8  S  2048 MB    16 GB    16 GB
    16    64 kP   204 MB    65 kM     4  S  4096 MB    16 GB    16 GB
    17   128 kP   406 MB    32 kM     2  S  8192 MB    16 GB    16 GB
    18   256 kP   810 MB    16 kM     1  S    16 GB    16 GB    16 GB
    19   512 kP  1620 MB  8326  M     1  S    32 GB    32 GB    33 GB
    20  1024 kP  3240 MB  4163  M     1  S    64 GB    64 GB    67 GB
    21  2048 kP  6480 MB  2082  M     1  S   128 GB   128 GB   134 GB
    22  4096 kP    12 GB  1041  M     1  S   256 GB   256 GB   268 GB

FINAL CONFIGURATION
-------------------

WARNING:
WARNING: Cannot fit into 18 GB memory limit.
WARNING: Will split into up to 2 (possibly 3) batches, and merge them at the end.
WARNING:

Configured complex mode for 16.027 GB memory per batch, and up to 2 batches.

Start counting with THREADED method.
Used 0.537 GB out of 18.000 GB to store      1047761 kmers.
Used 0.663 GB out of 18.000 GB to store     29335555 kmers.
Used 0.790 GB out of 18.000 GB to store     64957757 kmers.
Used 0.917 GB out of 18.000 GB to store    100579365 kmers.
Used 1.044 GB out of 18.000 GB to store    136201610 kmers.
Used 1.172 GB out of 18.000 GB to store    171823045 kmers.
Used 1.300 GB out of 18.000 GB to store    207445348 kmers.
Used 1.427 GB out of 18.000 GB to store    243066230 kmers.
Used 1.555 GB out of 18.000 GB to store    278688335 kmers.
Used 1.683 GB out of 18.000 GB to store    314309237 kmers.
Used 1.812 GB out of 18.000 GB to store    349932011 kmers.
Used 1.940 GB out of 18.000 GB to store    385554125 kmers.
Used 2.069 GB out of 18.000 GB to store    421175711 kmers.
Used 2.197 GB out of 18.000 GB to store    456797855 kmers.
Used 2.326 GB out of 18.000 GB to store    492420321 kmers.
Used 2.454 GB out of 18.000 GB to store    528041588 kmers.
Used 2.583 GB out of 18.000 GB to store    563663834 kmers.
Used 2.711 GB out of 18.000 GB to store    599284804 kmers.
Used 2.840 GB out of 18.000 GB to store    634907291 kmers.
Used 2.968 GB out of 18.000 GB to store    670529053 kmers.
Used 3.093 GB out of 18.000 GB to store    705103146 kmers.
Used 3.222 GB out of 18.000 GB to store    740724742 kmers.
Used 3.351 GB out of 18.000 GB to store    776345932 kmers.
Used 3.479 GB out of 18.000 GB to store    811968030 kmers.
Used 3.608 GB out of 18.000 GB to store    847590056 kmers.
Used 3.736 GB out of 18.000 GB to store    883211692 kmers.
Used 3.865 GB out of 18.000 GB to store    918833594 kmers.
Used 3.994 GB out of 18.000 GB to store    954456302 kmers.
Used 4.122 GB out of 18.000 GB to store    990078658 kmers.
Used 4.251 GB out of 18.000 GB to store   1025701014 kmers.
Used 4.380 GB out of 18.000 GB to store   1061322578 kmers.
Used 4.508 GB out of 18.000 GB to store   1096944171 kmers.
Used 4.636 GB out of 18.000 GB to store   1132567087 kmers.
Used 4.765 GB out of 18.000 GB to store   1168189685 kmers.
Used 4.894 GB out of 18.000 GB to store   1203811513 kmers.
Used 5.022 GB out of 18.000 GB to store   1239433583 kmers.
Used 5.147 GB out of 18.000 GB to store   1274008147 kmers.
Used 5.276 GB out of 18.000 GB to store   1309630763 kmers.
Used 5.404 GB out of 18.000 GB to store   1345253075 kmers.
Used 5.533 GB out of 18.000 GB to store   1380875245 kmers.
Used 5.661 GB out of 18.000 GB to store   1416497091 kmers.
Used 5.787 GB out of 18.000 GB to store   1451071053 kmers.
Used 5.912 GB out of 18.000 GB to store   1485645538 kmers.
Used 6.040 GB out of 18.000 GB to store   1521266696 kmers.
Used 6.169 GB out of 18.000 GB to store   1556888898 kmers.
Used 6.297 GB out of 18.000 GB to store   1592511468 kmers.
Used 6.426 GB out of 18.000 GB to store   1628133890 kmers.
Used 6.555 GB out of 18.000 GB to store   1663755869 kmers.
Used 6.683 GB out of 18.000 GB to store   1699377631 kmers.
Used 6.812 GB out of 18.000 GB to store   1734999285 kmers.
Used 6.941 GB out of 18.000 GB to store   1770620937 kmers.
Used 7.066 GB out of 18.000 GB to store   1805196170 kmers.
Used 7.194 GB out of 18.000 GB to store   1840817822 kmers.
Used 7.323 GB out of 18.000 GB to store   1876439420 kmers.
Used 7.451 GB out of 18.000 GB to store   1912060106 kmers.
Used 7.580 GB out of 18.000 GB to store   1947681980 kmers.
Used 7.709 GB out of 18.000 GB to store   1983303226 kmers.
Used 7.837 GB out of 18.000 GB to store   2018925956 kmers.
Used 7.966 GB out of 18.000 GB to store   2054547498 kmers.
Used 8.091 GB out of 18.000 GB to store   2089122015 kmers.
Used 8.219 GB out of 18.000 GB to store   2124743544 kmers.
Used 8.344 GB out of 18.000 GB to store   2159316927 kmers.
Used 8.473 GB out of 18.000 GB to store   2194938755 kmers.
Used 8.602 GB out of 18.000 GB to store   2230560847 kmers.
Used 8.730 GB out of 18.000 GB to store   2266182086 kmers.
Used 8.859 GB out of 18.000 GB to store   2301803650 kmers.
Used 8.984 GB out of 18.000 GB to store   2336377518 kmers.
Used 9.112 GB out of 18.000 GB to store   2371998573 kmers.
Used 9.241 GB out of 18.000 GB to store   2407619939 kmers.
Used 9.366 GB out of 18.000 GB to store   2442193434 kmers.
Used 9.495 GB out of 18.000 GB to store   2477815545 kmers.
Used 9.623 GB out of 18.000 GB to store   2513437542 kmers.
Used 9.752 GB out of 18.000 GB to store   2549059745 kmers.
Used 9.880 GB out of 18.000 GB to store   2584681465 kmers.
Used 10.009 GB out of 18.000 GB to store   2620303688 kmers.
Used 10.138 GB out of 18.000 GB to store   2655925666 kmers.
Used 10.266 GB out of 18.000 GB to store   2691547550 kmers.
Used 10.395 GB out of 18.000 GB to store   2727168859 kmers.
Used 10.523 GB out of 18.000 GB to store   2762790247 kmers.
Used 10.652 GB out of 18.000 GB to store   2798411381 kmers.
Used 10.780 GB out of 18.000 GB to store   2834033253 kmers.
Used 10.909 GB out of 18.000 GB to store   2869654619 kmers.
Used 11.034 GB out of 18.000 GB to store   2904228862 kmers.
Used 11.159 GB out of 18.000 GB to store   2938802731 kmers.
Used 11.288 GB out of 18.000 GB to store   2974425769 kmers.
Used 11.417 GB out of 18.000 GB to store   3010047135 kmers.
Used 11.545 GB out of 18.000 GB to store   3045668875 kmers.
Used 11.674 GB out of 18.000 GB to store   3081289245 kmers.
Used 11.803 GB out of 18.000 GB to store   3116909749 kmers.
Used 11.931 GB out of 18.000 GB to store   3152531123 kmers.
Used 12.060 GB out of 18.000 GB to store   3188152599 kmers.
Used 12.188 GB out of 18.000 GB to store   3223774295 kmers.
Used 12.317 GB out of 18.000 GB to store   3259396387 kmers.
Used 12.445 GB out of 18.000 GB to store   3295016941 kmers.
Used 12.574 GB out of 18.000 GB to store   3330637691 kmers.
Used 12.703 GB out of 18.000 GB to store   3366260289 kmers.
Used 12.832 GB out of 18.000 GB to store   3401881348 kmers.
Used 12.960 GB out of 18.000 GB to store   3437501932 kmers.
Used 13.088 GB out of 18.000 GB to store   3473122924 kmers.
Used 13.217 GB out of 18.000 GB to store   3508743410 kmers.
Used 13.346 GB out of 18.000 GB to store   3544364857 kmers.
Used 13.471 GB out of 18.000 GB to store   3578938484 kmers.
Used 13.599 GB out of 18.000 GB to store   3614559894 kmers.
Used 13.728 GB out of 18.000 GB to store   3650181480 kmers.
Used 13.853 GB out of 18.000 GB to store   3684755877 kmers.
Used 13.981 GB out of 18.000 GB to store   3720377045 kmers.
Used 14.110 GB out of 18.000 GB to store   3755998653 kmers.
Used 14.239 GB out of 18.000 GB to store   3791619513 kmers.
Used 14.364 GB out of 18.000 GB to store   3826193954 kmers.
Used 14.493 GB out of 18.000 GB to store   3861816266 kmers.
Used 14.621 GB out of 18.000 GB to store   3897438325 kmers.
Used 14.750 GB out of 18.000 GB to store   3933059867 kmers.
Used 14.875 GB out of 18.000 GB to store   3967632485 kmers.
Used 15.004 GB out of 18.000 GB to store   4003253675 kmers.
Used 15.132 GB out of 18.000 GB to store   4038875525 kmers.
Used 15.257 GB out of 18.000 GB to store   4073447685 kmers.
Used 15.386 GB out of 18.000 GB to store   4109067995 kmers.
Used 15.514 GB out of 18.000 GB to store   4144689713 kmers.
Used 15.643 GB out of 18.000 GB to store   4180310846 kmers.
Used 15.771 GB out of 18.000 GB to store   4215932388 kmers.
Used 15.900 GB out of 18.000 GB to store   4251554084 kmers.
Used 16.029 GB out of 18.000 GB to store   4287176242 kmers.
Used 16.157 GB out of 18.000 GB to store   4322797388 kmers.
Used 16.286 GB out of 18.000 GB to store   4358418945 kmers.

Writing results to './veletis.01.meryl.WORKING', using 8 threads.
/var/spool/slurmd/job64142786/slurm_script: line 97:  2612 Killed                  /cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/canu/2.1.1/bin/meryl k=22 threads=8 memory=18 count segment=$jobid/04 ../../veletis.seqStore output ./veletis.$jobid.meryl.WORKING
slurmstepd: error: Detected 1 oom-kill event(s) in StepId=64142786.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.

It seems to be a memory issue with meryl, but when I added the parameter merylMemory=45 nothing changed, and it also didn't change when I added merylThreads=1. Is there another way to increase the memory that meryl-count can receive?

Also, my full canu command is:

canu -p veletis -d out corMhapFilterThreshold=0.0000000002 corMhapOptions="--threshold 0.80 --min-olap-length 10000 --num-hashes 256 --num-min-matches 3 --ordered-sketch-size 1000 --ordered-kmer-size 16  --repeat-idf-scale 50" mhapMemory=60g mhapBlockSize=500 ovlMerDistinct=0.975  gridOptions="--time=12:00:00" gridEngineArrayOption="-a ARRAY_JOBS%175" merylMemory=45 merylThreads=1 minMemory=30 minThreads=6 maxThreads=16 genomeSize=1.96g -trimmed -corrected -pacbio ~/scratch/arteen/veletis/01.canu/out/veletis.correctedReads.fasta.gz
brianwalenz commented 3 years ago

You want to tell meryl (the program) to use, say, 16 GB of memory, but tell the grid that the job needs 24 GB:

merylMemory=16
merylThreads=8
gridOptionsMeryl="--mem-per-cpu=3g"

Check the existing *jobSubmit*sh scripts in the meryl directory for exactly how to specify memory. It should be of the form --cpus-per-task=THREADS --mem-per-cpu=MEMORY, and the total memory available to the job is then THREADS * MEMORY.
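For illustration only (a hedged sketch of the arithmetic, not the exact contents of the script canu writes; the submit line and script name below are placeholders):

   # canu options from the comment above
   merylMemory=16
   merylThreads=8
   gridOptionsMeryl="--mem-per-cpu=3g"

   # hypothetical submit line in the meryl directory; check the real
   # *jobSubmit*sh script for the exact form canu generated
   sbatch --cpus-per-task=8 --mem-per-cpu=3g ./meryl-count.sh

   # Slurm allocation:   8 CPUs * 3 GB = 24 GB per job
   # meryl's own limit:  merylMemory = 16 GB
   # headroom:           ~8 GB for process overhead, which should avoid the
   #                     cgroup oom-kill reported in meryl-count.*.01.out

In other words, keep merylMemory a few GB below what the grid actually allocates so the process has room for overhead beyond meryl's internal k-mer tables.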

skoren commented 3 years ago

Idle, no response