marbl / canu

A single molecule sequence assembler for genomes large and small.
http://canu.readthedocs.io/

ctgStore not created in a subset of samples #2284

Closed: TMAdams closed this issue 10 months ago

TMAdams commented 10 months ago

canu command run:

canu -d assembly/R5_Diff -p R5_Diff -pacbio-hifi trimmed_reads/R5_Diff.fq useGrid=false genomeSize=3000000 maxInputCoverage=20000

output of canu -version:

canu 2.2

System: Cluster running Rocky Linux 8.9

Issue detail: Dear Canu developers,

I've been assembling HiFi reads for a collection of 130 samples, and three of them fail unexpectedly. These three are from the latest batch of sequencing, which produced larger input files than previous runs (~3TB when gzipped). They all seem to fail at the same point: it looks like the ctgStore isn't created, though I'm not clear why. Interestingly, other samples from this run that are 2.6 and 2.7 GB when gzipped assemble without this failure.

I've tried varying the `maxInputCoverage` and `genomeSize` parameters, but this hasn't resolved the issue. Any assistance would be greatly appreciated! Full log file pasted below.
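For context on how those two parameters interact, the coverage figure in the log below (2664.96 times coverage) matches total bases divided by `genomeSize`. A minimal sketch of that arithmetic follows. The helper name `estimate_coverage` is hypothetical and not part of canu, and the division is inferred from the numbers in the log rather than taken from canu's source.

```shell
# Hypothetical helper (not a canu command): estimate coverage as
# total sequenced bases divided by the genome size, both in bases.
estimate_coverage() {
  awk -v bases="$1" -v genome="$2" 'BEGIN { printf "%.2f\n", bases / genome }'
}

# Values from the log: 7994890136 bases in the seqStore, genomeSize=3000000.
estimate_coverage 7994890136 3000000   # prints 2664.96
```

With `maxInputCoverage=20000` set well above this value, the coverage limit presumably does not remove any reads, which fits the log reporting all 2333307 reads in the sequence store.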

-- canu 2.2
--
-- CITATIONS
--
-- For assemblies of PacBio HiFi reads:
--   Nurk S, Walenz BP, Rhiea A, Vollger MR, Logsdon GA, Grothe R, Miga KH, Eichler EE, Phillippy AM, Koren S.
--   HiCanu: accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads.
--   biorXiv. 2020.
--   https://doi.org/10.1101/2020.03.14.992248
-- 
-- Read and contig alignments during correction and consensus use:
--   Šošic M, Šikic M.
--   Edlib: a C/C ++ library for fast, exact sequence alignment using edit distance.
--   Bioinformatics. 2017 May 1;33(9):1394-1395.
--   http://doi.org/10.1093/bioinformatics/btw753
-- 
-- Overlaps are generated using:
--   Berlin K, et al.
--   Assembling large genomes with single-molecule sequencing and locality-sensitive hashing.
--   Nat Biotechnol. 2015 Jun;33(6):623-30.
--   http://doi.org/10.1038/nbt.3238
-- 
--   Myers EW, et al.
--   A Whole-Genome Assembly of Drosophila.
--   Science. 2000 Mar 24;287(5461):2196-204.
--   http://doi.org/10.1126/science.287.5461.2196
-- 
-- Contig consensus sequences are generated using an algorithm derived from pbdagcon:
--   Chin CS, et al.
--   Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.
--   Nat Methods. 2013 Jun;10(6):563-9
--   http://doi.org/10.1038/nmeth.2474
-- 
-- CONFIGURE CANU
--
-- Detected Java(TM) Runtime Environment '11.0.9.1-internal' (from '/mnt/shared/scratch/tadams/smrtrenseq_assembly/.snakemake/conda/d198b113b159549338ae9bea44596cd3_/bin/java') without -d64 support.
-- Detected gnuplot version '5.4 patchlevel 3   ' (from 'gnuplot') and image format 'png'.
--
-- Detected 8 CPUs and 32000 gigabytes of memory on the local machine.
--
-- Detected Slurm with 'sinfo' binary in /opt/slurm/latest/bin/sinfo.
--          Slurm disabled by useGrid=false
--
-- Local machine mode enabled; grid support not detected or not allowed.
--
--                                (tag)Concurrency
--                         (tag)Threads          |
--                (tag)Memory         |          |
--        (tag)             |         |          |       total usage      algorithm
--        -------  ----------  --------   --------  --------------------  -----------------------------
-- Local: meryl     12.000 GB    4 CPUs x   2 jobs    24.000 GB   8 CPUs  (k-mer counting)
-- Local: hap        8.000 GB    4 CPUs x   2 jobs    16.000 GB   8 CPUs  (read-to-haplotype assignment)
-- Local: cormhap    6.000 GB    8 CPUs x   1 job      6.000 GB   8 CPUs  (overlap detection with mhap)
-- Local: obtovl     4.000 GB    8 CPUs x   1 job      4.000 GB   8 CPUs  (overlap detection)
-- Local: utgovl     4.000 GB    8 CPUs x   1 job      4.000 GB   8 CPUs  (overlap detection)
-- Local: cor        -.--- GB    4 CPUs x   - jobs     -.--- GB   - CPUs  (read correction)
-- Local: ovb        4.000 GB    1 CPU  x   8 jobs    32.000 GB   8 CPUs  (overlap store bucketizer)
-- Local: ovs        8.000 GB    1 CPU  x   8 jobs    64.000 GB   8 CPUs  (overlap store sorting)
-- Local: red       16.000 GB    4 CPUs x   2 jobs    32.000 GB   8 CPUs  (read error detection)
-- Local: oea        8.000 GB    1 CPU  x   8 jobs    64.000 GB   8 CPUs  (overlap error adjustment)
-- Local: bat       16.000 GB    4 CPUs x   1 job     16.000 GB   4 CPUs  (contig construction with bogart)
-- Local: cns        -.--- GB    4 CPUs x   - jobs     -.--- GB   - CPUs  (consensus)
--
-- Found trimmed raw PacBio HiFi reads in the input files.
--
-- Generating assembly 'R5_Diff' in '/mnt/shared/scratch/tadams/smrtrenseq_assembly/assembly/R5_Diff':
--   genomeSize:
--     3000000
--
--   Overlap Generation Limits:
--     corOvlErrorRate 0.0000 (  0.00%)
--     obtOvlErrorRate 0.0250 (  2.50%)
--     utgOvlErrorRate 0.0100 (  1.00%)
--
--   Overlap Processing Limits:
--     corErrorRate    0.0000 (  0.00%)
--     obtErrorRate    0.0250 (  2.50%)
--     utgErrorRate    0.0003 (  0.03%)
--     cnsErrorRate    0.0500 (  5.00%)
--
--   Stages to run:
--     assemble HiFi reads.
--
--
-- Correction skipped; not enabled.
--
-- Trimming skipped; not enabled.
--
-- BEGIN ASSEMBLY
----------------------------------------
-- Starting command on Wed Jan 10 14:30:41 2024 with 333650.972 GB free disk space

    cd .
    ./R5_Diff.seqStore.sh \
    > ./R5_Diff.seqStore.err 2>&1

-- Finished on Wed Jan 10 14:35:12 2024 (271 seconds) with 333630.396 GB free disk space
----------------------------------------
--
-- In sequence store './R5_Diff.seqStore':
--   Found 2333307 reads.
--   Found 7994890136 bases (2664.96 times coverage).
--    Histogram of corrected reads:
--    
--    G=7994890136                       sum of  ||               length     num
--    NG         length     index       lengths  ||                range    seqs
--    ----- ------------ --------- ------------  ||  ------------------- -------
--    00010         4347    168646    799492570  ||       1213-1416          275|-
--    00020         4016    360574   1598979362  ||       1417-1620         4363|-
--    00030         3793    565667   2398469084  ||       1621-1824         7005|--
--    00040         3615    781716   3197957216  ||       1825-2028         8826|--
--    00050         3455   1008032   3997445645  ||       2029-2232        11190|--
--    00060         3306   1244597   4796934382  ||       2233-2436        21838|----
--    00070         3163   1491861   5596425719  ||       2437-2640        73702|--------------
--    00080         3011   1750800   6395914266  ||       2641-2844       201819|------------------------------------
--    00090         2829   2024248   7195402597  ||       2845-3048       315656|---------------------------------------------------------
--    00100         1213   2333306   7994890136  ||       3049-3252       353836|---------------------------------------------------------------
--    001.000x             2333307   7994890136  ||       3253-3456       329207|-----------------------------------------------------------
--                                               ||       3457-3660       284218|---------------------------------------------------
--                                               ||       3661-3864       228711|-----------------------------------------
--                                               ||       3865-4068       170514|-------------------------------
--                                               ||       4069-4272       119362|----------------------
--                                               ||       4273-4476        80711|---------------
--                                               ||       4477-4680        50397|---------
--                                               ||       4681-4884        30332|------
--                                               ||       4885-5088        17353|----
--                                               ||       5089-5292         9832|--
--                                               ||       5293-5496         5466|-
--                                               ||       5497-5700         3191|-
--                                               ||       5701-5904         1940|-
--                                               ||       5905-6108         1138|-
--                                               ||       6109-6312          725|-
--                                               ||       6313-6516          476|-
--                                               ||       6517-6720          315|-
--                                               ||       6721-6924          234|-
--                                               ||       6925-7128          157|-
--                                               ||       7129-7332          114|-
--                                               ||       7333-7536          114|-
--                                               ||       7537-7740           71|-
--                                               ||       7741-7944           53|-
--                                               ||       7945-8148           34|-
--                                               ||       8149-8352           25|-
--                                               ||       8353-8556           31|-
--                                               ||       8557-8760           19|-
--                                               ||       8761-8964           21|-
--                                               ||       8965-9168            7|-
--                                               ||       9169-9372            7|-
--                                               ||       9373-9576            7|-
--                                               ||       9577-9780            5|-
--                                               ||       9781-9984            0|
--                                               ||       9985-10188           4|-
--                                               ||      10189-10392           0|
--                                               ||      10393-10596           2|-
--                                               ||      10597-10800           3|-
--                                               ||      10801-11004           0|
--                                               ||      11005-11208           0|
--                                               ||      11209-11412           1|-
--    
--
-- In sequence store './R5_Diff.seqStore':
--   Found 2333307 reads.
--   Found 7994890136 bases (2664.96 times coverage).
--    Histogram of corrected-trimmed reads:
--    
--    G=7994890136                       sum of  ||               length     num
--    NG         length     index       lengths  ||                range    seqs
--    ----- ------------ --------- ------------  ||  ------------------- -------
--    00010         4347    168646    799492570  ||       1213-1416          275|-
--    00020         4016    360574   1598979362  ||       1417-1620         4363|-
--    00030         3793    565667   2398469084  ||       1621-1824         7005|--
--    00040         3615    781716   3197957216  ||       1825-2028         8826|--
--    00050         3455   1008032   3997445645  ||       2029-2232        11190|--
--    00060         3306   1244597   4796934382  ||       2233-2436        21838|----
--    00070         3163   1491861   5596425719  ||       2437-2640        73702|--------------
--    00080         3011   1750800   6395914266  ||       2641-2844       201819|------------------------------------
--    00090         2829   2024248   7195402597  ||       2845-3048       315656|---------------------------------------------------------
--    00100         1213   2333306   7994890136  ||       3049-3252       353836|---------------------------------------------------------------
--    001.000x             2333307   7994890136  ||       3253-3456       329207|-----------------------------------------------------------
--                                               ||       3457-3660       284218|---------------------------------------------------
--                                               ||       3661-3864       228711|-----------------------------------------
--                                               ||       3865-4068       170514|-------------------------------
--                                               ||       4069-4272       119362|----------------------
--                                               ||       4273-4476        80711|---------------
--                                               ||       4477-4680        50397|---------
--                                               ||       4681-4884        30332|------
--                                               ||       4885-5088        17353|----
--                                               ||       5089-5292         9832|--
--                                               ||       5293-5496         5466|-
--                                               ||       5497-5700         3191|-
--                                               ||       5701-5904         1940|-
--                                               ||       5905-6108         1138|-
--                                               ||       6109-6312          725|-
--                                               ||       6313-6516          476|-
--                                               ||       6517-6720          315|-
--                                               ||       6721-6924          234|-
--                                               ||       6925-7128          157|-
--                                               ||       7129-7332          114|-
--                                               ||       7333-7536          114|-
--                                               ||       7537-7740           71|-
--                                               ||       7741-7944           53|-
--                                               ||       7945-8148           34|-
--                                               ||       8149-8352           25|-
--                                               ||       8353-8556           31|-
--                                               ||       8557-8760           19|-
--                                               ||       8761-8964           21|-
--                                               ||       8965-9168            7|-
--                                               ||       9169-9372            7|-
--                                               ||       9373-9576            7|-
--                                               ||       9577-9780            5|-
--                                               ||       9781-9984            0|
--                                               ||       9985-10188           4|-
--                                               ||      10189-10392           0|
--                                               ||      10393-10596           2|-
--                                               ||      10597-10800           3|-
--                                               ||      10801-11004           0|
--                                               ||      11005-11208           0|
--                                               ||      11209-11412           1|-
--    
----------------------------------------
-- Starting command on Wed Jan 10 14:35:31 2024 with 333626.912 GB free disk space

    cd unitigging/0-mercounts
    ./meryl-configure.sh \
    > ./meryl-configure.err 2>&1

-- Finished on Wed Jan 10 14:35:39 2024 (8 seconds) with 333625.849 GB free disk space
----------------------------------------
--  segments   memory batches
--  -------- -------- -------
--        01  9.46 GB       3
--        02  9.46 GB       2
--        04  4.93 GB       2
--        06  3.47 GB       2
--        08  2.49 GB       2
--        12  1.84 GB       2
--        16  1.38 GB       2
--        20  1.11 GB       2
--        24  0.92 GB       2
--        32  0.69 GB       2
--        40  0.56 GB       2
--        48  0.47 GB       2
--        56  0.40 GB       2
--        64  0.35 GB       2
--
--  For 2333307 reads with 7994890136 bases, limit to 79 batches.
--  Will count kmers using 02 jobs, each using 11 GB and 4 threads.
--
-- Finished stage 'merylConfigure', reset canuIteration.
--
-- Running jobs.  First attempt out of 2.
----------------------------------------
-- Starting 'meryl' concurrent execution on Wed Jan 10 14:35:40 2024 with 333625.849 GB free disk space (2 processes; 2 concurrently)

    cd unitigging/0-mercounts
    ./meryl-count.sh 1 > ./meryl-count.000001.out 2>&1
    ./meryl-count.sh 2 > ./meryl-count.000002.out 2>&1

-- Finished on Wed Jan 10 14:41:58 2024 (377 seconds) with 333591.04 GB free disk space
----------------------------------------
-- Found 2 Kmer counting (meryl) outputs.
-- Finished stage 'utg-merylCountCheck', reset canuIteration.
--
-- Running jobs.  First attempt out of 2.
----------------------------------------
-- Starting 'meryl' concurrent execution on Wed Jan 10 14:41:58 2024 with 333591.04 GB free disk space (1 processes; 2 concurrently)

    cd unitigging/0-mercounts
    ./meryl-process.sh 1 > ./meryl-process.000001.out 2>&1

-- Finished on Wed Jan 10 14:42:14 2024 (16 seconds) with 333591.649 GB free disk space
----------------------------------------
-- Meryl finished successfully.  Kmer frequency histogram:
--
--  22-mers                                                                                           Fraction
--    Occurrences   NumMers                                                                         Unique Total
--       1-     1         0                                                                        0.0000 0.0000
--       2-     2  35524773 *********************************************************************  0.2376 0.0131
--       3-     4  35759613 ********************************************************************** 0.3818 0.0250
--       5-     7  23648962 **********************************************                         0.5441 0.0448
--       8-    11  14737520 ****************************                                           0.6665 0.0679
--      12-    16   9565518 ******************                                                     0.7497 0.0915
--      17-    22   6283676 ************                                                           0.8062 0.1145
--      23-    29   4255993 ********                                                               0.8444 0.1358
--      30-    37   2973406 *****                                                                  0.8708 0.1553
--      38-    46   2164854 ****                                                                   0.8897 0.1730
--      47-    56   1643014 ***                                                                    0.9035 0.1893
--      57-    67   1291981 **                                                                     0.9141 0.2046
--      68-    79   1050301 **                                                                     0.9225 0.2191
--      80-    92    887431 *                                                                      0.9293 0.2331
--      93-   106    768651 *                                                                      0.9352 0.2471
--     107-   121    683595 *                                                                      0.9402 0.2611
--     122-   137    626114 *                                                                      0.9448 0.2754
--     138-   154    585540 *                                                                      0.9489 0.2903
--     155-   172    573298 *                                                                      0.9528 0.3060
--     173-   191    543951 *                                                                      0.9566 0.3233
--     192-   211    534642 *                                                                      0.9602 0.3416
--     212-   232    501893                                                                        0.9638 0.3614
--     233-   254    457086                                                                        0.9671 0.3820
--     255-   277    417093                                                                        0.9702 0.4024
--     278-   301    357954                                                                        0.9729 0.4228
--     302-   326    302063                                                                        0.9753 0.4418
--     327-   352    251473                                                                        0.9773 0.4591
--     353-   379    223894                                                                        0.9790 0.4748
--     380-   407    207926                                                                        0.9805 0.4898
--     408-   436    195598                                                                        0.9819 0.5049
--     437-   466    192326                                                                        0.9832 0.5201
--     467-   497    182961                                                                        0.9845 0.5361
--     498-   529    167491                                                                        0.9857 0.5523
--     530-   562    155961                                                                        0.9868 0.5681
--     563-   596    139677                                                                        0.9878 0.5838
--     597-   631    130299                                                                        0.9888 0.5987
--     632-   667    115708                                                                        0.9896 0.6134
--     668-   704    105166                                                                        0.9904 0.6272
--     705-   742    101477                                                                        0.9911 0.6405
--     743-   781     99707                                                                        0.9918 0.6540
--     782-   821     93305                                                                        0.9924 0.6680
--
--           0 (max occurrences)
--  5424528492 (total mers, non-unique)
--   149541210 (distinct mers, non-unique)
--           0 (unique mers)
-- Finished stage 'meryl-process', reset canuIteration.
--
-- Removing meryl database 'unitigging/0-mercounts/R5_Diff.ms22'.
--
-- OVERLAPPER (normal) (assembly) erate=0.01
--
----------------------------------------
-- Starting command on Wed Jan 10 14:42:15 2024 with 333591.649 GB free disk space

    cd unitigging/1-overlapper
    /mnt/shared/scratch/tadams/smrtrenseq_assembly/.snakemake/conda/d198b113b159549338ae9bea44596cd3_/bin/overlapInCorePartition \
     -S  ../../R5_Diff.seqStore \
     -hl 80000000 \
     -rl 1000000000 \
     -ol 500 \
     -o  ./R5_Diff.partition \
    > ./R5_Diff.partition.err 2>&1

-- Finished on Wed Jan 10 14:42:17 2024 (2 seconds) with 333591.649 GB free disk space
----------------------------------------
--
-- Configured 234 overlapInCore jobs.
-- Finished stage 'utg-overlapConfigure', reset canuIteration.
--
-- Running jobs.  First attempt out of 2.
----------------------------------------
-- Starting 'utgovl' concurrent execution on Wed Jan 10 14:42:17 2024 with 333591.649 GB free disk space (234 processes; 1 concurrently)

    cd unitigging/1-overlapper
    ./overlap.sh 1 > ./overlap.000001.out 2>&1
    ./overlap.sh 2 > ./overlap.000002.out 2>&1
    ./overlap.sh 3 > ./overlap.000003.out 2>&1
    ./overlap.sh 4 > ./overlap.000004.out 2>&1
    ./overlap.sh 5 > ./overlap.000005.out 2>&1
    ./overlap.sh 6 > ./overlap.000006.out 2>&1
    ./overlap.sh 7 > ./overlap.000007.out 2>&1
    ./overlap.sh 8 > ./overlap.000008.out 2>&1
    ./overlap.sh 9 > ./overlap.000009.out 2>&1
    ./overlap.sh 10 > ./overlap.000010.out 2>&1
    ./overlap.sh 11 > ./overlap.000011.out 2>&1
    ./overlap.sh 12 > ./overlap.000012.out 2>&1
    ./overlap.sh 13 > ./overlap.000013.out 2>&1
    ./overlap.sh 14 > ./overlap.000014.out 2>&1
    ./overlap.sh 15 > ./overlap.000015.out 2>&1
    ./overlap.sh 16 > ./overlap.000016.out 2>&1
    ./overlap.sh 17 > ./overlap.000017.out 2>&1
    ./overlap.sh 18 > ./overlap.000018.out 2>&1
    ./overlap.sh 19 > ./overlap.000019.out 2>&1
    ./overlap.sh 20 > ./overlap.000020.out 2>&1
    ./overlap.sh 21 > ./overlap.000021.out 2>&1
    ./overlap.sh 22 > ./overlap.000022.out 2>&1
    ./overlap.sh 23 > ./overlap.000023.out 2>&1
    ./overlap.sh 24 > ./overlap.000024.out 2>&1
    ./overlap.sh 25 > ./overlap.000025.out 2>&1
    ./overlap.sh 26 > ./overlap.000026.out 2>&1
    ./overlap.sh 27 > ./overlap.000027.out 2>&1
    ./overlap.sh 28 > ./overlap.000028.out 2>&1
    ./overlap.sh 29 > ./overlap.000029.out 2>&1
    ./overlap.sh 30 > ./overlap.000030.out 2>&1
    ./overlap.sh 31 > ./overlap.000031.out 2>&1
    ./overlap.sh 32 > ./overlap.000032.out 2>&1
    ./overlap.sh 33 > ./overlap.000033.out 2>&1
    ./overlap.sh 34 > ./overlap.000034.out 2>&1
    ./overlap.sh 35 > ./overlap.000035.out 2>&1
    ./overlap.sh 36 > ./overlap.000036.out 2>&1
    ./overlap.sh 37 > ./overlap.000037.out 2>&1
    ./overlap.sh 38 > ./overlap.000038.out 2>&1
    ./overlap.sh 39 > ./overlap.000039.out 2>&1
    ./overlap.sh 40 > ./overlap.000040.out 2>&1
    ./overlap.sh 41 > ./overlap.000041.out 2>&1
    ./overlap.sh 42 > ./overlap.000042.out 2>&1
    ./overlap.sh 43 > ./overlap.000043.out 2>&1
    ./overlap.sh 44 > ./overlap.000044.out 2>&1
    ./overlap.sh 45 > ./overlap.000045.out 2>&1
    ./overlap.sh 46 > ./overlap.000046.out 2>&1
    ./overlap.sh 47 > ./overlap.000047.out 2>&1
    ./overlap.sh 48 > ./overlap.000048.out 2>&1
    ./overlap.sh 49 > ./overlap.000049.out 2>&1
    ./overlap.sh 50 > ./overlap.000050.out 2>&1
    ./overlap.sh 51 > ./overlap.000051.out 2>&1
    ./overlap.sh 52 > ./overlap.000052.out 2>&1
    ./overlap.sh 53 > ./overlap.000053.out 2>&1
    ./overlap.sh 54 > ./overlap.000054.out 2>&1
    ./overlap.sh 55 > ./overlap.000055.out 2>&1
    ./overlap.sh 56 > ./overlap.000056.out 2>&1
    ./overlap.sh 57 > ./overlap.000057.out 2>&1
    ./overlap.sh 58 > ./overlap.000058.out 2>&1
    ./overlap.sh 59 > ./overlap.000059.out 2>&1
    ./overlap.sh 60 > ./overlap.000060.out 2>&1
    ./overlap.sh 61 > ./overlap.000061.out 2>&1
    ./overlap.sh 62 > ./overlap.000062.out 2>&1
    ./overlap.sh 63 > ./overlap.000063.out 2>&1
    ./overlap.sh 64 > ./overlap.000064.out 2>&1
    ./overlap.sh 65 > ./overlap.000065.out 2>&1
    ./overlap.sh 66 > ./overlap.000066.out 2>&1
    ./overlap.sh 67 > ./overlap.000067.out 2>&1
    ./overlap.sh 68 > ./overlap.000068.out 2>&1
    ./overlap.sh 69 > ./overlap.000069.out 2>&1
    ./overlap.sh 70 > ./overlap.000070.out 2>&1
    ./overlap.sh 71 > ./overlap.000071.out 2>&1
    ./overlap.sh 72 > ./overlap.000072.out 2>&1
    ./overlap.sh 73 > ./overlap.000073.out 2>&1
    ./overlap.sh 74 > ./overlap.000074.out 2>&1
    ./overlap.sh 75 > ./overlap.000075.out 2>&1
    ./overlap.sh 76 > ./overlap.000076.out 2>&1
    ./overlap.sh 77 > ./overlap.000077.out 2>&1
    ./overlap.sh 78 > ./overlap.000078.out 2>&1
    ./overlap.sh 79 > ./overlap.000079.out 2>&1
    ./overlap.sh 80 > ./overlap.000080.out 2>&1
    ./overlap.sh 81 > ./overlap.000081.out 2>&1
    ./overlap.sh 82 > ./overlap.000082.out 2>&1
    ./overlap.sh 83 > ./overlap.000083.out 2>&1
    ./overlap.sh 84 > ./overlap.000084.out 2>&1
    ./overlap.sh 85 > ./overlap.000085.out 2>&1
    ./overlap.sh 86 > ./overlap.000086.out 2>&1
    ./overlap.sh 87 > ./overlap.000087.out 2>&1
    ./overlap.sh 88 > ./overlap.000088.out 2>&1
    ./overlap.sh 89 > ./overlap.000089.out 2>&1
    ./overlap.sh 90 > ./overlap.000090.out 2>&1
    ./overlap.sh 91 > ./overlap.000091.out 2>&1
    ./overlap.sh 92 > ./overlap.000092.out 2>&1
    ./overlap.sh 93 > ./overlap.000093.out 2>&1
    ./overlap.sh 94 > ./overlap.000094.out 2>&1
    ./overlap.sh 95 > ./overlap.000095.out 2>&1
    ./overlap.sh 96 > ./overlap.000096.out 2>&1
    ./overlap.sh 97 > ./overlap.000097.out 2>&1
    ./overlap.sh 98 > ./overlap.000098.out 2>&1
    ./overlap.sh 99 > ./overlap.000099.out 2>&1
    ./overlap.sh 100 > ./overlap.000100.out 2>&1
    ./overlap.sh 101 > ./overlap.000101.out 2>&1
    ./overlap.sh 102 > ./overlap.000102.out 2>&1
    ./overlap.sh 103 > ./overlap.000103.out 2>&1
    ./overlap.sh 104 > ./overlap.000104.out 2>&1
    ./overlap.sh 105 > ./overlap.000105.out 2>&1
    ./overlap.sh 106 > ./overlap.000106.out 2>&1
    ./overlap.sh 107 > ./overlap.000107.out 2>&1
    ./overlap.sh 108 > ./overlap.000108.out 2>&1
    ./overlap.sh 109 > ./overlap.000109.out 2>&1
    ./overlap.sh 110 > ./overlap.000110.out 2>&1
    ./overlap.sh 111 > ./overlap.000111.out 2>&1
    ./overlap.sh 112 > ./overlap.000112.out 2>&1
    ./overlap.sh 113 > ./overlap.000113.out 2>&1
    ./overlap.sh 114 > ./overlap.000114.out 2>&1
    ./overlap.sh 115 > ./overlap.000115.out 2>&1
    ./overlap.sh 116 > ./overlap.000116.out 2>&1
    ./overlap.sh 117 > ./overlap.000117.out 2>&1
    ./overlap.sh 118 > ./overlap.000118.out 2>&1
    ./overlap.sh 119 > ./overlap.000119.out 2>&1
    ./overlap.sh 120 > ./overlap.000120.out 2>&1
    ./overlap.sh 121 > ./overlap.000121.out 2>&1
    ./overlap.sh 122 > ./overlap.000122.out 2>&1
    ./overlap.sh 123 > ./overlap.000123.out 2>&1
    ./overlap.sh 124 > ./overlap.000124.out 2>&1
    ./overlap.sh 125 > ./overlap.000125.out 2>&1
    ./overlap.sh 126 > ./overlap.000126.out 2>&1
    ./overlap.sh 127 > ./overlap.000127.out 2>&1
    ./overlap.sh 128 > ./overlap.000128.out 2>&1
    ./overlap.sh 129 > ./overlap.000129.out 2>&1
    ./overlap.sh 130 > ./overlap.000130.out 2>&1
    ./overlap.sh 131 > ./overlap.000131.out 2>&1
    ./overlap.sh 132 > ./overlap.000132.out 2>&1
    ./overlap.sh 133 > ./overlap.000133.out 2>&1
    ./overlap.sh 134 > ./overlap.000134.out 2>&1
    ./overlap.sh 135 > ./overlap.000135.out 2>&1
    ./overlap.sh 136 > ./overlap.000136.out 2>&1
    ./overlap.sh 137 > ./overlap.000137.out 2>&1
    ./overlap.sh 138 > ./overlap.000138.out 2>&1
    ./overlap.sh 139 > ./overlap.000139.out 2>&1
    ./overlap.sh 140 > ./overlap.000140.out 2>&1
    ./overlap.sh 141 > ./overlap.000141.out 2>&1
    ./overlap.sh 142 > ./overlap.000142.out 2>&1
    ./overlap.sh 143 > ./overlap.000143.out 2>&1
    ./overlap.sh 144 > ./overlap.000144.out 2>&1
    ./overlap.sh 145 > ./overlap.000145.out 2>&1
    ./overlap.sh 146 > ./overlap.000146.out 2>&1
    ./overlap.sh 147 > ./overlap.000147.out 2>&1
    ./overlap.sh 148 > ./overlap.000148.out 2>&1
    ./overlap.sh 149 > ./overlap.000149.out 2>&1
    ./overlap.sh 150 > ./overlap.000150.out 2>&1
    ./overlap.sh 151 > ./overlap.000151.out 2>&1
    ./overlap.sh 152 > ./overlap.000152.out 2>&1
    ./overlap.sh 153 > ./overlap.000153.out 2>&1
    ./overlap.sh 154 > ./overlap.000154.out 2>&1
    ./overlap.sh 155 > ./overlap.000155.out 2>&1
    ./overlap.sh 156 > ./overlap.000156.out 2>&1
    ./overlap.sh 157 > ./overlap.000157.out 2>&1
    ./overlap.sh 158 > ./overlap.000158.out 2>&1
    ./overlap.sh 159 > ./overlap.000159.out 2>&1
    ./overlap.sh 160 > ./overlap.000160.out 2>&1
    ./overlap.sh 161 > ./overlap.000161.out 2>&1
    ./overlap.sh 162 > ./overlap.000162.out 2>&1
    ./overlap.sh 163 > ./overlap.000163.out 2>&1
    ./overlap.sh 164 > ./overlap.000164.out 2>&1
    ./overlap.sh 165 > ./overlap.000165.out 2>&1
    ./overlap.sh 166 > ./overlap.000166.out 2>&1
    ./overlap.sh 167 > ./overlap.000167.out 2>&1
    ./overlap.sh 168 > ./overlap.000168.out 2>&1
    ./overlap.sh 169 > ./overlap.000169.out 2>&1
    ./overlap.sh 170 > ./overlap.000170.out 2>&1
    ./overlap.sh 171 > ./overlap.000171.out 2>&1
    ./overlap.sh 172 > ./overlap.000172.out 2>&1
    ./overlap.sh 173 > ./overlap.000173.out 2>&1
    ./overlap.sh 174 > ./overlap.000174.out 2>&1
    ./overlap.sh 175 > ./overlap.000175.out 2>&1
    ./overlap.sh 176 > ./overlap.000176.out 2>&1
    ./overlap.sh 177 > ./overlap.000177.out 2>&1
    ./overlap.sh 178 > ./overlap.000178.out 2>&1
    ./overlap.sh 179 > ./overlap.000179.out 2>&1
    ./overlap.sh 180 > ./overlap.000180.out 2>&1
    ./overlap.sh 181 > ./overlap.000181.out 2>&1
    ./overlap.sh 182 > ./overlap.000182.out 2>&1
    ./overlap.sh 183 > ./overlap.000183.out 2>&1
    ./overlap.sh 184 > ./overlap.000184.out 2>&1
    ./overlap.sh 185 > ./overlap.000185.out 2>&1
    ./overlap.sh 186 > ./overlap.000186.out 2>&1
    ./overlap.sh 187 > ./overlap.000187.out 2>&1
    ./overlap.sh 188 > ./overlap.000188.out 2>&1
    ./overlap.sh 189 > ./overlap.000189.out 2>&1
    ./overlap.sh 190 > ./overlap.000190.out 2>&1
    ./overlap.sh 191 > ./overlap.000191.out 2>&1
    ./overlap.sh 192 > ./overlap.000192.out 2>&1
    ./overlap.sh 193 > ./overlap.000193.out 2>&1
    ./overlap.sh 194 > ./overlap.000194.out 2>&1
    ./overlap.sh 195 > ./overlap.000195.out 2>&1
    ./overlap.sh 196 > ./overlap.000196.out 2>&1
    ./overlap.sh 197 > ./overlap.000197.out 2>&1
    ./overlap.sh 198 > ./overlap.000198.out 2>&1
    ./overlap.sh 199 > ./overlap.000199.out 2>&1
    ./overlap.sh 200 > ./overlap.000200.out 2>&1
    ./overlap.sh 201 > ./overlap.000201.out 2>&1
    ./overlap.sh 202 > ./overlap.000202.out 2>&1
    ./overlap.sh 203 > ./overlap.000203.out 2>&1
    ./overlap.sh 204 > ./overlap.000204.out 2>&1
    ./overlap.sh 205 > ./overlap.000205.out 2>&1
    ./overlap.sh 206 > ./overlap.000206.out 2>&1
    ./overlap.sh 207 > ./overlap.000207.out 2>&1
    ./overlap.sh 208 > ./overlap.000208.out 2>&1
    ./overlap.sh 209 > ./overlap.000209.out 2>&1
    ./overlap.sh 210 > ./overlap.000210.out 2>&1
    ./overlap.sh 211 > ./overlap.000211.out 2>&1
    ./overlap.sh 212 > ./overlap.000212.out 2>&1
    ./overlap.sh 213 > ./overlap.000213.out 2>&1
    ./overlap.sh 214 > ./overlap.000214.out 2>&1
    ./overlap.sh 215 > ./overlap.000215.out 2>&1
    ./overlap.sh 216 > ./overlap.000216.out 2>&1
    ./overlap.sh 217 > ./overlap.000217.out 2>&1
    ./overlap.sh 218 > ./overlap.000218.out 2>&1
    ./overlap.sh 219 > ./overlap.000219.out 2>&1
    ./overlap.sh 220 > ./overlap.000220.out 2>&1
    ./overlap.sh 221 > ./overlap.000221.out 2>&1
    ./overlap.sh 222 > ./overlap.000222.out 2>&1
    ./overlap.sh 223 > ./overlap.000223.out 2>&1
    ./overlap.sh 224 > ./overlap.000224.out 2>&1
    ./overlap.sh 225 > ./overlap.000225.out 2>&1
    ./overlap.sh 226 > ./overlap.000226.out 2>&1
    ./overlap.sh 227 > ./overlap.000227.out 2>&1
    ./overlap.sh 228 > ./overlap.000228.out 2>&1
    ./overlap.sh 229 > ./overlap.000229.out 2>&1
    ./overlap.sh 230 > ./overlap.000230.out 2>&1
    ./overlap.sh 231 > ./overlap.000231.out 2>&1
    ./overlap.sh 232 > ./overlap.000232.out 2>&1
    ./overlap.sh 233 > ./overlap.000233.out 2>&1
    ./overlap.sh 234 > ./overlap.000234.out 2>&1

-- Finished on Thu Jan 11 13:09:24 2024 (80827 seconds, no bitcoins found either) with 332158.265 GB free disk space
----------------------------------------
-- Found 234 overlapInCore output files.
--
-- overlapInCore compute 'unitigging/1-overlapper':
--   kmer hits
--     with no overlap      45202412601  703.423077 +- 70517893.258
--     with an overlap        450843846  83.1025641 +- 713481.312
--
--   overlaps                 450843846  83.1025641 +- 713481.312
--     contained              104855967  .423076923 +- 167009.365
--     dovetail               345987879  0.67948718 +- 546890.625
--
--   overlaps rejected
--     multiple per pair              0           0 +- 0
--     bad short window               0           0 +- 0
--     bad long window                0           0 +- 0
-- Finished stage 'utg-overlapCheck', reset canuIteration.
----------------------------------------
-- Starting command on Thu Jan 11 13:09:28 2024 with 332158.265 GB free disk space

    cd unitigging
    /mnt/shared/scratch/tadams/smrtrenseq_assembly/.snakemake/conda/d198b113b159549338ae9bea44596cd3_/bin/ovStoreConfig \
     -S ../R5_Diff.seqStore \
     -M 4-8 \
     -L ./1-overlapper/ovljob.files \
     -create ./R5_Diff.ovlStore.config \
     > ./R5_Diff.ovlStore.config.txt \
    2> ./R5_Diff.ovlStore.config.err

-- Finished on Thu Jan 11 13:10:02 2024 (34 seconds) with 332154.583 GB free disk space
----------------------------------------
--
-- Creating overlap store unitigging/R5_Diff.ovlStore using:
--      4 buckets
--      4 slices
--        using at most 8 GB memory each
-- Finished stage 'utg-overlapStoreConfigure', reset canuIteration.
--
-- Running jobs.  First attempt out of 2.
----------------------------------------
-- Starting 'ovB' concurrent execution on Thu Jan 11 13:10:02 2024 with 332154.583 GB free disk space (4 processes; 8 concurrently)

    cd unitigging/R5_Diff.ovlStore.BUILDING
    ./scripts/1-bucketize.sh 1 > ./logs/1-bucketize.000001.out 2>&1
    ./scripts/1-bucketize.sh 2 > ./logs/1-bucketize.000002.out 2>&1
    ./scripts/1-bucketize.sh 3 > ./logs/1-bucketize.000003.out 2>&1
    ./scripts/1-bucketize.sh 4 > ./logs/1-bucketize.000004.out 2>&1

-- Finished on Thu Jan 11 13:11:41 2024 (99 seconds) with 332132.263 GB free disk space
----------------------------------------
-- Overlap store bucketizer finished.
-- Finished stage 'utg-overlapStoreBucketizerCheck', reset canuIteration.
--
-- Running jobs.  First attempt out of 2.
----------------------------------------
-- Starting 'ovS' concurrent execution on Thu Jan 11 13:11:41 2024 with 332132.263 GB free disk space (4 processes; 8 concurrently)

    cd unitigging/R5_Diff.ovlStore.BUILDING
    ./scripts/2-sort.sh 1 > ./logs/2-sort.000001.out 2>&1
    ./scripts/2-sort.sh 2 > ./logs/2-sort.000002.out 2>&1
    ./scripts/2-sort.sh 3 > ./logs/2-sort.000003.out 2>&1
    ./scripts/2-sort.sh 4 > ./logs/2-sort.000004.out 2>&1

-- Finished on Thu Jan 11 13:13:39 2024 (118 seconds) with 332121.307 GB free disk space
----------------------------------------
-- Overlap store sorter finished.
-- Finished stage 'utg-overlapStoreSorterCheck', reset canuIteration.
----------------------------------------
-- Starting command on Thu Jan 11 13:13:39 2024 with 332121.307 GB free disk space

    cd unitigging
    /mnt/shared/scratch/tadams/smrtrenseq_assembly/.snakemake/conda/d198b113b159549338ae9bea44596cd3_/bin/ovStoreIndexer \
      -O  ./R5_Diff.ovlStore.BUILDING \
      -S ../R5_Diff.seqStore \
      -C  ./R5_Diff.ovlStore.config \
      -delete \
    > ./R5_Diff.ovlStore.BUILDING.index.err 2>&1

-- Finished on Thu Jan 11 13:13:41 2024 (2 seconds) with 332121.307 GB free disk space
----------------------------------------
-- Overlap store indexer finished.
-- Checking store.
----------------------------------------
-- Starting command on Thu Jan 11 13:13:41 2024 with 332121.307 GB free disk space

    cd unitigging
    /mnt/shared/scratch/tadams/smrtrenseq_assembly/.snakemake/conda/d198b113b159549338ae9bea44596cd3_/bin/ovStoreDump \
     -S ../R5_Diff.seqStore \
     -O  ./R5_Diff.ovlStore \
     -counts \
     > ./R5_Diff.ovlStore/counts.dat 2> ./R5_Diff.ovlStore/counts.err

-- Finished on Thu Jan 11 13:13:41 2024 (like a bat out of hell) with 332121.307 GB free disk space
----------------------------------------
--
-- Overlap store 'unitigging/R5_Diff.ovlStore' successfully constructed.
-- Found 901687692 overlaps for 2296996 reads; 36311 reads have no overlaps.
--
--
-- Purged 9.719 GB in 702 overlap output files.
----------------------------------------
-- Starting command on Thu Jan 11 13:13:44 2024 with 332121.307 GB free disk space

    cd unitigging
    /mnt/shared/scratch/tadams/smrtrenseq_assembly/.snakemake/conda/d198b113b159549338ae9bea44596cd3_/bin/ovStoreStats \
     -C 2664.96 \
     -S ../R5_Diff.seqStore \
     -O  ./R5_Diff.ovlStore \
     -o  ./R5_Diff.ovlStore \
     > ./R5_Diff.ovlStore.summary.err 2>&1

-- Finished on Thu Jan 11 13:17:18 2024 (214 seconds) with 332116.845 GB free disk space
----------------------------------------
--
-- Overlap store 'unitigging/R5_Diff.ovlStore' contains:
--
--   category            reads     %          read length        feature size or coverage  analysis
--   ----------------  -------  -------  ----------------------  ------------------------  --------------------
--   middle-missing       3609    0.15     2999.50 +- 989.76         320.68 +- 340.63     (bad trimming)
--   middle-hump          7147    0.31     2633.04 +- 518.07         387.16 +- 438.26     (bad trimming)
--   no-5-prime          50551    2.17     2421.89 +- 534.60         452.36 +- 519.29     (bad trimming)
--   no-3-prime          51118    2.19     2427.17 +- 530.15         455.15 +- 522.32     (bad trimming)
--   
--   low-coverage      2039383   87.40     2380.77 +- 413.00         208.23 +- 182.19     (easy to assemble, potential for lower quality consensus)
--   unique              28899    1.24     2017.35 +- 340.83        2087.37 +- 869.77     (easy to assemble, perfect, yay)
--   repeat-cont            44    0.00     1063.57 +- 35.50         6610.16 +- 950.51     (potential for consensus errors, no impact on assembly)
--   repeat-dove             0    0.00        0.00 +- 0.00             0.00 +- 0.00       (hard to assemble, likely won't assemble correctly or even at all)
--   
--   span-repeat         54971    2.36     2439.22 +- 425.64         998.25 +- 615.14     (read spans a large repeat, usually easy to assemble)
--   uniq-repeat-cont    61005    2.61     2364.88 +- 372.02                              (should be uniquely placed, low potential for consensus errors, no impact on assembly)
--   uniq-repeat-dove      269    0.01     3330.01 +- 551.13                              (will end contigs, potential to misassemble)
--   uniq-anchor             0    0.00        0.00 +- 0.00             0.00 +- 0.00       (repeat read, with unique section, probable bad read)
-- Finished stage 'utg-createOverlapStore', reset canuIteration.
--
-- Loading read lengths.
-- Loading number of overlaps per read.
--
-- Configure RED for 16gb memory.
--                   Batches of at most (unlimited) reads.
--                                      500000000 bases.
--                   Expecting evidence of at most 536870912 bases per iteration.
--
--           Total                                               Reads                 Olaps Evidence
--    Job   Memory      Read Range         Reads        Bases   Memory        Olaps   Memory   Memory  (Memory in MB)
--   ---- -------- ------------------- --------- ------------ -------- ------------ -------- --------
--      1 16384.24         1-31178         31178    107969122 13180.81     11484643   131.43  1024.00
--      2 16384.03     31179-62261         31083    107943027 13177.62     11745426   134.42  1024.00
--      3 16384.31     62262-93298         31037    107915195 13174.22     12066755   138.09  1024.00
--      4 16384.14     93299-124309        31011    107911577 13173.78     12090851   138.37  1024.00
--      5 16384.19    124310-155353        31044    107917135 13174.46     12035706   137.74  1024.00
--      6 16384.41    155354-186448        31095    107908564 13173.41     12146145   139.00  1024.00
--      7 16384.12    186449-217788        31340    107916150 13174.34     12039193   137.78  1024.00
--      8 16384.06    217789-249176        31388    107913049 13173.97     12067070   138.10  1024.00
--      9 16384.27    249177-280541        31365    107908427 13173.40     12133991   138.86  1024.00
--     10 16384.23    280542-311953        31412    107899868 13172.36     12222358   139.87  1024.00
--     11 16384.14    311954-343448        31495    107911685 13173.80     12087914   138.34  1024.00
--     12 16384.29    343449-375034        31586    107886485 13170.73     12370018   141.56  1024.00
--     13 16384.33    375035-406689        31655    107893730 13171.62     12295645   140.71  1024.00
--     14 16384.11    406690-438376        31687    107899343 13172.30     12216091   139.80  1024.00
--     15 16384.22    438377-470002        31626    107890273 13171.19     12323071   141.03  1024.00
--     16 16384.05    470003-501648        31646    107892425 13171.46     12284957   140.59  1024.00
--     17 16384.36    501649-533373        31725    107894302 13171.69     12292180   140.67  1024.00
--     18 16384.17    533374-565129        31756    107895566 13171.84     12261633   140.32  1024.00
--     19 16384.36    565130-596893        31764    107903080 13172.76     12198291   139.60  1024.00
--     20 16384.22    596894-628625        31732    107890035 13171.17     12325387   141.05  1024.00
--     21 16384.13    628626-660328        31703    107898052 13172.15     12231536   139.98  1024.00
--     22 16384.19    660329-692088        31760    107897918 13172.13     12238756   140.06  1024.00
--     23 16384.30    692089-723896        31808    107887939 13170.92     12354706   141.39  1024.00
--     24 16384.11    723897-755668        31772    107894500 13171.72     12268108   140.40  1024.00
--     25 16384.12    755669-787487        31819    107897947 13172.14     12232154   139.99  1024.00
--     26 16384.03    787488-819204        31717    107904231 13172.90     12157413   139.13  1024.00
--     27 16384.21    819205-850927        31723    107902476 13172.69     12191821   139.52  1024.00
--     28 16384.39    850928-882664        31737    107902874 13172.74     12202882   139.65  1024.00
--     29 16384.47    882665-914343        31679    107903128 13172.77     12207916   139.71  1024.00
--     30 16384.21    914344-946061        31718    107892256 13171.44     12300705   140.77  1024.00
--     31 16384.11    946062-977667        31606    107896050 13171.90     12251517   140.21  1024.00
--     32 16384.19    977668-1009244       31577    107915918 13174.32     12046642   137.86  1024.00
--     33 16384.05   1009245-1040737       31493    107903471 13172.80     12167849   139.25  1024.00
--     34 16384.26   1040738-1072143       31406    107898301 13172.17     12241526   140.09  1024.00
--     35 16384.24   1072144-1103276       31133    107919244 13174.72     12017228   137.53  1024.00
--     36 16384.23   1103277-1134308       31032    107914585 13174.14     12066151   138.09  1024.00
--     37 16384.33   1134309-1165385       31077    107907447 13173.27     12150623   139.05  1024.00
--     38 16384.16   1165386-1196415       31030    107924899 13175.40     11949628   136.75  1024.00
--     39 16384.11   1196416-1227747       31332    107897077 13172.02     12241320   140.09  1024.00
--     40 16384.12   1227748-1259606       31859    107885851 13170.66     12360392   141.45  1024.00
--     41 16384.38   1259607-1291277       31671    107901916 13172.62     12212352   139.76  1024.00
--     42 16384.05   1291278-1322764       31487    107908664 13173.44     12112446   138.62  1024.00
--     43 16384.21   1322765-1354188       31424    107909987 13173.59     12112532   138.62  1024.00
--     44 16384.14   1354189-1385511       31323    107909794 13173.57     12108533   138.57  1024.00
--     45 16384.27   1385512-1416819       31308    107909238 13173.50     12126359   138.78  1024.00
--     46 16384.19   1416820-1448208       31389    107907868 13173.34     12133108   138.85  1024.00
--     47 16384.13   1448209-1479623       31415    107907862 13173.34     12128059   138.79  1024.00
--     48 16384.12   1479624-1511050       31427    107885675 13170.63     12363449   141.49  1024.00
--     49 16384.15   1511051-1542535       31485    107905606 13173.06     12154107   139.09  1024.00
--     50 16384.05   1542536-1573975       31440    107913287 13174.00     12063126   138.05  1024.00
--     51 16384.46   1573976-1605487       31512    107899353 13172.30     12247500   140.16  1024.00
--     52 16384.25   1605488-1637027       31540    107898509 13172.20     12238038   140.05  1024.00
--     53 16384.26   1637028-1668724       31697    107881014 13170.07     12425056   142.19  1024.00
--     54 16384.17   1668725-1700332       31608    107896444 13171.95     12252711   140.22  1024.00
--     55 16384.12   1700333-1731976       31644    107890324 13171.20     12313547   140.92  1024.00
--     56 16384.10   1731977-1763559       31583    107895554 13171.84     12255876   140.26  1024.00
--     57 16384.32   1763560-1795087       31528    107909371 13173.52     12128108   138.80  1024.00
--     58 16384.20   1795088-1826588       31501    107906790 13173.21     12145142   138.99  1024.00
--     59 16384.13   1826589-1858124       31536    107905698 13173.07     12151248   139.06  1024.00
--     60 16384.23   1858125-1889688       31564    107897120 13172.03     12250751   140.20  1024.00
--     61 16384.37   1889689-1921224       31536    107898228 13172.16     12251859   140.21  1024.00
--     62 16384.12   1921225-1952739       31515    107901241 13172.53     12197419   139.59  1024.00
--     63 16384.04   1952740-1984237       31498    107903532 13172.81     12166514   139.23  1024.00
--     64 16384.06   1984238-2015783       31546    107899919 13172.37     12205936   139.69  1024.00
--     65 16384.19   2015784-2047323       31540    107901653 13172.58     12198885   139.61  1024.00
--     66 16384.07   2047324-2078832       31509    107902740 13172.71     12177212   139.36  1024.00
--     67 16384.39   2078833-2110347       31515    107916522 13174.40     12058430   138.00  1024.00
--     68 16384.49   2110348-2141840       31493    107908547 13173.42     12152355   139.07  1024.00
--     69 16384.07   2141841-2173270       31430    107896476 13171.95     12244167   140.12  1024.00
--     70 16384.28   2173271-2204747       31477    107888876 13171.02     12343889   141.26  1024.00
--     71 16384.46   2204748-2236188       31441    107899297 13172.29     12248539   140.17  1024.00
--     72 16384.30   2236189-2267676       31488    107900947 13172.49     12216805   139.81  1024.00
--     73 16384.45   2267677-2299089       31413    107895493 13171.83     12287794   140.62  1024.00
--     74 16384.27   2299090-2330466       31377    107943846 13177.73     11756542   134.54  1024.00
--     75  4299.45   2330467-2333307        2841      9989543  1219.52       693100     7.93  1024.00
--   ---- -------- ------------------- --------- ------------ -------- ------------ -------- --------
--                                                 7994890136             901687692
-- Finished stage 'readErrorDetectionConfigure', reset canuIteration.
--
-- Running jobs.  First attempt out of 2.
----------------------------------------
-- Starting 'red' concurrent execution on Thu Jan 11 13:17:36 2024 with 332115.718 GB free disk space (75 processes; 2 concurrently)

    cd unitigging/3-overlapErrorAdjustment
    ./red.sh 1 > ./red.000001.out 2>&1
    ./red.sh 2 > ./red.000002.out 2>&1
    ./red.sh 3 > ./red.000003.out 2>&1
    ./red.sh 4 > ./red.000004.out 2>&1
    ./red.sh 5 > ./red.000005.out 2>&1
    ./red.sh 6 > ./red.000006.out 2>&1
    ./red.sh 7 > ./red.000007.out 2>&1
    ./red.sh 8 > ./red.000008.out 2>&1
    ./red.sh 9 > ./red.000009.out 2>&1
    ./red.sh 10 > ./red.000010.out 2>&1
    ./red.sh 11 > ./red.000011.out 2>&1
    ./red.sh 12 > ./red.000012.out 2>&1
    ./red.sh 13 > ./red.000013.out 2>&1
    ./red.sh 14 > ./red.000014.out 2>&1
    ./red.sh 15 > ./red.000015.out 2>&1
    ./red.sh 16 > ./red.000016.out 2>&1
    ./red.sh 17 > ./red.000017.out 2>&1
    ./red.sh 18 > ./red.000018.out 2>&1
    ./red.sh 19 > ./red.000019.out 2>&1
    ./red.sh 20 > ./red.000020.out 2>&1
    ./red.sh 21 > ./red.000021.out 2>&1
    ./red.sh 22 > ./red.000022.out 2>&1
    ./red.sh 23 > ./red.000023.out 2>&1
    ./red.sh 24 > ./red.000024.out 2>&1
    ./red.sh 25 > ./red.000025.out 2>&1
    ./red.sh 26 > ./red.000026.out 2>&1
    ./red.sh 27 > ./red.000027.out 2>&1
    ./red.sh 28 > ./red.000028.out 2>&1
    ./red.sh 29 > ./red.000029.out 2>&1
    ./red.sh 30 > ./red.000030.out 2>&1
    ./red.sh 31 > ./red.000031.out 2>&1
    ./red.sh 32 > ./red.000032.out 2>&1
    ./red.sh 33 > ./red.000033.out 2>&1
    ./red.sh 34 > ./red.000034.out 2>&1
    ./red.sh 35 > ./red.000035.out 2>&1
    ./red.sh 36 > ./red.000036.out 2>&1
    ./red.sh 37 > ./red.000037.out 2>&1
    ./red.sh 38 > ./red.000038.out 2>&1
    ./red.sh 39 > ./red.000039.out 2>&1
    ./red.sh 40 > ./red.000040.out 2>&1
    ./red.sh 41 > ./red.000041.out 2>&1
    ./red.sh 42 > ./red.000042.out 2>&1
    ./red.sh 43 > ./red.000043.out 2>&1
    ./red.sh 44 > ./red.000044.out 2>&1
    ./red.sh 45 > ./red.000045.out 2>&1
    ./red.sh 46 > ./red.000046.out 2>&1
    ./red.sh 47 > ./red.000047.out 2>&1
    ./red.sh 48 > ./red.000048.out 2>&1
    ./red.sh 49 > ./red.000049.out 2>&1
    ./red.sh 50 > ./red.000050.out 2>&1
    ./red.sh 51 > ./red.000051.out 2>&1
    ./red.sh 52 > ./red.000052.out 2>&1
    ./red.sh 53 > ./red.000053.out 2>&1
    ./red.sh 54 > ./red.000054.out 2>&1
    ./red.sh 55 > ./red.000055.out 2>&1
    ./red.sh 56 > ./red.000056.out 2>&1
    ./red.sh 57 > ./red.000057.out 2>&1
    ./red.sh 58 > ./red.000058.out 2>&1
    ./red.sh 59 > ./red.000059.out 2>&1
    ./red.sh 60 > ./red.000060.out 2>&1
    ./red.sh 61 > ./red.000061.out 2>&1
    ./red.sh 62 > ./red.000062.out 2>&1
    ./red.sh 63 > ./red.000063.out 2>&1
    ./red.sh 64 > ./red.000064.out 2>&1
    ./red.sh 65 > ./red.000065.out 2>&1
    ./red.sh 66 > ./red.000066.out 2>&1
    ./red.sh 67 > ./red.000067.out 2>&1
    ./red.sh 68 > ./red.000068.out 2>&1
    ./red.sh 69 > ./red.000069.out 2>&1
    ./red.sh 70 > ./red.000070.out 2>&1
    ./red.sh 71 > ./red.000071.out 2>&1
    ./red.sh 72 > ./red.000072.out 2>&1
    ./red.sh 73 > ./red.000073.out 2>&1
    ./red.sh 74 > ./red.000074.out 2>&1
    ./red.sh 75 > ./red.000075.out 2>&1

-- Finished on Fri Jan 12 04:08:49 2024 (53473 seconds) with 329892.253 GB free disk space
----------------------------------------
-- Found 75 read error detection output files.
-- Finished stage 'readErrorDetectionCheck', reset canuIteration.
--
-- Loading read lengths.
-- Loading number of overlaps per read.
--
-- Configure OEA for 8gb memory.
--                   Batches of at most (unlimited) reads.
--                                      300000000 bases.
--
--           Total                                               Reads                 Olaps  Adjusts
--    Job   Memory      Read Range         Reads        Bases   Memory        Olaps   Memory   Memory  (Memory in MB)
--   ---- -------- ------------------- --------- ------------ -------- ------------ -------- --------
--      1  3745.68         1-86453         86453    300000640   296.74     32653102   996.49   404.44
--      2  3773.88     86454-172729        86276    300002104   296.74     33577213  1024.70   404.44
--      3  3773.31    172730-259831        87102    300002890   296.76     33557791  1024.10   404.44
--      4  3780.04    259832-347201        87370    300002560   296.77     33777976  1030.82   404.44
--      5  3792.96    347202-435203        88002    300001642   296.79     34200957  1043.73   404.44
--      6  3793.55    435204-523253        88050    300001539   296.79     34220246  1044.32   404.44
--      7  3786.78    523254-611518        88265    300000864   296.80     33997970  1037.54   404.44
--      8  3790.38    611519-699752        88234    300001553   296.80     34115934  1041.14   404.44
--      9  3792.79    699753-788194        88442    300002293   296.80     34194910  1043.55   404.44
--     10  3782.71    788195-876399        88205    300000840   296.79     33864914  1033.48   404.44
--     11  3786.27    876400-964490        88091    300002962   296.79     33981518  1037.03   404.44
--     12  3782.70    964491-1052145       87655    300000425   296.78     33864904  1033.47   404.44
--     13  3773.99   1052146-1138770       86625    300003886   296.75     33580476  1024.79   404.44
--     14  3777.54   1138771-1225336       86566    300003876   296.75     33696999  1028.35   404.44
--     15  3789.47   1225337-1313454       88118    300001529   296.79     34086319  1040.23   404.44
--     16  3775.54   1313455-1400660       87206    300001490   296.77     33630728  1026.33   404.44
--     17  3778.09   1400661-1487909       87249    300001174   296.77     33714319  1028.88   404.44
--     18  3783.53   1487910-1575360       87451    300002538   296.77     33892346  1034.31   404.44
--     19  3795.33   1575361-1663163       87803    300000033   296.78     34278840  1046.11   404.44
--     20  3788.09   1663164-1751091       87928    300003930   296.79     34041261  1038.86   404.44
--     21  3780.76   1751092-1838741       87650    300001014   296.78     33801472  1031.54   404.44
--     22  3788.01   1838742-1926446       87705    300000229   296.78     34038963  1038.79   404.44
--     23  3783.99   1926447-2014097       87651    300002815   296.78     33907396  1034.77   404.44
--     24  3780.43   2014098-2101728       87631    300002540   296.78     33790474  1031.20   404.44
--     25  3785.75   2101729-2189247       87519    300002102   296.78     33965127  1036.53   404.44
--     26  3790.29   2189248-2276686       87439    300003411   296.77     34113809  1041.07   404.44
--     27  3293.18   2276687-2333307       56621    194839257   195.54     21141728   645.19   404.44
--   ---- -------- ------------------- --------- ------------ -------- ------------ -------- --------
--                                                 7994890136             901687692
-- Finished stage 'overlapErrorAdjustmentConfigure', reset canuIteration.
--
-- Running jobs.  First attempt out of 2.
----------------------------------------
-- Starting 'oea' concurrent execution on Fri Jan 12 04:09:10 2024 with 329892.125 GB free disk space (27 processes; 8 concurrently)

    cd unitigging/3-overlapErrorAdjustment
    ./oea.sh 1 > ./oea.000001.out 2>&1
    ./oea.sh 2 > ./oea.000002.out 2>&1
    ./oea.sh 3 > ./oea.000003.out 2>&1
    ./oea.sh 4 > ./oea.000004.out 2>&1
    ./oea.sh 5 > ./oea.000005.out 2>&1
    ./oea.sh 6 > ./oea.000006.out 2>&1
    ./oea.sh 7 > ./oea.000007.out 2>&1
    ./oea.sh 8 > ./oea.000008.out 2>&1
    ./oea.sh 9 > ./oea.000009.out 2>&1
    ./oea.sh 10 > ./oea.000010.out 2>&1
    ./oea.sh 11 > ./oea.000011.out 2>&1
    ./oea.sh 12 > ./oea.000012.out 2>&1
    ./oea.sh 13 > ./oea.000013.out 2>&1
    ./oea.sh 14 > ./oea.000014.out 2>&1
    ./oea.sh 15 > ./oea.000015.out 2>&1
    ./oea.sh 16 > ./oea.000016.out 2>&1
    ./oea.sh 17 > ./oea.000017.out 2>&1
    ./oea.sh 18 > ./oea.000018.out 2>&1
    ./oea.sh 19 > ./oea.000019.out 2>&1
    ./oea.sh 20 > ./oea.000020.out 2>&1
    ./oea.sh 21 > ./oea.000021.out 2>&1
    ./oea.sh 22 > ./oea.000022.out 2>&1
    ./oea.sh 23 > ./oea.000023.out 2>&1
    ./oea.sh 24 > ./oea.000024.out 2>&1
    ./oea.sh 25 > ./oea.000025.out 2>&1
    ./oea.sh 26 > ./oea.000026.out 2>&1
    ./oea.sh 27 > ./oea.000027.out 2>&1

-- Finished on Fri Jan 12 10:18:27 2024 (22157 seconds, at least I didn't crash) with 329542.483 GB free disk space
----------------------------------------
-- Found 27 overlap error adjustment output files.
-- Finished stage 'overlapErrorAdjustmentCheck', reset canuIteration.
----------------------------------------
-- Starting command on Fri Jan 12 10:18:27 2024 with 329542.483 GB free disk space

    cd unitigging/3-overlapErrorAdjustment
    /mnt/shared/scratch/tadams/smrtrenseq_assembly/.snakemake/conda/d198b113b159549338ae9bea44596cd3_/bin/loadErates \
      -S ../../R5_Diff.seqStore \
      -O ../R5_Diff.ovlStore \
      -L ./oea.files \
    > ./oea.apply.err 2>&1

-- Finished on Fri Jan 12 10:18:39 2024 (12 seconds) with 329541.298 GB free disk space
----------------------------------------
-- No report available.
-- Finished stage 'updateOverlapStore', reset canuIteration.
-- Finished stage 'unitig', reset canuIteration.
--
-- Running jobs.  First attempt out of 2.
----------------------------------------
-- Starting 'bat' concurrent execution on Fri Jan 12 10:18:39 2024 with 329541.298 GB free disk space (1 processes; 1 concurrently)

    cd unitigging/4-unitigger
    ./unitigger.sh 1 > ./unitigger.000001.out 2>&1

-- Finished on Fri Jan 12 10:18:45 2024 (6 seconds) with 329541.298 GB free disk space
----------------------------------------
-- Unitigger finished successfully.
-- Finished stage 'unitigCheck', reset canuIteration.
----------------------------------------
-- Starting command on Fri Jan 12 10:18:45 2024 with 329541.298 GB free disk space

    cd unitigging
    /mnt/shared/scratch/tadams/smrtrenseq_assembly/.snakemake/conda/d198b113b159549338ae9bea44596cd3_/bin/utgcns \
      -S ../R5_Diff.seqStore \
      -T  ./R5_Diff.ctgStore 1 \
      -partition 0.8 1.5 0.1 \
    > ./R5_Diff.ctgStore/partitioning.log 2>&1
sh: line 1: ./R5_Diff.ctgStore/partitioning.log: No such file or directory

-- Finished on Fri Jan 12 10:18:45 2024 (in the blink of an eye) with 329541.298 GB free disk space
----------------------------------------

ERROR:
ERROR:  Failed with exit code 1.  (rc=256)
ERROR:

ABORT:
ABORT: canu 2.2
ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
ABORT: Try restarting.  If that doesn't work, ask for help.
ABORT:
ABORT:   failed to partition the reads.
ABORT:
skoren commented 10 months ago

You've got really high coverage, over 2000x, which isn't ideal for assembly. I suspect this step is running out of memory while loading that many overlaps. You can check unitigging/4-unitigger/unitigger.000001.out to confirm that; please post the contents of that file here. Is there a particular reason you're setting the max coverage to 2000x? We typically recommend 50x for HiFi data.
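
The coverage figure can be checked against numbers already in the log: the RED and OEA job tables total 7994890136 bases, and the command sets genomeSize=3000000. A minimal sketch of that arithmetic follows; the function name is illustrative and not part of canu.

```python
def estimated_coverage(total_bases: int, genome_size: int) -> float:
    """Return read coverage as total bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Values taken from the log: total bases from the RED/OEA job tables,
# genome size from the genomeSize=3000000 option on the command line.
coverage = estimated_coverage(7_994_890_136, 3_000_000)
print(f"{coverage:.2f}x")  # prints "2664.96x"
```

This agrees with the `-C 2664.96` value that canu passes to ovStoreStats in the log.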

TMAdams commented 10 months ago

Hi Sergey,

We're using enriched data that has gone through a PCR step, so we set it high to avoid the random downsampling step; otherwise we find we lose information for the final assembly.

Looking at that log file, it doesn't appear to report any errors. Full contents:
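
To illustrate why the cap matters here, the sketch below assumes that maxInputCoverage limits the input to roughly coverage × genomeSize bases. That is a simplification of how canu subsamples, and the helper name is hypothetical.

```python
def bases_kept(total_bases: int, genome_size: int, max_coverage: int) -> int:
    """Bases that survive a cap of max_coverage x genome_size (simplified model)."""
    cap = genome_size * max_coverage
    return min(total_bases, cap)


total = 7_994_890_136   # total bases reported in the log
genome = 3_000_000      # genomeSize from the command line

# Compare the 50x value recommended for HiFi with the 20000 used in this run.
for max_cov in (50, 20_000):
    kept = bases_kept(total, genome, max_cov)
    print(f"maxInputCoverage={max_cov}: keep {kept} bases ({kept / total:.1%} of input)")
```

Under this model a 50x cap would keep 150 Mbp, about 1.9% of the input, whereas the cap at 20000 (60 Gbp) exceeds the input, so nothing is dropped.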

Found perl:
   /mnt/shared/scratch/tadams/smrtrenseq_assembly/.snakemake/conda/d198b113b159549338ae9bea44596cd3_/bin/perl
   This is perl 5, version 32, subversion 1 (v5.32.1) built for x86_64-linux-thread-multi

Found java:
   /mnt/shared/scratch/tadams/smrtrenseq_assembly/.snakemake/conda/d198b113b159549338ae9bea44596cd3_/bin/java
   openjdk version "11.0.9.1-internal" 2020-11-04

Found canu:
   /mnt/shared/scratch/tadams/smrtrenseq_assembly/.snakemake/conda/d198b113b159549338ae9bea44596cd3_/bin/canu
   canu 2.2

Running job 1 based on command line options.

The job itself is running within Slurm, which reports a peak RSS of 20.9G against the 32G I provide, so there is plenty of headroom. Slurm also isn't reporting an out-of-memory kill, though I can't rule that out as I've seen it go unreported before. I'll try upping the memory and see if that helps.

skoren commented 10 months ago

Ah, I see, that's why the histogram looks off.

There should be more in the log than what you're seeing. Are there any unitigger.err files in the 4-unitigger folder? Can you post them if they exist?
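
A minimal sketch for listing any `.err` files in that folder is shown below. The path is an assumption built from the `-d assembly/R5_Diff` option in the original command, and the helper name is illustrative.

```python
from pathlib import Path


def find_err_logs(unitigger_dir):
    """Return sorted paths of *.err files directly inside unitigger_dir."""
    directory = Path(unitigger_dir)
    if not directory.is_dir():
        # Nothing to report if the stage directory does not exist.
        return []
    return sorted(p for p in directory.glob("*.err") if p.is_file())


if __name__ == "__main__":
    # Assumed location; adjust to the actual assembly directory.
    for path in find_err_logs("assembly/R5_Diff/unitigging/4-unitigger"):
        print(path, path.stat().st_size, "bytes")
```

An empty result means no `.err` file was written in that directory, in which case the parent unitigging folder is worth checking as well.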

TMAdams commented 10 months ago

I actually just cleared out the last run and have a re-run going now, but in the runs that worked the .out log looked the same. Once this re-run with higher memory finishes, I'll check for those logs and come back here.

Thanks for getting back so quick!

TMAdams commented 10 months ago

Hi Sergey,

Just to update you: there was indeed an err log in the unitigging folder. It suggests that the step itself is running out of memory, rather than Slurm killing the job for exceeding its allocation. I'll try upping the maxMemory option when running the job; I hadn't spotted that this was an option!

For reference, full error log from unitigging here:

==> PARAMETERS.

Resources:
  Memory                16 GB
  Compute Threads       4

Lengths:
  Minimum read          0 bases
  Maximum read          4294967295 bases
  Minimum overlap       500 bases

Overlap Error Rates:
  Graph                 0.000 (0.030%)
  Max                   0.000 (0.030%)
  Forced                -.--- (-.---%)   (not used)

Deviations:
  Graph                 12.000
  Bubble                1.000
  Repeat                1.000

Similarity Thresholds:
  Graph                 0.000
  Bubble                0.010
  Repeat                0.010

Edge Confusion:
  Absolute              2500
  Percent               15.0000

Unitig Construction:
  Minimum intersection  500 bases
  Maxiumum placements   2 positions

Debugging Enabled:
  (none)

==> LOADING AND FILTERING OVERLAPS.

ReadInfo()-- Found     2333307 reads.

OverlapCache()-- limited to 16384MB memory (user supplied).

OverlapCache()--      17MB for read data.
OverlapCache()--      71MB for best edges.
OverlapCache()--     231MB for tigs.
OverlapCache()--      62MB for tigs - read layouts.
OverlapCache()--      89MB for tigs - error profiles.
OverlapCache()--    4096MB for tigs - error profile overlaps.
OverlapCache()--       0MB for other processes.
OverlapCache()-- ---------
OverlapCache()--    4612MB for data structures (sum of above).
OverlapCache()-- ---------
OverlapCache()--      44MB for overlap store structure.
OverlapCache()--   11727MB for overlap data.
OverlapCache()-- ---------
OverlapCache()--   16384MB allowed.
OverlapCache()--
OverlapCache()-- Retain at least 3700 overlaps/read, based on 1850.00x coverage.
OverlapCache()-- Initial guess at 329 overlaps/read.
OverlapCache()--
OverlapCache()-- Adjusting for sparse overlaps.
OverlapCache()--
OverlapCache()--               reads loading olaps          olaps               memory
OverlapCache()--   olaps/read       all      some          loaded                 free
OverlapCache()--   ----------   -------   -------     ----------- -------     --------
OverlapCache()--          329   1304496   1028811       499073246  55.35%       4111 MB
OverlapCache()--          590   1865632    467675       681437042  75.57%       1329 MB
OverlapCache()--          776   2042294    291013       750417834  83.22%        276 MB
OverlapCache()--          838   2086895    246412       767074049  85.07%         22 MB
OverlapCache()--          844   2090452    242855       768543602  85.23%          0 MB
OverlapCache()-- Not enough memory to load the minimum number of overlaps; increase -M.
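
As a rough sanity check on the report above, the figures can be combined into a back-of-envelope upper bound on the memory needed to hold every overlap. This is only a sketch: it assumes overlap memory scales linearly with the share of overlaps loaded and that the other entries stay as reported, neither of which is confirmed by the log.

```shell
# Figures copied from the OverlapCache() report above (all in MB).
ALLOWED_MB=16384
STRUCTURES_MB=4612      # "for data structures (sum of above)"
STORE_MB=44             # "for overlap store structure"
OVERLAP_MB=11727        # "for overlap data"
LOADED_FRACTION=0.8523  # share of overlaps loaded when free memory reached 0 MB

# Assumption: scale the overlap data up to 100% of overlaps and add the
# fixed entries unchanged.
ESTIMATE_MB=$(awk -v s="$STRUCTURES_MB" -v t="$STORE_MB" \
                  -v o="$OVERLAP_MB" -v f="$LOADED_FRACTION" \
                  'BEGIN { printf "%d", s + t + o / f }')

echo "Estimated memory to hold every overlap: ${ESTIMATE_MB} MB (limit was ${ALLOWED_MB} MB)"
```

Under these assumptions the estimate comes out at roughly 18 GB, above the 16 GB limit in the report but well below 32 GB.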

skoren commented 10 months ago

Set batMemory to 32 or 64 (GB); it's currently using 16.
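For illustration, this is the command from the top of the issue with only `batMemory` added. The variable names are placeholders, and the value is one of the two suggested above:

```shell
# Memory for the unitigging step, in GB (the failing run used 16).
BAT_MEMORY_GB=32

# Same arguments as the original command, plus batMemory.
CANU_ARGS="-d assembly/R5_Diff -p R5_Diff -pacbio-hifi trimmed_reads/R5_Diff.fq useGrid=false genomeSize=3000000 maxInputCoverage=20000 batMemory=${BAT_MEMORY_GB}"

# Print the command for review before launching it.
echo "canu ${CANU_ARGS}"

# Uncomment to run (word splitting of CANU_ARGS is intentional here):
# canu $CANU_ARGS
```

With `useGrid=false`, the whole run lives inside a single Slurm job, so the job's memory request presumably needs to be larger than the value given to `batMemory`.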

TMAdams commented 10 months ago

Will do. Would updating the maxMemory flag have the same effect, in case this errors further down the pipeline?

Just submitted an updated run, will update you when it finishes.

Thanks again for the help!

skoren commented 10 months ago

I don't think maxMemory would be enough, as that only sets the allowed maximum; it won't force this step to use more memory the way batMemory will.

TMAdams commented 10 months ago

Hi, just to follow up on this, increasing batMemory has meant the assemblies finish correctly.

Thanks for the help!