marbl / canu

A single molecule sequence assembler for genomes large and small.
http://canu.readthedocs.io/

failed to find the number of jobs in 'unitigging/0-mercounts/meryl-count.sh' #2272

Closed by katievigil 10 months ago

katievigil commented 11 months ago

Hi, I am having an issue running this barcode. It does not have a lot of reads, so it could be that no contigs will be assembled, but I wanted to double-check with you. I am doing metagenomic viral sequencing using Nanopore. The full log is below. Thanks!
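
For a rough sense of how few reads there are, here is a minimal sketch of the coverage arithmetic. It assumes coverage is simply total bases divided by the `genomeSize` setting (2000000 in this run), using the base counts the log reports for the raw and corrected reads. The function name `coverage` is only illustrative and is not part of canu.

```python
def coverage(total_bases: int, genome_size: int) -> float:
    """Mean depth of coverage: sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Base counts and genomeSize copied from the log in this report.
GENOME_SIZE = 2_000_000
raw = coverage(1_338_011, GENOME_SIZE)       # raw reads in the seqStore
corrected = coverage(112_664, GENOME_SIZE)   # reads selected for correction

print(f"raw reads:       {raw:.3f}x")
print(f"corrected reads: {corrected:.3f}x")
```

Both values are well under 1x, which matches the "0.66 times coverage" and "0.05 times coverage" lines in the log.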

Detected Slurm with 'sinfo' binary in /cm/shared/apps/slurm/14.03.0/bin/sinfo.
--          Slurm disabled by useGrid=false
--
-- Local machine mode enabled; grid support not detected or not allowed.
--
--                                (tag)Concurrency
--                         (tag)Threads          |
--                (tag)Memory         |          |
--        (tag)             |         |          |       total usage      algorithm
--        -------  ----------  --------   --------  --------------------  -----------------------------
-- Local: meryl     12.000 GB    4 CPUs x   5 jobs    60.000 GB  20 CPUs  (k-mer counting)
-- Local: hap        8.000 GB    4 CPUs x   5 jobs    40.000 GB  20 CPUs  (read-to-haplotype assignment)
-- Local: cormhap    6.000 GB   10 CPUs x   2 jobs    12.000 GB  20 CPUs  (overlap detection with mhap)
-- Local: obtovl     4.000 GB    5 CPUs x   4 jobs    16.000 GB  20 CPUs  (overlap detection)
-- Local: utgovl     4.000 GB    5 CPUs x   4 jobs    16.000 GB  20 CPUs  (overlap detection)
-- Local: cor        -.--- GB    4 CPUs x   - jobs     -.--- GB   - CPUs  (read correction)
-- Local: ovb        4.000 GB    1 CPU  x  20 jobs    80.000 GB  20 CPUs  (overlap store bucketizer)
-- Local: ovs        8.000 GB    1 CPU  x  20 jobs   160.000 GB  20 CPUs  (overlap store sorting)
-- Local: red       32.000 GB    4 CPUs x   5 jobs   160.000 GB  20 CPUs  (read error detection)
-- Local: oea       32.000 GB    1 CPU  x  20 jobs   640.000 GB  20 CPUs  (overlap error adjustment)
-- Local: bat       32.000 GB    4 CPUs x   1 job     32.000 GB   4 CPUs  (contig construction with bogart)
-- Local: cns        -.--- GB    4 CPUs x   - jobs     -.--- GB   - CPUs  (consensus)
--
-- Found untrimmed raw Nanopore reads in the input files.
--
-- Generating assembly 'barcode03' in '/lustre/project/taw/kvigil/ONR/baratariabay/ONR_baratariabay100623/20231006_1648_MN18851_FAW76720_acec0fdf/fastq_pass/concatenate/canu/barcode03':
--   genomeSize:
--     2000000
--
--   Overlap Generation Limits:
--     corOvlErrorRate 0.3200 ( 32.00%)
--     obtOvlErrorRate 0.2000 ( 20.00%)
--     utgOvlErrorRate 0.2000 ( 20.00%)
--
--   Overlap Processing Limits:
--     corErrorRate    0.3000 ( 30.00%)
--     obtErrorRate    0.2000 ( 20.00%)
--     utgErrorRate    0.2000 ( 20.00%)
--     cnsErrorRate    0.2000 ( 20.00%)
--
--   Stages to run:
--     correct raw reads.
--     trim corrected reads.
--     assemble corrected and trimmed reads.
--
--
-- BEGIN CORRECTION
----------------------------------------
-- Starting command on Fri Oct 20 09:28:17 2023 with 148203.835 GB free disk space

    cd .
    ./barcode03.seqStore.sh \
    > ./barcode03.seqStore.err 2>&1

-- Finished on Fri Oct 20 09:28:17 2023 (lickety-split) with 148203.773 GB free disk space
----------------------------------------
--
-- In sequence store './barcode03.seqStore':
--   Found 1043 reads.
--   Found 1338011 bases (0.66 times coverage).
--    Histogram of raw reads:
--
--    G=1338011                          sum of  ||               length     num
--    NG         length     index       lengths  ||                range    seqs
--    ----- ------------ --------- ------------  ||  ------------------- -------
--    00010         1835        54       135405  ||       1000-1084          297|---------------------------------------------------------------
--    00020         1540       133       267885  ||       1085-1169          221|-----------------------------------------------
--    00030         1398       225       401848  ||       1170-1254          133|-----------------------------
--    00040         1294       325       536324  ||       1255-1339          110|------------------------
--    00050         1227       431       670004  ||       1340-1424           84|------------------
--    00060         1158       543       803461  ||       1425-1509           52|------------
--    00070         1109       661       936862  ||       1510-1594           28|------
--    00080         1072       784      1071052  ||       1595-1679           27|------
--    00090         1035       911      1204892  ||       1680-1764           20|-----
--    00100         1000      1042      1338011  ||       1765-1849           21|-----
--    001.000x                1043      1338011  ||       1850-1934           10|---
--                                               ||       1935-2019            6|--
--                                               ||       2020-2104            1|-
--                                               ||       2105-2189            1|-
--                                               ||       2190-2274            6|--
--                                               ||       2275-2359            1|-
--                                               ||       2360-2444            2|-
--                                               ||       2445-2529            6|--
--                                               ||       2530-2614            2|-
--                                               ||       2615-2699            3|-
--                                               ||       2700-2784            3|-
--                                               ||       2785-2869            1|-
--                                               ||       2870-2954            1|-
--                                               ||       2955-3039            1|-
--                                               ||       3040-3124            0|
--                                               ||       3125-3209            0|
--                                               ||       3210-3294            0|
--                                               ||       3295-3379            1|-
--                                               ||       3380-3464            1|-
--                                               ||       3465-3549            0|
--                                               ||       3550-3634            0|
--                                               ||       3635-3719            0|
--                                               ||       3720-3804            0|
--                                               ||       3805-3889            0|
--                                               ||       3890-3974            0|
--                                               ||       3975-4059            0|
--                                               ||       4060-4144            0|
--                                               ||       4145-4229            1|-
--                                               ||       4230-4314            0|
--                                               ||       4315-4399            0|
--                                               ||       4400-4484            0|
--                                               ||       4485-4569            0|
--                                               ||       4570-4654            0|
--                                               ||       4655-4739            0|
--                                               ||       4740-4824            1|-
--                                               ||       4825-4909            0|
--                                               ||       4910-4994            0|
--                                               ||       4995-5079            1|-
--                                               ||       5080-5164            0|
--                                               ||       5165-5249            1|-
--
----------------------------------------
-- Starting command on Fri Oct 20 09:28:18 2023 with 148203.773 GB free disk space

    cd correction/0-mercounts
    ./meryl-configure.sh \
    > ./meryl-configure.err 2>&1

-- Finished on Fri Oct 20 09:28:18 2023 (in the blink of an eye) with 148203.773 GB free disk space
----------------------------------------
--  segments   memory batches
--  -------- -------- -------
--        01  0.01 GB       2
--
--  For 1043 reads with 1338011 bases, limit to 1 batch.
--  Will count kmers using 01 jobs, each using 2 GB and 4 threads.
--
-- Finished stage 'merylConfigure', reset canuIteration.
--
-- Running jobs.  First attempt out of 2.
----------------------------------------
-- Starting 'meryl' concurrent execution on Fri Oct 20 09:28:18 2023 with 148203.773 GB free disk space (1 processes; 5 concurrently)

    cd correction/0-mercounts
    ./meryl-count.sh 1 > ./meryl-count.000001.out 2>&1

-- Finished on Fri Oct 20 09:28:19 2023 (one second) with 148203.704 GB free disk space
----------------------------------------
-- Found 1 Kmer counting (meryl) outputs.
-- Finished stage 'cor-merylCountCheck', reset canuIteration.
--
-- Running jobs.  First attempt out of 2.
----------------------------------------
-- Starting 'meryl' concurrent execution on Fri Oct 20 09:28:19 2023 with 148203.704 GB free disk space (1 processes; 5 concurrently)

    cd correction/0-mercounts
    ./meryl-process.sh 1 > ./meryl-process.000001.out 2>&1

-- Finished on Fri Oct 20 09:28:21 2023 (2 seconds) with 148203.577 GB free disk space
----------------------------------------
-- Meryl finished successfully.  Kmer frequency histogram:
--
--  16-mers                                                                                           Fraction
--    Occurrences   NumMers                                                                         Unique Total
--       1-     1         0                                                                        0.0000 0.0000
--       2-     2     40252 ********************************************************************** 0.9472 0.8671
--       3-     4      1825 ***                                                                    0.9815 0.9142
--       5-     7       269                                                                        0.9936 0.9379
--       8-    11        83                                                                        0.9973 0.9492
--      12-    16        24                                                                        0.9986 0.9552
--      17-    22        10                                                                        0.9991 0.9586
--      23-    29         6                                                                        0.9992 0.9602
--      30-    37         5                                                                        0.9994 0.9626
--      38-    46         4                                                                        0.9995 0.9638
--      47-    56         5                                                                        0.9996 0.9657
--      57-    67         1                                                                        0.9997 0.9685
--      68-    79         1                                                                        0.9997 0.9693
--      80-    92         1                                                                        0.9997 0.9703
--      93-   106         0                                                                        0.0000 0.0000
--     107-   121         0                                                                        0.0000 0.0000
--     122-   137         1                                                                        0.9998 0.9716
--     138-   154         2                                                                        0.9998 0.9732
--     155-   172         2                                                                        0.9998 0.9766
--     173-   191         2                                                                        0.9999 0.9803
--     192-   211         2                                                                        0.9999 0.9846
--     212-   232         0                                                                        0.0000 0.0000
--     233-   254         0                                                                        0.0000 0.0000
--     255-   277         0                                                                        0.0000 0.0000
--     278-   301         0                                                                        0.0000 0.0000
--     302-   326         0                                                                        0.0000 0.0000
--     327-   352         0                                                                        0.0000 0.0000
--     353-   379         0                                                                        0.0000 0.0000
--     380-   407         0                                                                        0.0000 0.0000
--     408-   436         0                                                                        0.0000 0.0000
--     437-   466         0                                                                        0.0000 0.0000
--     467-   497         0                                                                        0.0000 0.0000
--     498-   529         0                                                                        0.0000 0.0000
--     530-   562         0                                                                        0.0000 0.0000
--     563-   596         0                                                                        0.0000 0.0000
--     597-   631         2                                                                        1.0000 0.9932
--
--           0 (max occurrences)
--       92848 (total mers, non-unique)
--       42497 (distinct mers, non-unique)
--           0 (unique mers)
-- Finished stage 'meryl-process', reset canuIteration.
--
-- Removing meryl database 'correction/0-mercounts/barcode03.ms16'.
--
-- OVERLAPPER (mhap) (correction)
--
--
-- PARAMETERS: hashes=768, minMatches=2, threshold=0.73
--
-- Given 5.4 GB, can fit 8100 reads per block.
-- For 2 blocks, set stride to 2 blocks.
-- Logging partitioning to 'correction/1-overlapper/partitioning.log'.
-- Configured 1 mhap precompute jobs.
-- Configured 1 mhap overlap jobs.
-- Finished stage 'cor-mhapConfigure', reset canuIteration.
--
-- Running jobs.  First attempt out of 2.
----------------------------------------
-- Starting 'cormhap' concurrent execution on Fri Oct 20 09:28:21 2023 with 148203.577 GB free disk space (1 processes; 2 concurrently)

    cd correction/1-overlapper
    ./precompute.sh 1 > ./precompute.000001.out 2>&1

-- Finished on Fri Oct 20 09:28:28 2023 (7 seconds) with 148203.089 GB free disk space
----------------------------------------
-- All 1 mhap precompute jobs finished successfully.
-- Finished stage 'cor-mhapPrecomputeCheck', reset canuIteration.
--
-- Running jobs.  First attempt out of 2.
----------------------------------------
-- Starting 'cormhap' concurrent execution on Fri Oct 20 09:28:28 2023 with 148203.089 GB free disk space (1 processes; 2 concurrently)

    cd correction/1-overlapper
    ./mhap.sh 1 > ./mhap.000001.out 2>&1

-- Finished on Fri Oct 20 09:28:30 2023 (2 seconds) with 148203.016 GB free disk space
----------------------------------------
-- Found 1 mhap overlap output files.
-- Finished stage 'cor-mhapCheck', reset canuIteration.
----------------------------------------
-- Starting command on Fri Oct 20 09:28:30 2023 with 148203.016 GB free disk space

    cd correction
    /lustre/project/taw/share/conda-envs/ONRviral/bin/ovStoreConfig \
     -S ../barcode03.seqStore \
     -M 4-8 \
     -L ./1-overlapper/ovljob.files \
     -create ./barcode03.ovlStore.config \
     > ./barcode03.ovlStore.config.txt \
    2> ./barcode03.ovlStore.config.err

-- Finished on Fri Oct 20 09:28:30 2023 (furiously fast) with 148203.016 GB free disk space
----------------------------------------
--
-- Creating overlap store correction/barcode03.ovlStore using:
--      1 bucket
--      3 slices
--        using at most 1 GB memory each
-- Finished stage 'cor-overlapStoreConfigure', reset canuIteration.
--
-- Running jobs.  First attempt out of 2.
----------------------------------------
-- Starting 'ovB' concurrent execution on Fri Oct 20 09:28:30 2023 with 148203.016 GB free disk space (1 processes; 20 concurrently)

    cd correction/barcode03.ovlStore.BUILDING
    ./scripts/1-bucketize.sh 1 > ./logs/1-bucketize.000001.out 2>&1

-- Finished on Fri Oct 20 09:28:30 2023 (in the blink of an eye) with 148203.016 GB free disk space
----------------------------------------
-- Overlap store bucketizer finished.
-- Finished stage 'cor-overlapStoreBucketizerCheck', reset canuIteration.
--
-- Running jobs.  First attempt out of 2.
----------------------------------------
-- Starting 'ovS' concurrent execution on Fri Oct 20 09:28:30 2023 with 148203.016 GB free disk space (3 processes; 20 concurrently)

    cd correction/barcode03.ovlStore.BUILDING
    ./scripts/2-sort.sh 1 > ./logs/2-sort.000001.out 2>&1
    ./scripts/2-sort.sh 2 > ./logs/2-sort.000002.out 2>&1
    ./scripts/2-sort.sh 3 > ./logs/2-sort.000003.out 2>&1

-- Finished on Fri Oct 20 09:28:31 2023 (one second) with 148203.016 GB free disk space
----------------------------------------
-- Overlap store sorter finished.
-- Finished stage 'cor-overlapStoreSorterCheck', reset canuIteration.
----------------------------------------
-- Starting command on Fri Oct 20 09:28:31 2023 with 148203.016 GB free disk space

    cd correction
    /lustre/project/taw/share/conda-envs/ONRviral/bin/ovStoreIndexer \
      -O  ./barcode03.ovlStore.BUILDING \
      -S ../barcode03.seqStore \
      -C  ./barcode03.ovlStore.config \
      -delete \
    > ./barcode03.ovlStore.BUILDING.index.err 2>&1

-- Finished on Fri Oct 20 09:28:31 2023 (in the blink of an eye) with 148202.93 GB free disk space
----------------------------------------
-- Overlap store indexer finished.
-- Checking store.
----------------------------------------
-- Starting command on Fri Oct 20 09:28:31 2023 with 148202.93 GB free disk space

    cd correction
    /lustre/project/taw/share/conda-envs/ONRviral/bin/ovStoreDump \
     -S ../barcode03.seqStore \
     -O  ./barcode03.ovlStore \
     -counts \
     > ./barcode03.ovlStore/counts.dat 2> ./barcode03.ovlStore/counts.err

-- Finished on Fri Oct 20 09:28:31 2023 (in the blink of an eye) with 148202.93 GB free disk space
----------------------------------------
--
-- Overlap store 'correction/barcode03.ovlStore' successfully constructed.
-- Found 116 overlaps for 91 reads; 952 reads have no overlaps.
--
--
-- Purged 0.024 GB in 3 overlap output files.
-- Finished stage 'cor-createOverlapStore', reset canuIteration.
-- Computing correction layouts.
--   Local  filter coverage   20000
--   Global filter coverage   10000
----------------------------------------
-- Starting command on Fri Oct 20 09:28:31 2023 with 148202.93 GB free disk space

    cd correction
    /lustre/project/taw/share/conda-envs/ONRviral/bin/generateCorrectionLayouts \
      -S ../barcode03.seqStore \
      -O  ./barcode03.ovlStore \
      -C  ./barcode03.corStore.WORKING \
      -eC 20000 \
      -xC 10000 \
    > ./barcode03.corStore.err 2>&1

-- Finished on Fri Oct 20 09:28:31 2023 (fast as lightning) with 148202.93 GB free disk space
----------------------------------------
-- Finished stage 'cor-buildCorrectionLayoutsConfigure', reset canuIteration.
-- Computing correction layouts.
----------------------------------------
-- Starting command on Fri Oct 20 09:28:31 2023 with 148202.93 GB free disk space

    cd correction/2-correction
    /lustre/project/taw/share/conda-envs/ONRviral/bin/filterCorrectionLayouts \
      -S  ../../barcode03.seqStore \
      -C     ../barcode03.corStore \
      -R      ./barcode03.readsToCorrect.WORKING \
      -cc 0 \
      -cl 1000 \
      -g  2000000 \
      -c  10000 \
    > ./barcode03.readsToCorrect.err 2>&1

-- Finished on Fri Oct 20 09:28:31 2023 (in the blink of an eye) with 148202.93 GB free disk space
----------------------------------------
--                             original      original
--                            raw reads     raw reads
--   category                w/overlaps  w/o/overlaps
--   -------------------- ------------- -------------
--   Number of Reads                 91           952
--   Number of Bases             114008             0
--   Coverage                     0.057         0.000
--   Median                        1155             0
--   Mean                          1252             0
--   N50                           1210             0
--   Minimum                       1005             0
--   Maximum                       2477             0
--
--                                        --------corrected---------  ----------rescued----------
--                             evidence                     expected                     expected
--   category                     reads            raw     corrected            raw     corrected
--   -------------------- -------------  ------------- -------------  ------------- -------------
--   Number of Reads                 91             90            90              0             0
--   Number of Bases             114008         112664         57801              0             0
--   Coverage                     0.057          0.056         0.029          0.000         0.000
--   Median                        1155           1155           605              0             0
--   Mean                          1252           1251           642              0             0
--   N50                           1210           1203           669              0             0
--   Minimum                       1005           1005           247              0             0
--   Maximum                       2477           2477          1256              0             0
--
--                        --------uncorrected--------
--                                           expected
--   category                       raw     corrected
--   -------------------- ------------- -------------
--   Number of Reads                953           953
--   Number of Bases               1344          1333
--   Coverage                     0.001         0.001
--   Median                           0             0
--   Mean                             1             1
--   N50                              0             0
--   Minimum                          0             0
--   Maximum                       1344          1333
--
--   Maximum Memory           546546950
-- Finished stage 'cor-filterCorrectionLayouts', reset canuIteration.
--
-- Correction jobs estimated to need at most 0.509 GB for computation.
-- Correction jobs will request 6 GB each.
--
-- Local: cor        6.000 GB    4 CPUs x   5 jobs    30.000 GB  20 CPUs  (read correction)
--
--
-- Configuring correction jobs:
--   Reads estimated to need at most 0.509 GB for computation.
--   Jobs will request 6 GB each.
----------------------------------------
-- Starting command on Fri Oct 20 09:28:31 2023 with 148202.93 GB free disk space

    cd correction/2-correction
    ./correctReadsPartition.sh \
    > ./correctReadsPartition.err 2>&1

-- Finished on Fri Oct 20 09:28:31 2023 (like a bat out of hell) with 148202.93 GB free disk space
----------------------------------------
-- Finished stage 'cor-generateCorrectedReadsConfigure', reset canuIteration.
--
-- Running jobs.  First attempt out of 2.
----------------------------------------
-- Starting 'cor' concurrent execution on Fri Oct 20 09:28:31 2023 with 148202.93 GB free disk space (1 processes; 5 concurrently)

    cd correction/2-correction
    ./correctReads.sh 1 > ./correctReads.000001.out 2>&1

-- Finished on Fri Oct 20 09:28:32 2023 (one second) with 148202.821 GB free disk space
----------------------------------------
-- Found 1 read correction output files.
-- Finished stage 'cor-generateCorrectedReadsCheck', reset canuIteration.
-- Found 1 read correction output files.
-- Finished stage 'cor-generateCorrectedReadsCheck', reset canuIteration.
--
-- Loading corrected reads into corStore and seqStore.
----------------------------------------
-- Starting command on Fri Oct 20 09:28:32 2023 with 148202.821 GB free disk space

    cd correction
    /lustre/project/taw/share/conda-envs/ONRviral/bin/loadCorrectedReads \
      -S ../barcode03.seqStore \
      -C ./barcode03.corStore \
      -L ./2-correction/corjob.files \
    >  ./barcode03.loadCorrectedReads.log \
    2> ./barcode03.loadCorrectedReads.err

-- Finished on Fri Oct 20 09:28:32 2023 (like a bat out of hell) with 148202.821 GB free disk space
----------------------------------------
--
-- In sequence store './barcode03.seqStore':
--   Found 90 reads.
--   Found 112664 bases (0.05 times coverage).
--    Histogram of corrected reads:
--
--    G=112664                           sum of  ||               length     num
--    NG         length     index       lengths  ||                range    seqs
--    ----- ------------ --------- ------------  ||  ------------------- -------
--    00010         1697         5        11700  ||       1005-1034           14|-----------------------------------------------------------
--    00020         1512        12        23026  ||       1035-1064            4|-----------------
--    00030         1347        20        34594  ||       1065-1094            6|--------------------------
--    00040         1255        29        46274  ||       1095-1124           15|---------------------------------------------------------------
--    00050         1182        38        57241  ||       1125-1154            6|--------------------------
--    00060         1142        47        67664  ||       1155-1184            7|------------------------------
--    00070         1111        57        78881  ||       1185-1214            3|-------------
--    00080         1089        68        90981  ||       1215-1244            4|-----------------
--    00090         1026        78       101496  ||       1245-1274            5|---------------------
--    00100         1005        89       112664  ||       1275-1304            1|-----
--    001.000x                  90       112664  ||       1305-1334            3|-------------
--                                               ||       1335-1364            2|---------
--                                               ||       1365-1394            0|
--                                               ||       1395-1424            1|-----
--                                               ||       1425-1454            2|---------
--                                               ||       1455-1484            2|---------
--                                               ||       1485-1514            3|-------------
--                                               ||       1515-1544            0|
--                                               ||       1545-1574            1|-----
--                                               ||       1575-1604            1|-----
--                                               ||       1605-1634            1|-----
--                                               ||       1635-1664            1|-----
--                                               ||       1665-1694            2|---------
--                                               ||       1695-1724            1|-----
--                                               ||       1725-1754            0|
--                                               ||       1755-1784            0|
--                                               ||       1785-1814            0|
--                                               ||       1815-1844            2|---------
--                                               ||       1845-1874            1|-----
--                                               ||       1875-1904            0|
--                                               ||       1905-1934            0|
--                                               ||       1935-1964            0|
--                                               ||       1965-1994            0|
--                                               ||       1995-2024            1|-----
--                                               ||       2025-2054            0|
--                                               ||       2055-2084            0|
--                                               ||       2085-2114            0|
--                                               ||       2115-2144            0|
--                                               ||       2145-2174            0|
--                                               ||       2175-2204            0|
--                                               ||       2205-2234            0|
--                                               ||       2235-2264            0|
--                                               ||       2265-2294            0|
--                                               ||       2295-2324            0|
--                                               ||       2325-2354            0|
--                                               ||       2355-2384            0|
--                                               ||       2385-2414            0|
--                                               ||       2415-2444            0|
--                                               ||       2445-2474            0|
--                                               ||       2475-2504            1|-----
--
--
-- Purging correctReads output after loading into stores.
-- Purged 1 .cns outputs.
-- Purged 2 .out job log outputs.
--
-- No corrected reads generated, overlaps used for correction saved.
-- Finished stage 'cor-loadCorrectedReads', reset canuIteration.
----------------------------------------
-- Starting command on Fri Oct 20 09:28:32 2023 with 148202.821 GB free disk space

    cd .
    /lustre/project/taw/share/conda-envs/ONRviral/bin/sqStoreDumpFASTQ \
      -corrected \
      -S ./barcode03.seqStore \
      -o ./barcode03.correctedReads.gz \
      -fasta \
      -nolibname \
    > barcode03.correctedReads.fasta.err 2>&1

-- Finished on Fri Oct 20 09:28:32 2023 (fast as lightning) with 148202.821 GB free disk space
----------------------------------------
--
-- Corrected reads saved in 'barcode03.correctedReads.fasta.gz'.
-- Finished stage 'cor-dumpCorrectedReads', reset canuIteration.
--
-- BEGIN TRIMMING
----------------------------------------
-- Starting command on Fri Oct 20 09:28:32 2023 with 148202.821 GB free disk space

    cd trimming/0-mercounts
    ./meryl-configure.sh \
    > ./meryl-configure.err 2>&1

-- Finished on Fri Oct 20 09:28:32 2023 (fast as lightning) with 148202.821 GB free disk space
----------------------------------------
--  segments   memory batches
--  -------- -------- -------
--        01  0.01 GB       2
--
--  For 90 reads with 112664 bases, limit to 1 batch.
--  Will count kmers using 01 jobs, each using 2 GB and 4 threads.
--
-- Finished stage 'merylConfigure', reset canuIteration.
--
-- Running jobs.  First attempt out of 2.
----------------------------------------
-- Starting 'meryl' concurrent execution on Fri Oct 20 09:28:32 2023 with 148202.821 GB free disk space (1 processes; 5 concurrently)

    cd trimming/0-mercounts
    ./meryl-count.sh 1 > ./meryl-count.000001.out 2>&1

-- Finished on Fri Oct 20 09:28:34 2023 (2 seconds) with 148202.71 GB free disk space
----------------------------------------
-- Found 1 Kmer counting (meryl) outputs.
-- Finished stage 'obt-merylCountCheck', reset canuIteration.
--
-- Running jobs.  First attempt out of 2.
----------------------------------------
-- Starting 'meryl' concurrent execution on Fri Oct 20 09:28:34 2023 with 148202.71 GB free disk space (1 processes; 5 concurrently)

    cd trimming/0-mercounts
    ./meryl-process.sh 1 > ./meryl-process.000001.out 2>&1

-- Finished on Fri Oct 20 09:28:35 2023 (one second) with 148202.659 GB free disk space
----------------------------------------
-- Meryl finished successfully.  Kmer frequency histogram:
--
--  22-mers                                                                                           Fraction
--    Occurrences   NumMers                                                                         Unique Total
--       1-     1         0                                                                        0.0000 0.0000
--       2-     2      3927 ********************************************************************** 0.9303 0.8911
--       3-     4       276 ****                                                                   0.9865 0.9717
--       5-     7        18                                                                        0.9993 0.9980
--
--           0 (max occurrences)
--        8814 (total mers, non-unique)
--        4221 (distinct mers, non-unique)
--           0 (unique mers)
-- Finished stage 'meryl-process', reset canuIteration.
--
-- Removing meryl database 'trimming/0-mercounts/barcode03.ms22'.
--
-- OVERLAPPER (normal) (trimming) erate=0.2
--
----------------------------------------
-- Starting command on Fri Oct 20 09:28:35 2023 with 148202.659 GB free disk space

    cd trimming/1-overlapper
    /lustre/project/taw/share/conda-envs/ONRviral/bin/overlapInCorePartition \
     -S  ../../barcode03.seqStore \
     -hl 80000000 \
     -rl 1000000000 \
     -ol 500 \
     -o  ./barcode03.partition \
    > ./barcode03.partition.err 2>&1

-- Finished on Fri Oct 20 09:28:35 2023 (in the blink of an eye) with 148202.659 GB free disk space
----------------------------------------
--
-- Configured 1 overlapInCore jobs.
-- Finished stage 'obt-overlapConfigure', reset canuIteration.
--
-- Running jobs.  First attempt out of 2.
----------------------------------------
-- Starting 'obtovl' concurrent execution on Fri Oct 20 09:28:35 2023 with 148202.659 GB free disk space (1 processes; 4 concurrently)

    cd trimming/1-overlapper
    ./overlap.sh 1 > ./overlap.000001.out 2>&1

-- Finished on Fri Oct 20 09:28:36 2023 (one second) with 148202.581 GB free disk space
----------------------------------------
-- Found 1 overlapInCore output files.
--
-- overlapInCore compute 'trimming/1-overlapper':
--   kmer hits
--     with no overlap               68          68 +- 0
--     with an overlap               10          10 +- 0
--
--   overlaps                        10          10 +- 0
--     contained                      0           0 +- 0
--     dovetail                       0           0 +- 0
--
--   overlaps rejected
--     multiple per pair              0           0 +- 0
--     bad short window               0           0 +- 0
--     bad long window                0           0 +- 0
-- Finished stage 'obt-overlapCheck', reset canuIteration.
----------------------------------------
-- Starting command on Fri Oct 20 09:28:36 2023 with 148202.581 GB free disk space

    cd trimming
    /lustre/project/taw/share/conda-envs/ONRviral/bin/ovStoreConfig \
     -S ../barcode03.seqStore \
     -M 3 \
     -L ./1-overlapper/ovljob.files \
     -create ./barcode03.ovlStore.config \
     > ./barcode03.ovlStore.config.txt \
    2> ./barcode03.ovlStore.config.err

-- Finished on Fri Oct 20 09:28:36 2023 (furiously fast) with 148202.581 GB free disk space
----------------------------------------
--
-- Creating overlap store trimming/barcode03.ovlStore using:
--      1 bucket
--     20 slices
--        using at most 1 GB memory each
-- Finished stage 'obt-overlapStoreConfigure', reset canuIteration.
--
-- Running jobs.  First attempt out of 2.
----------------------------------------
-- Starting 'ovB' concurrent execution on Fri Oct 20 09:28:36 2023 with 148202.581 GB free disk space (1 processes; 20 concurrently)

    cd trimming/barcode03.ovlStore.BUILDING
    ./scripts/1-bucketize.sh 1 > ./logs/1-bucketize.000001.out 2>&1

-- Finished on Fri Oct 20 09:28:36 2023 (like a bat out of hell) with 148202.581 GB free disk space
----------------------------------------
-- Overlap store bucketizer finished.
-- Finished stage 'obt-overlapStoreBucketizerCheck', reset canuIteration.
--
-- Running jobs.  First attempt out of 2.
----------------------------------------
-- Starting 'ovS' concurrent execution on Fri Oct 20 09:28:36 2023 with 148202.581 GB free disk space (20 processes; 20 concurrently)

    cd trimming/barcode03.ovlStore.BUILDING
    ./scripts/2-sort.sh 1 > ./logs/2-sort.000001.out 2>&1
    ./scripts/2-sort.sh 2 > ./logs/2-sort.000002.out 2>&1
    ./scripts/2-sort.sh 3 > ./logs/2-sort.000003.out 2>&1
    ./scripts/2-sort.sh 4 > ./logs/2-sort.000004.out 2>&1
    ./scripts/2-sort.sh 5 > ./logs/2-sort.000005.out 2>&1
    ./scripts/2-sort.sh 6 > ./logs/2-sort.000006.out 2>&1
    ./scripts/2-sort.sh 7 > ./logs/2-sort.000007.out 2>&1
    ./scripts/2-sort.sh 8 > ./logs/2-sort.000008.out 2>&1
    ./scripts/2-sort.sh 9 > ./logs/2-sort.000009.out 2>&1
    ./scripts/2-sort.sh 10 > ./logs/2-sort.000010.out 2>&1
    ./scripts/2-sort.sh 11 > ./logs/2-sort.000011.out 2>&1
    ./scripts/2-sort.sh 12 > ./logs/2-sort.000012.out 2>&1
    ./scripts/2-sort.sh 13 > ./logs/2-sort.000013.out 2>&1
    ./scripts/2-sort.sh 14 > ./logs/2-sort.000014.out 2>&1
    ./scripts/2-sort.sh 15 > ./logs/2-sort.000015.out 2>&1
    ./scripts/2-sort.sh 16 > ./logs/2-sort.000016.out 2>&1
    ./scripts/2-sort.sh 17 > ./logs/2-sort.000017.out 2>&1
    ./scripts/2-sort.sh 18 > ./logs/2-sort.000018.out 2>&1
    ./scripts/2-sort.sh 19 > ./logs/2-sort.000019.out 2>&1
    ./scripts/2-sort.sh 20 > ./logs/2-sort.000020.out 2>&1

-- Finished on Fri Oct 20 09:28:38 2023 (2 seconds) with 148202.487 GB free disk space
----------------------------------------
-- Overlap store sorter finished.
-- Finished stage 'obt-overlapStoreSorterCheck', reset canuIteration.
----------------------------------------
-- Starting command on Fri Oct 20 09:28:38 2023 with 148202.487 GB free disk space

    cd trimming
    /lustre/project/taw/share/conda-envs/ONRviral/bin/ovStoreIndexer \
      -O  ./barcode03.ovlStore.BUILDING \
      -S ../barcode03.seqStore \
      -C  ./barcode03.ovlStore.config \
      -delete \
    > ./barcode03.ovlStore.BUILDING.index.err 2>&1

-- Finished on Fri Oct 20 09:28:38 2023 (fast as lightning) with 148202.487 GB free disk space
----------------------------------------
-- Overlap store indexer finished.
-- Checking store.
----------------------------------------
-- Starting command on Fri Oct 20 09:28:38 2023 with 148202.487 GB free disk space

    cd trimming
    /lustre/project/taw/share/conda-envs/ONRviral/bin/ovStoreDump \
     -S ../barcode03.seqStore \
     -O  ./barcode03.ovlStore \
     -counts \
     > ./barcode03.ovlStore/counts.dat 2> ./barcode03.ovlStore/counts.err

-- Finished on Fri Oct 20 09:28:38 2023 (lickety-split) with 148202.487 GB free disk space
----------------------------------------
--
-- Overlap store 'trimming/barcode03.ovlStore' successfully constructed.
-- Found 20 overlaps for 20 reads; 1023 reads have no overlaps.
--
--
-- Purged 0 GB in 3 overlap output files.
-- Finished stage 'obt-createOverlapStore', reset canuIteration.
----------------------------------------
-- Starting command on Fri Oct 20 09:28:38 2023 with 148202.487 GB free disk space

    cd trimming/3-overlapbasedtrimming
    /lustre/project/taw/share/conda-envs/ONRviral/bin/trimReads \
      -S  ../../barcode03.seqStore \
      -O  ../barcode03.ovlStore \
      -Co ./barcode03.1.trimReads.clear \
      -e  0.2 \
      -minlength 1000 \
      -ol 500 \
      -oc 2 \
      -o  ./barcode03.1.trimReads \
    >     ./barcode03.1.trimReads.err 2>&1

-- Finished on Fri Oct 20 09:28:39 2023 (one second) with 148202.436 GB free disk space
----------------------------------------
--  PARAMETERS:
--  ----------
--     1000    (reads trimmed below this many bases are deleted)
--   0.2000    (use overlaps at or below this fraction error)
--      500    (break region if overlap is less than this long, for 'largest covered' algorithm)
--        2    (break region if overlap coverage is less than this many reads, for 'largest covered' algorithm)
--
--  INPUT READS:
--  -----------
--    1043 reads       112664 bases (reads processed)
--       0 reads            0 bases (reads not processed, previously deleted)
--       0 reads            0 bases (reads not processed, in a library where trimming isn't allowed)
--
--  OUTPUT READS:
--  ------------
--       0 reads            0 bases (trimmed reads output)
--       0 reads            0 bases (reads with no change, kept as is)
--    1023 reads        87528 bases (reads with no overlaps, deleted)
--      20 reads        25136 bases (reads with short trimmed length, deleted)
--
--  TRIMMING DETAILS:
--  ----------------
--       0 reads            0 bases (bases trimmed from the 5' end of a read)
--       0 reads            0 bases (bases trimmed from the 3' end of a read)
-- Finished stage 'obt-trimReads', reset canuIteration.
----------------------------------------
-- Starting command on Fri Oct 20 09:28:39 2023 with 148202.436 GB free disk space

    cd trimming/3-overlapbasedtrimming
    /lustre/project/taw/share/conda-envs/ONRviral/bin/splitReads \
      -S  ../../barcode03.seqStore \
      -O  ../barcode03.ovlStore \
      -Ci ./barcode03.1.trimReads.clear \
      -Co ./barcode03.2.splitReads.clear \
      -e  0.2 \
      -minlength 1000 \
      -o  ./barcode03.2.splitReads \
    >     ./barcode03.2.splitReads.err 2>&1

-- Finished on Fri Oct 20 09:28:39 2023 (in the blink of an eye) with 148202.436 GB free disk space
----------------------------------------
--  PARAMETERS:
--  ----------
--     1000    (reads trimmed below this many bases are deleted)
--   0.2000    (use overlaps at or below this fraction error)
--  INPUT READS:
--  -----------
--       0 reads            0 bases (reads processed)
--    1043 reads       112664 bases (reads not processed, previously deleted)
--       0 reads            0 bases (reads not processed, in a library where trimming isn't allowed)
--
--  PROCESSED:
--  --------
--       0 reads            0 bases (no overlaps)
--       0 reads            0 bases (no coverage after adjusting for trimming done already)
--       0 reads            0 bases (processed for chimera)
--       0 reads            0 bases (processed for spur)
--       0 reads            0 bases (processed for subreads)
--
--  READS WITH SIGNALS:
--  ------------------
--       0 reads            0 signals (number of 5' spur signal)
--       0 reads            0 signals (number of 3' spur signal)
--       0 reads            0 signals (number of chimera signal)
--       0 reads            0 signals (number of subread signal)
--
--  SIGNALS:
--  -------
--       0 reads            0 bases (size of 5' spur signal)
--       0 reads            0 bases (size of 3' spur signal)
--       0 reads            0 bases (size of chimera signal)
--       0 reads            0 bases (size of subread signal)
--
--  TRIMMING:
--  --------
--       0 reads            0 bases (trimmed from the 5' end of the read)
--       0 reads            0 bases (trimmed from the 3' end of the read)
-- Finished stage 'obt-splitReads', reset canuIteration.
----------------------------------------
-- Starting command on Fri Oct 20 09:28:39 2023 with 148202.436 GB free disk space

    cd trimming/3-overlapbasedtrimming
    /lustre/project/taw/share/conda-envs/ONRviral/bin/loadTrimmedReads \
      -S ../../barcode03.seqStore \
      -c ./barcode03.2.splitReads.clear \
    > ./barcode03.loadTrimmedReads.err 2>&1

-- Finished on Fri Oct 20 09:28:39 2023 (lickety-split) with 148202.436 GB free disk space
----------------------------------------
--
-- Purging overlaps used for trimming.
-- Finished stage 'obt-dumpReads', reset canuIteration.
----------------------------------------
-- Starting command on Fri Oct 20 09:28:39 2023 with 148202.436 GB free disk space

    cd .
    /lustre/project/taw/share/conda-envs/ONRviral/bin/sqStoreDumpFASTQ \
      -trimmed \
      -S ./barcode03.seqStore \
      -o ./barcode03.trimmedReads.gz \
      -fasta \
      -trimmed -normal -nolibname \
    > ./barcode03.trimmedReads.fasta.err 2>&1

-- Finished on Fri Oct 20 09:28:39 2023 (fast as lightning) with 148202.436 GB free disk space
----------------------------------------
--
-- Trimmed reads saved in 'barcode03.trimmedReads.fasta.gz'.
-- Finished stage 'cor-dumpTrimmedReads', reset canuIteration.
--
-- BEGIN ASSEMBLY
----------------------------------------
-- Starting command on Fri Oct 20 09:28:39 2023 with 148202.436 GB free disk space

    cd unitigging/0-mercounts
    ./meryl-configure.sh \
    > ./meryl-configure.err 2>&1

-- Finished on Fri Oct 20 09:28:40 2023 (one second) with 148202.358 GB free disk space
----------------------------------------
--  segments   memory batches
--  -------- -------- -------
--
--  For 0 reads with 0 bases, limit to 0 batches.
--  Will count kmers using  jobs, each using  GB and 4 threads.
--
-- Finished stage 'merylConfigure', reset canuIteration.

ABORT:
ABORT: canu 2.2
ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
ABORT: Try restarting.  If that doesn't work, ask for help.
ABORT:
ABORT:   failed to find the number of jobs in 'unitigging/0-mercounts/meryl-count.sh'.
ABORT:
(/lustre/project/taw/share/conda-envs/ONRviral) [kvigil@cypress01-121 concatenate]$ canu -p barcode03 -d /lustre/project/taw/kvigil/ONR/baratariabay/ONR_baratariabay100623/20231006_1648_MN18851_FAW76720_acec0fdf/fastq_pass/concatenate/canu/barcode03 genomeSize=2m minInputCoverage=0 maxInputCoverage=0 corOutCoverage=10000 stopOnLowCoverage=0 corMhapSensitivity=high corMinCoverage=0 redMemory=32 oeaMemory=32 batMemory=32 correctedErrorRate=0.2 useGrid=false -nanopore barcode03.fastq.gz
-- canu 2.2
--
-- CITATIONS
--
-- For 'standard' assemblies of PacBio or Nanopore reads:
--   Koren S, Walenz BP, Berlin K, Miller JR, Phillippy AM.
--   Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation.
--   Genome Res. 2017 May;27(5):722-736.
--   http://doi.org/10.1101/gr.215087.116
--
-- Read and contig alignments during correction and consensus use:
--   Šošic M, Šikic M.
--   Edlib: a C/C ++ library for fast, exact sequence alignment using edit distance.
--   Bioinformatics. 2017 May 1;33(9):1394-1395.
--   http://doi.org/10.1093/bioinformatics/btw753
--
-- Overlaps are generated using:
--   Berlin K, et al.
--   Assembling large genomes with single-molecule sequencing and locality-sensitive hashing.
--   Nat Biotechnol. 2015 Jun;33(6):623-30.
--   http://doi.org/10.1038/nbt.3238
--
--   Myers EW, et al.
--   A Whole-Genome Assembly of Drosophila.
--   Science. 2000 Mar 24;287(5461):2196-204.
--   http://doi.org/10.1126/science.287.5461.2196
--
-- Corrected read consensus sequences are generated using an algorithm derived from FALCON-sense:
--   Chin CS, et al.
--   Phased diploid genome assembly with single-molecule real-time sequencing.
--   Nat Methods. 2016 Dec;13(12):1050-1054.
--   http://doi.org/10.1038/nmeth.4035
--
-- Contig consensus sequences are generated using an algorithm derived from pbdagcon:
--   Chin CS, et al.
--   Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.
--   Nat Methods. 2013 Jun;10(6):563-9
--   http://doi.org/10.1038/nmeth.2474
--
-- CONFIGURE CANU
--
-- Detected Java(TM) Runtime Environment '10.0.2' (from '/lustre/project/taw/share/conda-envs/ONRviral/bin/java') without -d64 support.
-- Detected gnuplot version '5.4 patchlevel 3   ' (from 'gnuplot') and image format 'png'.
--
-- Detected 20 CPUs and 64000 gigabytes of memory on the local machine.
--
-- Detected Slurm with 'sinfo' binary in /cm/shared/apps/slurm/14.03.0/bin/sinfo.
--          Slurm disabled by useGrid=false
--
-- Local machine mode enabled; grid support not detected or not allowed.
--
--                                (tag)Concurrency
--                         (tag)Threads          |
--                (tag)Memory         |          |
--        (tag)             |         |          |       total usage      algorithm
--        -------  ----------  --------   --------  --------------------  -----------------------------
-- Local: meryl     12.000 GB    4 CPUs x   5 jobs    60.000 GB  20 CPUs  (k-mer counting)
-- Local: hap        8.000 GB    4 CPUs x   5 jobs    40.000 GB  20 CPUs  (read-to-haplotype assignment)
-- Local: cormhap    6.000 GB   10 CPUs x   2 jobs    12.000 GB  20 CPUs  (overlap detection with mhap)
-- Local: obtovl     4.000 GB    5 CPUs x   4 jobs    16.000 GB  20 CPUs  (overlap detection)
-- Local: utgovl     4.000 GB    5 CPUs x   4 jobs    16.000 GB  20 CPUs  (overlap detection)
-- Local: cor        -.--- GB    4 CPUs x   - jobs     -.--- GB   - CPUs  (read correction)
-- Local: ovb        4.000 GB    1 CPU  x  20 jobs    80.000 GB  20 CPUs  (overlap store bucketizer)
-- Local: ovs        8.000 GB    1 CPU  x  20 jobs   160.000 GB  20 CPUs  (overlap store sorting)
-- Local: red       32.000 GB    4 CPUs x   5 jobs   160.000 GB  20 CPUs  (read error detection)
-- Local: oea       32.000 GB    1 CPU  x  20 jobs   640.000 GB  20 CPUs  (overlap error adjustment)
-- Local: bat       32.000 GB    4 CPUs x   1 job     32.000 GB   4 CPUs  (contig construction with bogart)
-- Local: cns        -.--- GB    4 CPUs x   - jobs     -.--- GB   - CPUs  (consensus)
--
-- Found Nanopore reads in 'barcode03.seqStore':
--   Libraries:
--     Nanopore:              1
--   Reads:
--     Raw:                   1338011
--     Corrected:             112664
--
--
-- Generating assembly 'barcode03' in '/lustre/project/taw/kvigil/ONR/baratariabay/ONR_baratariabay100623/20231006_1648_MN18851_FAW76720_acec0fdf/fastq_pass/concatenate/canu/barcode03':
--   genomeSize:
--     2000000
--
--   Overlap Generation Limits:
--     corOvlErrorRate 0.3200 ( 32.00%)
--     obtOvlErrorRate 0.2000 ( 20.00%)
--     utgOvlErrorRate 0.2000 ( 20.00%)
--
--   Overlap Processing Limits:
--     corErrorRate    0.3000 ( 30.00%)
--     obtErrorRate    0.2000 ( 20.00%)
--     utgErrorRate    0.2000 ( 20.00%)
--     cnsErrorRate    0.2000 ( 20.00%)
--
--   Stages to run:
--     trim corrected reads.
--     assemble corrected and trimmed reads.
--
--
-- Correction skipped; not enabled.
--
-- BEGIN TRIMMING
--
-- Creating overlap store trimming/barcode03.ovlStore using:
--      1 bucket
--     20 slices
--        using at most 1 GB memory each
-- Finished stage 'obt-overlapStoreConfigure', reset canuIteration.
--
-- Running jobs.  First attempt out of 2.
----------------------------------------
-- Starting 'ovB' concurrent execution on Fri Oct 20 09:30:20 2023 with 148196.578 GB free disk space (1 processes; 20 concurrently)

    cd trimming/barcode03.ovlStore.BUILDING
    ./scripts/1-bucketize.sh 1 > ./logs/1-bucketize.000001.out 2>&1

-- Finished on Fri Oct 20 09:30:20 2023 (fast as lightning) with 148196.578 GB free disk space
----------------------------------------
--
-- Overlap store bucketizer jobs failed, retry.
--   job trimming/barcode03.ovlStore.BUILDING/bucket0001 FAILED.
--
--
-- Running jobs.  Second attempt out of 2.
----------------------------------------
-- Starting 'ovB' concurrent execution on Fri Oct 20 09:30:20 2023 with 148196.578 GB free disk space (1 processes; 20 concurrently)

    cd trimming/barcode03.ovlStore.BUILDING
    ./scripts/1-bucketize.sh 1 > ./logs/1-bucketize.000001.out 2>&1

-- Finished on Fri Oct 20 09:30:21 2023 (one second) with 148196.578 GB free disk space
----------------------------------------
--
-- Overlap store bucketizer jobs failed, tried 2 times, giving up.
--   job trimming/barcode03.ovlStore.BUILDING/bucket0001 FAILED.
--

ABORT:
ABORT: canu 2.2
ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
ABORT: Try restarting.  If that doesn't work, ask for help.
ABORT:
skoren commented 11 months ago

Yes, you ended up with no reads:

--  segments   memory batches
--  -------- -------- -------
--
--  For 0 reads with 0 bases, limit to 0 batches.
--  Will count kmers using  jobs, each using  GB and 4 threads.

all due to trimming:

--    1023 reads        87528 bases (reads with no overlaps, deleted)
--      20 reads        25136 bases (reads with short trimmed length, deleted)

Based on the k-mer spectrum of the corrected reads, these reads don't seem to have shared k-mers. Have you confirmed they do indeed have any overlaps by mapping them to each other? You could also try assembling the corrected reads while skipping the trimming. It's also possible, if your target sequence is small enough, that a single corrected read would be sufficient and you could use that instead of running the assembly.
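
To judge whether a single corrected read might already span the target, one could list the longest read in the corrected-reads file. The sketch below is only an illustration: `longest_read` is a hypothetical helper name, and it assumes a gzipped FASTA such as the `barcode03.correctedReads.fasta.gz` written by the run above.

```shell
# Hypothetical helper: print the name and length (tab-separated) of the
# longest sequence in a gzipped FASTA file. Sequences may span several lines.
longest_read() {
    gzip -dc "$1" | awk '
        /^>/ {
            # A new header closes the previous record; keep it if it is longer.
            if (len > max) { max = len; name = id }
            id = substr($1, 2); len = 0; next
        }
        { len += length($0) }
        END {
            # Close the final record as well.
            if (len > max) { max = len; name = id }
            print name "\t" max
        }'
}

# Example call (path taken from the log above):
#   longest_read barcode03.correctedReads.fasta.gz
```

If the longest read is close to the expected genome length of the virus of interest, it may be usable directly; otherwise an assembly is still needed.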

katievigil commented 11 months ago

Hi, I have never mapped my reads against each other; I have only mapped contigs back to my reads. How do you recommend doing this? With minimap2? And how can I skip the trimming? Thanks for your response!

skoren commented 11 months ago

Do you have a reference you can map to? If yes, map the reads to it and check whether they tile across it with some overlaps. If no, the best option is to run something like minimap2 in overlapping mode and see whether it finds overlaps between the reads. To run without trimming, see the quick start: https://canu.readthedocs.io/en/latest/quick-start.html#correct-trim-and-assemble-manually which shows how to run individual pipeline steps. Just provide the corrected reads as input to assembly.
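
A minimal sketch of the read-vs-read check, assuming minimap2 is installed and using its `ava-ont` all-vs-all preset for nanopore reads. The file names and the helper `count_overlapping_reads` are hypothetical; the helper relies only on the PAF layout, where column 1 is the query name and column 6 is the target name.

```shell
# All-vs-all overlap of the reads against themselves, written as PAF
# (assumed invocation; adjust thread count and paths to your setup):
#   minimap2 -x ava-ont -t 4 barcode03.correctedReads.fasta.gz \
#       barcode03.correctedReads.fasta.gz > overlaps.paf

# Hypothetical helper: count distinct reads that overlap at least one
# *other* read. Self-hits (query name == target name) are ignored.
count_overlapping_reads() {
    awk -F'\t' '
        $1 != $6 { seen[$1] = 1; seen[$6] = 1 }
        END { n = 0; for (r in seen) n++; print n }
    ' "$1"
}

# Example call:
#   count_overlapping_reads overlaps.paf
```

If this reports only a handful of reads, as the trimming report above suggests (20 reads with overlaps out of 1043), there is probably too little overlapping sequence for any assembler to build contigs.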

katievigil commented 11 months ago

Hi, I do not have a reference, because these are metagenomic shotgun viral nanopore sequences. It looks like it failed again.

$ canu -p barcode03 -d /lustre/project/taw/kvigil/ONR/baratariabay/ONR_baratariabay100623/20231006_1648_MN18851_FAW76720_acec0fdf/fastq_pass/concatenate/canu/barcode03/ genomeSize=1m -untrimmed correctedErrorRate=0.12 maxInputCoverage=100 stopOnLowCoverage=0 'batOptions=-eg 0.10 -sb 0.01 -dg 2 -db 1 -dr 3' useGrid=false -nanopore /lustre/project/taw/kvigil/ONR/baratariabay/ONR_baratariabay100623/20231006_1648_MN18851_FAW76720_acec0fdf/fastq_pass/concatenate/canu/barcode03/barcode03.correctedReads.fasta.gz

perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
        LANGUAGE = (unset),
        LC_ALL = (unset),
        LANG = "C.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
-- canu 2.2
--
-- CITATIONS
--
-- For 'standard' assemblies of PacBio or Nanopore reads:
--   Koren S, Walenz BP, Berlin K, Miller JR, Phillippy AM.
--   Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation.
--   Genome Res. 2017 May;27(5):722-736.
--   http://doi.org/10.1101/gr.215087.116
--
-- Read and contig alignments during correction and consensus use:
--   Šošic M, Šikic M.
--   Edlib: a C/C ++ library for fast, exact sequence alignment using edit distance.
--   Bioinformatics. 2017 May 1;33(9):1394-1395.
--   http://doi.org/10.1093/bioinformatics/btw753
--
-- Overlaps are generated using:
--   Berlin K, et al.
--   Assembling large genomes with single-molecule sequencing and locality-sensitive hashing.
--   Nat Biotechnol. 2015 Jun;33(6):623-30.
--   http://doi.org/10.1038/nbt.3238
--
--   Myers EW, et al.
--   A Whole-Genome Assembly of Drosophila.
--   Science. 2000 Mar 24;287(5461):2196-204.
--   http://doi.org/10.1126/science.287.5461.2196
--
-- Corrected read consensus sequences are generated using an algorithm derived from FALCON-sense:
--   Chin CS, et al.
--   Phased diploid genome assembly with single-molecule real-time sequencing.
--   Nat Methods. 2016 Dec;13(12):1050-1054.
--   http://doi.org/10.1038/nmeth.4035
--
-- Contig consensus sequences are generated using an algorithm derived from pbdagcon:
--   Chin CS, et al.
--   Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.
--   Nat Methods. 2013 Jun;10(6):563-9
--   http://doi.org/10.1038/nmeth.2474
--
-- CONFIGURE CANU
--
-- Detected Java(TM) Runtime Environment '10.0.2' (from '/lustre/project/taw/share/conda-envs/ONRviral/bin/java') without -d64 support.
-- Detected gnuplot version '5.4 patchlevel 3   ' (from 'gnuplot') and image format 'png'.
--
-- Detected 20 CPUs and 64000 gigabytes of memory on the local machine.
--
-- Detected Slurm with 'sinfo' binary in /cm/shared/apps/slurm/14.03.0/bin/sinfo.
--          Slurm disabled by useGrid=false
--
-- Local machine mode enabled; grid support not detected or not allowed.
--
--                                (tag)Concurrency
--                         (tag)Threads          |
--                (tag)Memory         |          |
--        (tag)             |         |          |       total usage      algorithm
--        -------  ----------  --------   --------  --------------------  -----------------------------
-- Local: meryl     12.000 GB    4 CPUs x   5 jobs    60.000 GB  20 CPUs  (k-mer counting)
-- Local: hap        8.000 GB    4 CPUs x   5 jobs    40.000 GB  20 CPUs  (read-to-haplotype assignment)
-- Local: cormhap    6.000 GB   10 CPUs x   2 jobs    12.000 GB  20 CPUs  (overlap detection with mhap)
-- Local: obtovl     4.000 GB    5 CPUs x   4 jobs    16.000 GB  20 CPUs  (overlap detection)
-- Local: utgovl     4.000 GB    5 CPUs x   4 jobs    16.000 GB  20 CPUs  (overlap detection)
-- Local: cor        -.--- GB    4 CPUs x   - jobs     -.--- GB   - CPUs  (read correction)
-- Local: ovb        4.000 GB    1 CPU  x  20 jobs    80.000 GB  20 CPUs  (overlap store bucketizer)
-- Local: ovs        8.000 GB    1 CPU  x  20 jobs   160.000 GB  20 CPUs  (overlap store sorting)
-- Local: red       16.000 GB    4 CPUs x   5 jobs    80.000 GB  20 CPUs  (read error detection)
-- Local: oea        8.000 GB    1 CPU  x  20 jobs   160.000 GB  20 CPUs  (overlap error adjustment)
-- Local: bat       16.000 GB    4 CPUs x   1 job     16.000 GB   4 CPUs  (contig construction with bogart)
-- Local: cns        -.--- GB    4 CPUs x   - jobs     -.--- GB   - CPUs  (consensus)
--
-- Found Nanopore reads in 'barcode03.seqStore':
--   Libraries:
--     Nanopore:              1
--   Reads:
--     Raw:                   1338011
--     Corrected:             112664
--
--
-- Generating assembly 'barcode03' in '/lustre/project/taw/kvigil/ONR/baratariabay/ONR_baratariabay100623/20231006_1648_MN18851_FAW76720_acec0fdf/fastq_pass/concatenate/canu/barcode03':
--   genomeSize:
--     1000000
--
--   Overlap Generation Limits:
--     corOvlErrorRate 0.3200 ( 32.00%)
--     obtOvlErrorRate 0.1200 ( 12.00%)
--     utgOvlErrorRate 0.1200 ( 12.00%)
--
--   Overlap Processing Limits:
--     corErrorRate    0.3000 ( 30.00%)
--     obtErrorRate    0.1200 ( 12.00%)
--     utgErrorRate    0.1200 ( 12.00%)
--     cnsErrorRate    0.1200 ( 12.00%)
--
--   Stages to run:
--     trim corrected reads.
--     assemble corrected and trimmed reads.
--
--
-- Correction skipped; not enabled.
--
-- BEGIN TRIMMING
--
-- Creating overlap store trimming/barcode03.ovlStore using:
--      1 bucket
--     20 slices
--        using at most 1 GB memory each
--
-- Running jobs.  First attempt out of 2.
----------------------------------------
-- Starting 'ovB' concurrent execution on Fri Oct 20 15:26:26 2023 with 147670.462 GB free disk space (1 processes; 20 concurrently)

    cd trimming/barcode03.ovlStore.BUILDING
    ./scripts/1-bucketize.sh 1 > ./logs/1-bucketize.000001.out 2>&1

-- Finished on Fri Oct 20 15:26:28 2023 (2 seconds) with 147670.348 GB free disk space
----------------------------------------
--
-- Overlap store bucketizer jobs failed, retry.
--   job trimming/barcode03.ovlStore.BUILDING/bucket0001 FAILED.
--
--
-- Running jobs.  Second attempt out of 2.
----------------------------------------
-- Starting 'ovB' concurrent execution on Fri Oct 20 15:26:28 2023 with 147670.348 GB free disk space (1 processes; 20 concurrently)

    cd trimming/barcode03.ovlStore.BUILDING
    ./scripts/1-bucketize.sh 1 > ./logs/1-bucketize.000001.out 2>&1

-- Finished on Fri Oct 20 15:26:28 2023 (furiously fast) with 147670.348 GB free disk space
----------------------------------------
--
-- Overlap store bucketizer jobs failed, tried 2 times, giving up.
--   job trimming/barcode03.ovlStore.BUILDING/bucket0001 FAILED.
--

ABORT:
ABORT: canu 2.2
ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
ABORT: Try restarting.  If that doesn't work, ask for help.
ABORT:
skoren commented 11 months ago

Your command is wrong: that is the suggestion for nanopore-only assembly, but it would still perform the trimming. You can try it, but give it raw reads. I was referring to the example of running assembly:

canu \
 -p ecoli -d ecoli-erate-0.039 \
  genomeSize=4.8m \
  correctedErrorRate=0.039 \
  -trimmed -corrected -pacbio ecoli/ecoli.trimmedReads.fasta.gz

So in your case you would want to use the -trimmed -corrected -nanopore options instead. You can keep using your genome size, etc. In all cases, though, do not reuse the same -d folder for multiple experiments (as above). Use a new, clean -d folder.
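
Adapted to this thread, the assembly-only run might look like the sketch below. The parameter values are copied from the command posted earlier; the output directory name and the helper functions `require_fresh_dir` and `run_assembly_only` are hypothetical, added only to enforce the advice about using a clean -d folder.

```shell
# Hypothetical guard: fail if the canu output directory already exists,
# so results from an earlier experiment are never reused.
require_fresh_dir() {
    if [ -e "$1" ]; then
        echo "refusing to reuse existing directory: $1" >&2
        return 1
    fi
}

# Assumed paths; change them to match your own layout.
reads=barcode03.correctedReads.fasta.gz
outdir=canu-barcode03-assemble-only

# Assemble the already-corrected reads, skipping correction and trimming.
run_assembly_only() {
    require_fresh_dir "$outdir" || return 1
    canu -p barcode03 -d "$outdir" \
        genomeSize=1m \
        correctedErrorRate=0.12 \
        stopOnLowCoverage=0 \
        useGrid=false \
        -trimmed -corrected -nanopore "$reads"
}

# Example call:
#   run_assembly_only
```

Passing both -trimmed and -corrected tells canu that the input needs neither stage, so it should go straight to the assembly (unitigging) step.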

skoren commented 10 months ago

Idle