marbl / canu

A single molecule sequence assembler for genomes large and small.
http://canu.readthedocs.io/

No contigs with HiFi and HiCanu #2263

Closed emilyjunkins closed 1 year ago

emilyjunkins commented 1 year ago

Hello, I have HiFi metagenomic reads from a low-complexity sample. I first tried running canu, but very few reads were retained. I ran it again following the suggestions from this issue and was able to get all reads loaded, but now I get no contigs.

I am running canu 2.2 on a Linux HPC system. I have also tried without the -untrimmed option and get the same results.

canu -p LS01_001 -d /home/ejunkins/LS01_001_comparemethods/canu_out5 genomeSize=5m gridEngineArrayMaxJobs=100 maxInputCoverage=100000 -untrimmed -pacbio-hifi /home/ejunkins/LS01_001_comparemethods/pbio-2668.26240.bc1021_BAK8B_OA--bc1021_BAK8B_OA.ccs.filter.fastq_copy.gz

This is the report for the above run:

[TRIMMING/READS]
--
-- In sequence store './LS01_001.seqStore':
--   Found 2543879 reads.
--   Found 15775775005 bases (3155.15 times coverage).
--    Histogram of corrected reads:
--    
--    G=15775775005                      sum of  ||               length     num
--    NG         length     index       lengths  ||                range    seqs
--    ----- ------------ --------- ------------  ||  ------------------- -------
--    00010        10498    128781   1577577916  ||       1325-1789           87|-
--    00020         8849    293555   3155160539  ||       1790-2254          161|-
--    00030         7783    484202   4732734867  ||       2255-2719          223|-
--    00040         6984    698557   6310312514  ||       2720-3184          263|-
--    00050         6332    935979   7887890883  ||       3185-3649         6751|--
--    00060         5756   1197385   9465467071  ||       3650-4114       253135|---------------------------------------------
--    00070         5233   1484923  11043043065  ||       4115-4579       362190|---------------------------------------------------------------
--    00080         4746   1801473  12620624149  ||       4580-5044       318856|--------------------------------------------------------
--    00090         4277   2151494  14198198850  ||       5045-5509       275841|------------------------------------------------
--    00100         1325   2543878  15775775005  ||       5510-5974       234006|-----------------------------------------
--    001.000x             2543879  15775775005  ||       5975-6439       199751|-----------------------------------
--                                               ||       6440-6904       168310|------------------------------
--                                               ||       6905-7369       138944|-------------------------
--                                               ||       7370-7834       112460|--------------------
--                                               ||       7835-8299        92302|-----------------
--                                               ||       8300-8764        74928|--------------
--                                               ||       8765-9229        61494|-----------
--                                               ||       9230-9694        49711|---------
--                                               ||       9695-10159       40826|--------
--                                               ||      10160-10624       33107|------
--                                               ||      10625-11089       26429|-----
--                                               ||      11090-11554       20851|----
--                                               ||      11555-12019       16559|---
--                                               ||      12020-12484       13363|---
--                                               ||      12485-12949       10344|--
--                                               ||      12950-13414        8207|--
--                                               ||      13415-13879        6195|--
--                                               ||      13880-14344        4795|-
--                                               ||      14345-14809        3600|-
--                                               ||      14810-15274        2685|-
--                                               ||      15275-15739        2026|-
--                                               ||      15740-16204        1551|-
--                                               ||      16205-16669        1144|-
--                                               ||      16670-17134         781|-
--                                               ||      17135-17599         599|-
--                                               ||      17600-18064         467|-
--                                               ||      18065-18529         314|-
--                                               ||      18530-18994         185|-
--                                               ||      18995-19459         152|-
--                                               ||      19460-19924         105|-
--                                               ||      19925-20389          66|-
--                                               ||      20390-20854          49|-
--                                               ||      20855-21319          28|-
--                                               ||      21320-21784          13|-
--                                               ||      21785-22249          12|-
--                                               ||      22250-22714           8|-
--                                               ||      22715-23179           3|-
--                                               ||      23180-23644           1|-
--                                               ||      23645-24109           0|
--                                               ||      24110-24574           1|-
--

[TRIMMING/MERS]
--
--  22-mers                                                                                           Fraction
--    Occurrences   NumMers                                                                         Unique Total
--       1-     1         0                                                                        0.0000 0.0000
--       2-     2 386913834 ********************************************************************** 0.4276 0.0771
--       3-     7 376482552 ********************************************************************   0.6139 0.1274
--       8-    16  83264287 ***************                                                        0.8648 0.2459
--      17-    29  20412775 ***                                                                    0.9391 0.3245
--      30-    46   9941945 *                                                                      0.9590 0.3645
--      47-    67   6078892 *                                                                      0.9697 0.4017
--      68-    92   3992616                                                                        0.9763 0.4359
--      93-   121   3393714                                                                        0.9805 0.4653
--     122-   154   4526761                                                                        0.9843 0.5038
--     155-   191    495318                                                                        0.9892 0.5628
--     192-   232    199281                                                                        0.9897 0.5705
--     233-   277    682960                                                                        0.9899 0.5747
--     278-   326   1902755                                                                        0.9907 0.5933
--     327-   379   1364404                                                                        0.9928 0.6504
--     380-   436   1543101                                                                        0.9943 0.6980
--     437-   497   2825750                                                                        0.9960 0.7624
--     498-   562    365296                                                                        0.9991 0.8921
--     563-   631     60892                                                                        0.9995 0.9098
--     632-   704     38268                                                                        0.9996 0.9134
--     705-   781     37126                                                                        0.9996 0.9159
--     782-   862     49369                                                                        0.9996 0.9187
--     863-   947     48639                                                                        0.9997 0.9228
--     948-  1036     20167                                                                        0.9998 0.9271
--    1037-  1129     13975                                                                        0.9998 0.9290
--    1130-  1226     12953                                                                        0.9998 0.9306
--    1227-  1327     14918                                                                        0.9998 0.9321
--    1328-  1432     10563                                                                        0.9998 0.9340
--    1433-  1541      8414                                                                        0.9998 0.9354
--    1542-  1654      4439                                                                        0.9998 0.9367
--    1655-  1771      7415                                                                        0.9998 0.9374
--    1772-  1892      6160                                                                        0.9999 0.9386
--    1893-  2017      5893                                                                        0.9999 0.9397
--    2018-  2146      7953                                                                        0.9999 0.9409
--    2147-  2279      5025                                                                        0.9999 0.9425
--    2280-  2416      4032                                                                        0.9999 0.9437
--    2417-  2557      4772                                                                        0.9999 0.9446
--    2558-  2702      3419                                                                        0.9999 0.9458
--    2703-  2851      5382                                                                        0.9999 0.9467
--    2852-  3004      4859                                                                        0.9999 0.9482
--    3005-  3161      3349                                                                        0.9999 0.9496
--
--           0 (max occurrences)
-- 10039868242 (total mers, non-unique)
--   904843411 (distinct mers, non-unique)
--           0 (unique mers)
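
For reference, the "3155.15 times coverage" figure in the report is consistent with the total base count divided by the genomeSize=5m setting from the command line. A minimal sketch of that arithmetic, using integer division only; `coverage` is a hypothetical helper written for this illustration, not a canu command:

```shell
# Hypothetical helper: integer coverage estimate,
# i.e. total bases divided by the genomeSize setting.
coverage() {
  echo $(( $1 / $2 ))
}

# 15775775005 bases (from the report) and genomeSize=5m (5000000)
coverage 15775775005 5000000   # prints 3155
```

This is why maxInputCoverage had to be raised well above the reported coverage for all reads to be kept in this run.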

skoren commented 1 year ago

Is what you posted above the full report? What are the contents of the assembly folder, and of the canu.out file there?

emilyjunkins commented 1 year ago

That is the full report.

canu.out:

Found perl:
   /usr/bin/perl
   This is perl 5, version 16, subversion 3 (v5.16.3) built for x86_64-linux-thread-multi

Found java:
   /usr/bin/java
   openjdk version "1.8.0_171"

Found canu:
   /home/ejunkins/canu-2.2/bin/canu
   canu 2.2

-- canu 2.2
--
-- CITATIONS
--
-- For assemblies of PacBio HiFi reads:
--   Nurk S, Walenz BP, Rhiea A, Vollger MR, Logsdon GA, Grothe R, Miga KH, Eichler EE, Phillippy AM, Koren S.
--   HiCanu: accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads.
--   biorXiv. 2020.
--   https://doi.org/10.1101/2020.03.14.992248
-- 
-- Read and contig alignments during correction and consensus use:
--   Šošic M, Šikic M.
--   Edlib: a C/C ++ library for fast, exact sequence alignment using edit distance.
--   Bioinformatics. 2017 May 1;33(9):1394-1395.
--   http://doi.org/10.1093/bioinformatics/btw753
-- 
-- Overlaps are generated using:
--   Berlin K, et al.
--   Assembling large genomes with single-molecule sequencing and locality-sensitive hashing.
--   Nat Biotechnol. 2015 Jun;33(6):623-30.
--   http://doi.org/10.1038/nbt.3238
-- 
--   Myers EW, et al.
--   A Whole-Genome Assembly of Drosophila.
--   Science. 2000 Mar 24;287(5461):2196-204.
--   http://doi.org/10.1126/science.287.5461.2196
-- 
-- Contig consensus sequences are generated using an algorithm derived from pbdagcon:
--   Chin CS, et al.
--   Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.
--   Nat Methods. 2013 Jun;10(6):563-9
--   http://doi.org/10.1038/nmeth.2474
-- 
-- CONFIGURE CANU
--
-- Detected Java(TM) Runtime Environment '1.8.0_171' (from 'java') with -d64 support.
--
-- WARNING:
-- WARNING:  Failed to run gnuplot using command 'gnuplot'.
-- WARNING:  Plots will be disabled.
-- WARNING:
--
--
-- Detected 1 CPUs and 1410000 gigabytes of memory on the local machine.
--
-- Detected Slurm with 'sinfo' binary in /usr/bin/sinfo.
-- Detected Slurm with task IDs up to 1000 allowed.
-- 
-- Slurm support detected.  Resources available:
--     11 hosts with  24 cores and  187 GB memory.
--      4 hosts with  40 cores and 1376 GB memory.
--     67 hosts with  40 cores and  187 GB memory.
--
--                         (tag)Threads
--                (tag)Memory         |
--        (tag)             |         |  algorithm
--        -------  ----------  --------  -----------------------------
-- Grid:  meryl     12.000 GB    4 CPUs  (k-mer counting)
-- Grid:  hap        8.000 GB    4 CPUs  (read-to-haplotype assignment)
-- Grid:  cormhap    6.000 GB    8 CPUs  (overlap detection with mhap)
-- Grid:  obtovl     4.000 GB    8 CPUs  (overlap detection)
-- Grid:  utgovl     4.000 GB    8 CPUs  (overlap detection)
-- Grid:  cor        -.--- GB    4 CPUs  (read correction)
-- Grid:  ovb        4.000 GB    1 CPU   (overlap store bucketizer)
-- Grid:  ovs        8.000 GB    1 CPU   (overlap store sorting)
-- Grid:  red       16.000 GB    4 CPUs  (read error detection)
-- Grid:  oea        8.000 GB    1 CPU   (overlap error adjustment)
-- Grid:  bat       16.000 GB    4 CPUs  (contig construction with bogart)
-- Grid:  cns        -.--- GB    4 CPUs  (consensus)
--
-- Found PacBio HiFi reads in 'LS01_001.seqStore':
--   Libraries:
--     PacBio HiFi:           1
--   Reads:
--     Corrected:             15775775005
--
--
-- Generating assembly 'LS01_001' in '/home/ejunkins/LS01_001_comparemethods/canu_out5':
--   genomeSize:
--     5000000
--
--   Overlap Generation Limits:
--     corOvlErrorRate 0.0000 (  0.00%)
--     obtOvlErrorRate 0.0250 (  2.50%)
--     utgOvlErrorRate 0.0100 (  1.00%)
--
--   Overlap Processing Limits:
--     corErrorRate    0.0000 (  0.00%)
--     obtErrorRate    0.0250 (  2.50%)
--     utgErrorRate    0.0003 (  0.03%)
--     cnsErrorRate    0.0500 (  5.00%)
--
--   Stages to run:
--     trim corrected reads.
--     assemble corrected and trimmed reads.
--
--
-- Correction skipped; not enabled.
--
-- BEGIN TRIMMING
-- Meryl finished successfully.  Kmer frequency histogram:
--
-- WARNING: gnuplot failed.
--
----------------------------------------
--
--  22-mers                                                                                           Fraction
--    Occurrences   NumMers                                                                         Unique Total
--       1-     1         0                                                                        0.0000 0.0000
--       2-     2 386913834 ********************************************************************** 0.4276 0.0771
--       3-     7 376482552 ********************************************************************   0.6139 0.1274
--       8-    16  83264287 ***************                                                        0.8648 0.2459
--      17-    29  20412775 ***                                                                    0.9391 0.3245
--      30-    46   9941945 *                                                                      0.9590 0.3645
--      47-    67   6078892 *                                                                      0.9697 0.4017
--      68-    92   3992616                                                                        0.9763 0.4359
--      93-   121   3393714                                                                        0.9805 0.4653
--     122-   154   4526761                                                                        0.9843 0.5038
--     155-   191    495318                                                                        0.9892 0.5628
--     192-   232    199281                                                                        0.9897 0.5705
--     233-   277    682960                                                                        0.9899 0.5747
--     278-   326   1902755                                                                        0.9907 0.5933
--     327-   379   1364404                                                                        0.9928 0.6504
--     380-   436   1543101                                                                        0.9943 0.6980
--     437-   497   2825750                                                                        0.9960 0.7624
--     498-   562    365296                                                                        0.9991 0.8921
--     563-   631     60892                                                                        0.9995 0.9098
--     632-   704     38268                                                                        0.9996 0.9134
--     705-   781     37126                                                                        0.9996 0.9159
--     782-   862     49369                                                                        0.9996 0.9187
--     863-   947     48639                                                                        0.9997 0.9228
--     948-  1036     20167                                                                        0.9998 0.9271
--    1037-  1129     13975                                                                        0.9998 0.9290
--    1130-  1226     12953                                                                        0.9998 0.9306
--    1227-  1327     14918                                                                        0.9998 0.9321
--    1328-  1432     10563                                                                        0.9998 0.9340
--    1433-  1541      8414                                                                        0.9998 0.9354
--    1542-  1654      4439                                                                        0.9998 0.9367
--    1655-  1771      7415                                                                        0.9998 0.9374
--    1772-  1892      6160                                                                        0.9999 0.9386
--    1893-  2017      5893                                                                        0.9999 0.9397
--    2018-  2146      7953                                                                        0.9999 0.9409
--    2147-  2279      5025                                                                        0.9999 0.9425
--    2280-  2416      4032                                                                        0.9999 0.9437
--    2417-  2557      4772                                                                        0.9999 0.9446
--    2558-  2702      3419                                                                        0.9999 0.9458
--    2703-  2851      5382                                                                        0.9999 0.9467
--    2852-  3004      4859                                                                        0.9999 0.9482
--    3005-  3161      3349                                                                        0.9999 0.9496
--
--           0 (max occurrences)
-- 10039868242 (total mers, non-unique)
--   904843411 (distinct mers, non-unique)
--           0 (unique mers)
-- Finished stage 'meryl-process', reset canuIteration.
--
-- Removing meryl database 'trimming/0-mercounts/LS01_001.ms22'.
--
-- OVERLAPPER (normal) (trimming) erate=0.025
--
----------------------------------------
-- Starting command on Wed Sep 27 13:53:52 2023 with 69471.029 GB free disk space

    cd trimming/1-overlapper
    /home/ejunkins/canu-2.2/bin/overlapInCorePartition \
     -S  ../../LS01_001.seqStore \
     -hl 80000000 \
     -rl 1000000000 \
     -ol 500 \
     -o  ./LS01_001.partition \
    > ./LS01_001.partition.err 2>&1

-- Finished on Wed Sep 27 13:53:52 2023 (lickety-split) with 69471.029 GB free disk space
----------------------------------------
--
-- Configured 918 overlapInCore jobs.
-- Finished stage 'obt-overlapConfigure', reset canuIteration.
--
-- Running jobs.  First attempt out of 2.
--
-- 'overlap.jobSubmit-01.sh' -> job 3239876 tasks 1-100.
-- 'overlap.jobSubmit-02.sh' -> job 3239877 tasks 101-200.
-- Failed to submit compute jobs.  Delay 10 seconds and try again.

CRASH:
CRASH: canu 2.2
CRASH: Please panic, this is abnormal.
CRASH:
CRASH:   Failed to submit compute jobs.
CRASH:
CRASH: Failed at /home/ejunkins/canu-2.2/bin/../lib/site_perl/canu/Execution.pm line 1259.
CRASH:  canu::Execution::submitOrRunParallelJob('LS01_001', 'obtovl', 'trimming/1-overlapper', 'overlap', 1, 2, 3, 4, 5, ...) called at /home/ejunkins/canu-2.2/bin/../lib/site_perl/canu/OverlapInCore.pm line 394
CRASH:  canu::OverlapInCore::overlapCheck('LS01_001', 'obt', 'partial') called at /home/ejunkins/canu-2.2/bin/canu line 1012
CRASH:  main::overlap('LS01_001', 'obt') called at /home/ejunkins/canu-2.2/bin/canu line 1103
CRASH: 
CRASH: Last 50 lines of the relevant log file (trimming/1-overlapper/overlap.jobSubmit-03.out):
CRASH:
CRASH: sbatch: error: Batch job submission failed: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)
CRASH:

This seems like a cluster issue, but previous use of the gridEngine flag seemed to work.

These are the contents of the assembly folder: canu-logs canu.out canu-scripts LS01_001.report LS01_001.seqStore LS01_001.seqStore.err LS01_001.seqStore.sh trimming

skoren commented 1 year ago

This looks like an issue with your grid system not allowing the job submission. I'm guessing there's a limit on the total number of jobs you're allowed to submit; see issues #1752 and #1883 for more details on these limits. If this limit cannot be increased, you'll have to follow the suggestion in #1883: set useGrid=remote and manually run as many jobs at a time as you're allowed.
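
As a rough illustration of the batching suggested above: the log reports 918 configured overlapInCore jobs, submitted as arrays of 100 tasks, with the submission failing on the third array. The sketch below only prints the task ranges one would work through in order. `batch_ranges` is a hypothetical helper, not part of canu, and the batch size of 100 is an assumption that should be replaced by whatever the cluster's submit limit actually allows.

```shell
# Hypothetical helper (not part of canu): split TOTAL tasks into
# consecutive ranges of at most SIZE tasks and print one range per line.
batch_ranges() {
  local total=$1 size=$2 start=1 end
  while [ "$start" -le "$total" ]; do
    end=$(( start + size - 1 ))
    # Clamp the final range to the last task.
    if [ "$end" -gt "$total" ]; then
      end=$total
    fi
    echo "${start}-${end}"
    start=$(( end + 1 ))
  done
}

# 918 jobs from the log above; batch size 100 is an assumed limit.
batch_ranges 918 100
```

This prints ten ranges, from 1-100 through 901-918. Each range would be submitted only after the previous batch has finished; how a batch is actually submitted depends on the cluster and on the scripts canu wrote to trimming/1-overlapper, as discussed in #1883.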

skoren commented 1 year ago

Idle, incompatible grid setup.