marbl / canu

A single molecule sequence assembler for genomes large and small.
http://canu.readthedocs.io/

No contigs with HiFi and HiCanu #2263

Closed emilyjunkins closed 11 months ago

emilyjunkins commented 11 months ago

Hello, I have HiFi metagenomic reads from a low-complexity sample. I first tried running canu, but very few reads were retained. I ran it again following suggestions from this issue and was able to get all reads read in, but now I have no contigs.

I am running canu 2.2 on a Linux HPC system. I have also tried without the -untrimmed option and get the same results.

canu -p LS01_001 -d /home/ejunkins/LS01_001_comparemethods/canu_out5 genomeSize=5m gridEngineArrayMaxJobs=100 maxInputCoverage=100000 -untrimmed -pacbio-hifi /home/ejunkins/LS01_001_comparemethods/pbio-2668.26240.bc1021_BAK8B_OA--bc1021_BAK8B_OA.ccs.filter.fastq_copy.gz

This is the report for the above run:

[TRIMMING/READS]
--
-- In sequence store './LS01_001.seqStore':
--   Found 2543879 reads.
--   Found 15775775005 bases (3155.15 times coverage).
--    Histogram of corrected reads:
--    
--    G=15775775005                      sum of  ||               length     num
--    NG         length     index       lengths  ||                range    seqs
--    ----- ------------ --------- ------------  ||  ------------------- -------
--    00010        10498    128781   1577577916  ||       1325-1789           87|-
--    00020         8849    293555   3155160539  ||       1790-2254          161|-
--    00030         7783    484202   4732734867  ||       2255-2719          223|-
--    00040         6984    698557   6310312514  ||       2720-3184          263|-
--    00050         6332    935979   7887890883  ||       3185-3649         6751|--
--    00060         5756   1197385   9465467071  ||       3650-4114       253135|---------------------------------------------
--    00070         5233   1484923  11043043065  ||       4115-4579       362190|---------------------------------------------------------------
--    00080         4746   1801473  12620624149  ||       4580-5044       318856|--------------------------------------------------------
--    00090         4277   2151494  14198198850  ||       5045-5509       275841|------------------------------------------------
--    00100         1325   2543878  15775775005  ||       5510-5974       234006|-----------------------------------------
--    001.000x             2543879  15775775005  ||       5975-6439       199751|-----------------------------------
--                                               ||       6440-6904       168310|------------------------------
--                                               ||       6905-7369       138944|-------------------------
--                                               ||       7370-7834       112460|--------------------
--                                               ||       7835-8299        92302|-----------------
--                                               ||       8300-8764        74928|--------------
--                                               ||       8765-9229        61494|-----------
--                                               ||       9230-9694        49711|---------
--                                               ||       9695-10159       40826|--------
--                                               ||      10160-10624       33107|------
--                                               ||      10625-11089       26429|-----
--                                               ||      11090-11554       20851|----
--                                               ||      11555-12019       16559|---
--                                               ||      12020-12484       13363|---
--                                               ||      12485-12949       10344|--
--                                               ||      12950-13414        8207|--
--                                               ||      13415-13879        6195|--
--                                               ||      13880-14344        4795|-
--                                               ||      14345-14809        3600|-
--                                               ||      14810-15274        2685|-
--                                               ||      15275-15739        2026|-
--                                               ||      15740-16204        1551|-
--                                               ||      16205-16669        1144|-
--                                               ||      16670-17134         781|-
--                                               ||      17135-17599         599|-
--                                               ||      17600-18064         467|-
--                                               ||      18065-18529         314|-
--                                               ||      18530-18994         185|-
--                                               ||      18995-19459         152|-
--                                               ||      19460-19924         105|-
--                                               ||      19925-20389          66|-
--                                               ||      20390-20854          49|-
--                                               ||      20855-21319          28|-
--                                               ||      21320-21784          13|-
--                                               ||      21785-22249          12|-
--                                               ||      22250-22714           8|-
--                                               ||      22715-23179           3|-
--                                               ||      23180-23644           1|-
--                                               ||      23645-24109           0|
--                                               ||      24110-24574           1|-
--

[TRIMMING/MERS]
--
--  22-mers                                                                                           Fraction
--    Occurrences   NumMers                                                                         Unique Total
--       1-     1         0                                                                        0.0000 0.0000
--       2-     2 386913834 ********************************************************************** 0.4276 0.0771
--       3-     7 376482552 ********************************************************************   0.6139 0.1274
--       8-    16  83264287 ***************                                                        0.8648 0.2459
--      17-    29  20412775 ***                                                                    0.9391 0.3245
--      30-    46   9941945 *                                                                      0.9590 0.3645
--      47-    67   6078892 *                                                                      0.9697 0.4017
--      68-    92   3992616                                                                        0.9763 0.4359
--      93-   121   3393714                                                                        0.9805 0.4653
--     122-   154   4526761                                                                        0.9843 0.5038
--     155-   191    495318                                                                        0.9892 0.5628
--     192-   232    199281                                                                        0.9897 0.5705
--     233-   277    682960                                                                        0.9899 0.5747
--     278-   326   1902755                                                                        0.9907 0.5933
--     327-   379   1364404                                                                        0.9928 0.6504
--     380-   436   1543101                                                                        0.9943 0.6980
--     437-   497   2825750                                                                        0.9960 0.7624
--     498-   562    365296                                                                        0.9991 0.8921
--     563-   631     60892                                                                        0.9995 0.9098
--     632-   704     38268                                                                        0.9996 0.9134
--     705-   781     37126                                                                        0.9996 0.9159
--     782-   862     49369                                                                        0.9996 0.9187
--     863-   947     48639                                                                        0.9997 0.9228
--     948-  1036     20167                                                                        0.9998 0.9271
--    1037-  1129     13975                                                                        0.9998 0.9290
--    1130-  1226     12953                                                                        0.9998 0.9306
--    1227-  1327     14918                                                                        0.9998 0.9321
--    1328-  1432     10563                                                                        0.9998 0.9340
--    1433-  1541      8414                                                                        0.9998 0.9354
--    1542-  1654      4439                                                                        0.9998 0.9367
--    1655-  1771      7415                                                                        0.9998 0.9374
--    1772-  1892      6160                                                                        0.9999 0.9386
--    1893-  2017      5893                                                                        0.9999 0.9397
--    2018-  2146      7953                                                                        0.9999 0.9409
--    2147-  2279      5025                                                                        0.9999 0.9425
--    2280-  2416      4032                                                                        0.9999 0.9437
--    2417-  2557      4772                                                                        0.9999 0.9446
--    2558-  2702      3419                                                                        0.9999 0.9458
--    2703-  2851      5382                                                                        0.9999 0.9467
--    2852-  3004      4859                                                                        0.9999 0.9482
--    3005-  3161      3349                                                                        0.9999 0.9496
--
--           0 (max occurrences)
-- 10039868242 (total mers, non-unique)
--   904843411 (distinct mers, non-unique)
--           0 (unique mers)
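For reference, the coverage figure in the report can be reproduced from the numbers in this post. The sketch below assumes coverage is simply total bases divided by the genomeSize given on the command line; that assumption matches the reported 3155.15 to within rounding. The variable names are mine, not canu's.

```python
# Values copied from the report and the command line in this thread.
total_bases = 15_775_775_005  # "Found 15775775005 bases"
num_reads = 2_543_879         # "Found 2543879 reads"
genome_size = 5_000_000       # genomeSize=5m

# Assumed definition: coverage = total bases / genomeSize.
coverage = total_bases / genome_size

# Mean read length, for comparison with the length histogram.
mean_read_length = total_bases / num_reads

print(f"coverage: about {coverage:.0f}x")
print(f"mean read length: about {mean_read_length:.0f} bp")
```

Because the sample is metagenomic, genomeSize=5m is only a nominal value, so this coverage is nominal as well.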

skoren commented 11 months ago

Is what you posted above the full report? What are the contents of the assembly folder, and of the canu.out file there?

emilyjunkins commented 11 months ago

That is the full report.

canu.out:

Found perl:
   /usr/bin/perl
   This is perl 5, version 16, subversion 3 (v5.16.3) built for x86_64-linux-thread-multi

Found java:
   /usr/bin/java
   openjdk version "1.8.0_171"

Found canu:
   /home/ejunkins/canu-2.2/bin/canu
   canu 2.2

-- canu 2.2
--
-- CITATIONS
--
-- For assemblies of PacBio HiFi reads:
--   Nurk S, Walenz BP, Rhiea A, Vollger MR, Logsdon GA, Grothe R, Miga KH, Eichler EE, Phillippy AM, Koren S.
--   HiCanu: accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads.
--   biorXiv. 2020.
--   https://doi.org/10.1101/2020.03.14.992248
-- 
-- Read and contig alignments during correction and consensus use:
--   Šošic M, Šikic M.
--   Edlib: a C/C ++ library for fast, exact sequence alignment using edit distance.
--   Bioinformatics. 2017 May 1;33(9):1394-1395.
--   http://doi.org/10.1093/bioinformatics/btw753
-- 
-- Overlaps are generated using:
--   Berlin K, et al.
--   Assembling large genomes with single-molecule sequencing and locality-sensitive hashing.
--   Nat Biotechnol. 2015 Jun;33(6):623-30.
--   http://doi.org/10.1038/nbt.3238
-- 
--   Myers EW, et al.
--   A Whole-Genome Assembly of Drosophila.
--   Science. 2000 Mar 24;287(5461):2196-204.
--   http://doi.org/10.1126/science.287.5461.2196
-- 
-- Contig consensus sequences are generated using an algorithm derived from pbdagcon:
--   Chin CS, et al.
--   Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.
--   Nat Methods. 2013 Jun;10(6):563-9
--   http://doi.org/10.1038/nmeth.2474
-- 
-- CONFIGURE CANU
--
-- Detected Java(TM) Runtime Environment '1.8.0_171' (from 'java') with -d64 support.
--
-- WARNING:
-- WARNING:  Failed to run gnuplot using command 'gnuplot'.
-- WARNING:  Plots will be disabled.
-- WARNING:
--
--
-- Detected 1 CPUs and 1410000 gigabytes of memory on the local machine.
--
-- Detected Slurm with 'sinfo' binary in /usr/bin/sinfo.
-- Detected Slurm with task IDs up to 1000 allowed.
-- 
-- Slurm support detected.  Resources available:
--     11 hosts with  24 cores and  187 GB memory.
--      4 hosts with  40 cores and 1376 GB memory.
--     67 hosts with  40 cores and  187 GB memory.
--
--                         (tag)Threads
--                (tag)Memory         |
--        (tag)             |         |  algorithm
--        -------  ----------  --------  -----------------------------
-- Grid:  meryl     12.000 GB    4 CPUs  (k-mer counting)
-- Grid:  hap        8.000 GB    4 CPUs  (read-to-haplotype assignment)
-- Grid:  cormhap    6.000 GB    8 CPUs  (overlap detection with mhap)
-- Grid:  obtovl     4.000 GB    8 CPUs  (overlap detection)
-- Grid:  utgovl     4.000 GB    8 CPUs  (overlap detection)
-- Grid:  cor        -.--- GB    4 CPUs  (read correction)
-- Grid:  ovb        4.000 GB    1 CPU   (overlap store bucketizer)
-- Grid:  ovs        8.000 GB    1 CPU   (overlap store sorting)
-- Grid:  red       16.000 GB    4 CPUs  (read error detection)
-- Grid:  oea        8.000 GB    1 CPU   (overlap error adjustment)
-- Grid:  bat       16.000 GB    4 CPUs  (contig construction with bogart)
-- Grid:  cns        -.--- GB    4 CPUs  (consensus)
--
-- Found PacBio HiFi reads in 'LS01_001.seqStore':
--   Libraries:
--     PacBio HiFi:           1
--   Reads:
--     Corrected:             15775775005
--
--
-- Generating assembly 'LS01_001' in '/home/ejunkins/LS01_001_comparemethods/canu_out5':
--   genomeSize:
--     5000000
--
--   Overlap Generation Limits:
--     corOvlErrorRate 0.0000 (  0.00%)
--     obtOvlErrorRate 0.0250 (  2.50%)
--     utgOvlErrorRate 0.0100 (  1.00%)
--
--   Overlap Processing Limits:
--     corErrorRate    0.0000 (  0.00%)
--     obtErrorRate    0.0250 (  2.50%)
--     utgErrorRate    0.0003 (  0.03%)
--     cnsErrorRate    0.0500 (  5.00%)
--
--   Stages to run:
--     trim corrected reads.
--     assemble corrected and trimmed reads.
--
--
-- Correction skipped; not enabled.
--
-- BEGIN TRIMMING
-- Meryl finished successfully.  Kmer frequency histogram:
--
-- WARNING: gnuplot failed.
--
----------------------------------------
--
--  22-mers                                                                                           Fraction
--    Occurrences   NumMers                                                                         Unique Total
--       1-     1         0                                                                        0.0000 0.0000
--       2-     2 386913834 ********************************************************************** 0.4276 0.0771
--       3-     7 376482552 ********************************************************************   0.6139 0.1274
--       8-    16  83264287 ***************                                                        0.8648 0.2459
--      17-    29  20412775 ***                                                                    0.9391 0.3245
--      30-    46   9941945 *                                                                      0.9590 0.3645
--      47-    67   6078892 *                                                                      0.9697 0.4017
--      68-    92   3992616                                                                        0.9763 0.4359
--      93-   121   3393714                                                                        0.9805 0.4653
--     122-   154   4526761                                                                        0.9843 0.5038
--     155-   191    495318                                                                        0.9892 0.5628
--     192-   232    199281                                                                        0.9897 0.5705
--     233-   277    682960                                                                        0.9899 0.5747
--     278-   326   1902755                                                                        0.9907 0.5933
--     327-   379   1364404                                                                        0.9928 0.6504
--     380-   436   1543101                                                                        0.9943 0.6980
--     437-   497   2825750                                                                        0.9960 0.7624
--     498-   562    365296                                                                        0.9991 0.8921
--     563-   631     60892                                                                        0.9995 0.9098
--     632-   704     38268                                                                        0.9996 0.9134
--     705-   781     37126                                                                        0.9996 0.9159
--     782-   862     49369                                                                        0.9996 0.9187
--     863-   947     48639                                                                        0.9997 0.9228
--     948-  1036     20167                                                                        0.9998 0.9271
--    1037-  1129     13975                                                                        0.9998 0.9290
--    1130-  1226     12953                                                                        0.9998 0.9306
--    1227-  1327     14918                                                                        0.9998 0.9321
--    1328-  1432     10563                                                                        0.9998 0.9340
--    1433-  1541      8414                                                                        0.9998 0.9354
--    1542-  1654      4439                                                                        0.9998 0.9367
--    1655-  1771      7415                                                                        0.9998 0.9374
--    1772-  1892      6160                                                                        0.9999 0.9386
--    1893-  2017      5893                                                                        0.9999 0.9397
--    2018-  2146      7953                                                                        0.9999 0.9409
--    2147-  2279      5025                                                                        0.9999 0.9425
--    2280-  2416      4032                                                                        0.9999 0.9437
--    2417-  2557      4772                                                                        0.9999 0.9446
--    2558-  2702      3419                                                                        0.9999 0.9458
--    2703-  2851      5382                                                                        0.9999 0.9467
--    2852-  3004      4859                                                                        0.9999 0.9482
--    3005-  3161      3349                                                                        0.9999 0.9496
--
--           0 (max occurrences)
-- 10039868242 (total mers, non-unique)
--   904843411 (distinct mers, non-unique)
--           0 (unique mers)
-- Finished stage 'meryl-process', reset canuIteration.
--
-- Removing meryl database 'trimming/0-mercounts/LS01_001.ms22'.
--
-- OVERLAPPER (normal) (trimming) erate=0.025
--
----------------------------------------
-- Starting command on Wed Sep 27 13:53:52 2023 with 69471.029 GB free disk space

    cd trimming/1-overlapper
    /home/ejunkins/canu-2.2/bin/overlapInCorePartition \
     -S  ../../LS01_001.seqStore \
     -hl 80000000 \
     -rl 1000000000 \
     -ol 500 \
     -o  ./LS01_001.partition \
    > ./LS01_001.partition.err 2>&1

-- Finished on Wed Sep 27 13:53:52 2023 (lickety-split) with 69471.029 GB free disk space
----------------------------------------
--
-- Configured 918 overlapInCore jobs.
-- Finished stage 'obt-overlapConfigure', reset canuIteration.
--
-- Running jobs.  First attempt out of 2.
--
-- 'overlap.jobSubmit-01.sh' -> job 3239876 tasks 1-100.
-- 'overlap.jobSubmit-02.sh' -> job 3239877 tasks 101-200.
-- Failed to submit compute jobs.  Delay 10 seconds and try again.

CRASH:
CRASH: canu 2.2
CRASH: Please panic, this is abnormal.
CRASH:
CRASH:   Failed to submit compute jobs.
CRASH:
CRASH: Failed at /home/ejunkins/canu-2.2/bin/../lib/site_perl/canu/Execution.pm line 1259.
CRASH:  canu::Execution::submitOrRunParallelJob('LS01_001', 'obtovl', 'trimming/1-overlapper', 'overlap', 1, 2, 3, 4, 5, ...) called at /home/ejunkins/canu-2.2/bin/../lib/site_perl/canu/OverlapInCore.pm line 394
CRASH:  canu::OverlapInCore::overlapCheck('LS01_001', 'obt', 'partial') called at /home/ejunkins/canu-2.2/bin/canu line 1012
CRASH:  main::overlap('LS01_001', 'obt') called at /home/ejunkins/canu-2.2/bin/canu line 1103
CRASH: 
CRASH: Last 50 lines of the relevant log file (trimming/1-overlapper/overlap.jobSubmit-03.out):
CRASH:
CRASH: sbatch: error: Batch job submission failed: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)
CRASH:

This seems like a cluster issue, but previous use of the gridEngine flag seemed to work.

These are the contents of my assembly folder: canu-logs canu.out canu-scripts LS01_001.report LS01_001.seqStore LS01_001.seqStore.err LS01_001.seqStore.sh trimming

skoren commented 11 months ago

This looks like an issue with your grid system not allowing the job submission. I'm guessing there's a limit on the total number of jobs you're allowed to submit; see issues #1752 and #1883 for more details on these limits. If this limit cannot be increased, you'll have to follow the suggestion in #1883: use useGrid=remote and manually run the maximum number of jobs you're allowed at a time.
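To make the manual approach concrete, here is a rough planning sketch in Python. It splits the 918 configured overlap jobs into array ranges of at most 100 tasks (matching gridEngineArrayMaxJobs=100) and groups them into waves that fit under a submit limit. The limit of 200 tasks is an assumption, inferred only from the log, where two arrays were accepted and the third was rejected; the real value is set by your cluster's QOS policy. The function names are hypothetical and not part of canu.

```python
def array_batches(total_tasks, max_array_size):
    """Split task IDs 1..total_tasks into inclusive (first, last) ranges."""
    return [
        (first, min(first + max_array_size - 1, total_tasks))
        for first in range(1, total_tasks + 1, max_array_size)
    ]


def waves(batches, max_tasks_in_queue):
    """Group ranges into waves whose task count fits the assumed submit limit."""
    plan, current, used = [], [], 0
    for first, last in batches:
        size = last - first + 1
        # Start a new wave when adding this range would exceed the limit.
        if current and used + size > max_tasks_in_queue:
            plan.append(current)
            current, used = [], 0
        current.append((first, last))
        used += size
    if current:
        plan.append(current)
    return plan


batches = array_batches(918, 100)  # 918 overlapInCore jobs, arrays of 100
plan = waves(batches, 200)         # 200 is an assumed limit, not a known one

for number, wave in enumerate(plan, start=1):
    ranges = ", ".join(f"{first}-{last}" for first, last in wave)
    print(f"wave {number}: tasks {ranges}")
```

Each wave would be submitted only after the previous one has finished, so the number of queued tasks stays under the limit.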

skoren commented 11 months ago

Idle, incompatible grid setup.