marbl / canu

A single molecule sequence assembler for genomes large and small.
http://canu.readthedocs.io/

Canu stopped #1828

Closed smf96 closed 3 years ago

smf96 commented 3 years ago

Hello,

I'm trying to assemble nanopore WGS reads for just a 200 kb region, using Canu release v2.0 on my university's HPC system.

This is the canu command I used:

 -p FCGR_reads -d assemble_FCGR_CRA125_Canu \
 genomeSize=200k \
 stopOnLowCoverage=1 \
 gnuplot=/local/software/biobuilds/2017.11/bin/gnuplot\
 java=/local/software/biobuilds/2017.11/bin/java\
 -nanopore 11_CRA125_FCGR_reads.fq 

I have reduced the threshold for stopping on low coverage because I've only done a few MinION runs so far. I know I need more coverage, but I wanted to see what the output would look like at the moment.
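For reference, the coverage figures in the report below are consistent with total bases divided by `genomeSize`. Here is a minimal Python sketch of that arithmetic, using the base counts copied from the three READS sections of the report. The function and variable names are illustrative only and are not part of Canu.

```python
def coverage(total_bases, genome_size):
    """Return fold coverage as total sequenced bases over genome size."""
    return total_bases / genome_size


# genomeSize=200k from the canu command above.
GENOME_SIZE = 200_000

# Base counts taken from the READS sections of the pasted report.
bases_per_stage = {
    "correction": 2394140,
    "trimming": 1853223,
    "unitigging": 1767141,
}

for stage, bases in bases_per_stage.items():
    print(f"{stage}: {coverage(bases, GENOME_SIZE):.3f}x")
```

The last printed digit may differ slightly from the report, which appears to cut the value to two decimals rather than round it.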

It has made the 'unitigging' directory but nothing more, and there are no more jobs in the queue.

I have pasted the report below.

[CORRECTION/READS]
--
-- In sequence store './FCGR_reads.seqStore':
--   Found 170 reads.
--   Found 2394140 bases (11.97 times coverage).
--
--    G=2394140                          sum of  ||               length     num
--    NG         length     index       lengths  ||                range    seqs
--    ----- ------------ --------- ------------  ||  ------------------- -------
--    00010        61423         2       249155  ||       1018-3132           49|---------------------------------------------------------------
--    00020        47847         7       518805  ||       3133-5247           19|-------------------------
--    00030        42264        12       738321  ||       5248-7362           15|--------------------
--    00040        37161        18       979083  ||       7363-9477           13|-----------------
--    00050        31096        25      1221961  ||       9478-11592          10|-------------
--    00060        22989        33      1444997  ||      11593-13707          11|---------------
--    00070        16196        45      1679306  ||      13708-15822           6|--------
--    00080        11593        63      1917639  ||      15823-17937           3|----
--    00090         6609        89      2155776  ||      17938-20052           5|-------
--    00100         1018       169      2394140  ||      20053-22167           3|----
--    001.000x                 170      2394140  ||      22168-24282           3|----
--                                               ||      24283-26397           2|---
--                                               ||      26398-28512           1|--
--                                               ||      28513-30627           3|----
--                                               ||      30628-32742           2|---
--                                               ||      32743-34857           2|---
--                                               ||      34858-36972           4|------
--                                               ||      36973-39087           1|--
--                                               ||      39088-41202           4|------
--                                               ||      41203-43317           4|------
--                                               ||      43318-45432           1|--
--                                               ||      45433-47547           1|--
--                                               ||      47548-49662           1|--
--                                               ||      49663-51777           2|---
--                                               ||      51778-53892           0|
--                                               ||      53893-56007           0|
--                                               ||      56008-58122           0|
--                                               ||      58123-60237           1|--
--                                               ||      60238-62352           2|---
--                                               ||      62353-64467           0|
--                                               ||      64468-66582           0|
--                                               ||      66583-68697           0|
--                                               ||      68698-70812           0|
--                                               ||      70813-72927           0|
--                                               ||      72928-75042           0|
--                                               ||      75043-77157           0|
--                                               ||      77158-79272           0|
--                                               ||      79273-81387           1|--
--                                               ||      81388-83502           0|
--                                               ||      83503-85617           0|
--                                               ||      85618-87732           0|
--                                               ||      87733-89847           0|
--                                               ||      89848-91962           0|
--                                               ||      91963-94077           0|
--                                               ||      94078-96192           0|
--                                               ||      96193-98307           0|
--                                               ||      98308-100422          0|
--                                               ||     100423-102537          0|
--                                               ||     102538-104652          0|
--                                               ||     104653-106767          1|--
--

[CORRECTION/MERS]
--
--  16-mers                                                                                           Fraction
--    Occurrences   NumMers                                                                         Unique Total
--       1-     1         0                                                                        0.0000 0.0000
--       2-     2     58517 ********************************************************************** 0.2982 0.1172
--       3-     4     49463 ***********************************************************            0.4367 0.1988
--       5-     7     48923 **********************************************************             0.6484 0.3845
--       8-    11     30735 ************************************                                   0.8544 0.6625
--      12-    16      7180 ********                                                               0.9722 0.8963
--      17-    22       718                                                                        0.9940 0.9563
--      23-    29       207                                                                        0.9968 0.9671
--      30-    37       147                                                                        0.9977 0.9719
--      38-    46        88                                                                        0.9984 0.9765
--      47-    56        67                                                                        0.9988 0.9801
--      57-    67        61                                                                        0.9991 0.9833
--      68-    79        37                                                                        0.9994 0.9872
--      80-    92        23                                                                        0.9996 0.9897
--      93-   106        15                                                                        0.9997 0.9917
--     107-   121         4                                                                        0.9998 0.9932
--     122-   137         9                                                                        0.9998 0.9938
--     138-   154        12                                                                        0.9999 0.9950
--     155-   172         7                                                                        0.9999 0.9968
--     173-   191         2                                                                        1.0000 0.9978
--     192-   211         2                                                                        1.0000 0.9981
--     212-   232         0                                                                        0.0000 0.0000
--     233-   254         0                                                                        0.0000 0.0000
--     255-   277         0                                                                        0.0000 0.0000
--     278-   301         1                                                                        1.0000 0.9986
--     302-   326         1                                                                        1.0000 0.9989
--
--           0 (max occurrences)
--      998645 (total mers, non-unique)
--      196220 (distinct mers, non-unique)
--           0 (unique mers)

[CORRECTION/LAYOUT]
--                             original      original
--                            raw reads     raw reads
--   category                w/overlaps  w/o/overlaps
--   -------------------- ------------- -------------
--   Number of Reads                137             0
--   Number of Bases            1955012             0
--   Coverage                     9.775         0.000
--   Median                        7614             0
--   Mean                         14270             0
--   N50                          35428             0
--   Minimum                       1018             0
--   Maximum                     106749             0
--
--                                        --------corrected---------  ----------rescued----------
--                             evidence                     expected                     expected
--   category                     reads            raw     corrected            raw     corrected
--   -------------------- -------------  ------------- -------------  ------------- -------------
--   Number of Reads                137            136           136              0             0
--   Number of Bases            1955012        1848263       1833720              0             0
--   Coverage                     9.775          9.241         9.169          0.000         0.000
--   Median                        7614           7614          7257              0             0
--   Mean                         14270          13590         13483              0             0
--   N50                          35428          32773         32772              0             0
--   Minimum                       1018           1018           993              0             0
--   Maximum                     106749          80983         80822              0             0
--
--                        --------uncorrected--------
--                                           expected
--   category                       raw     corrected
--   -------------------- ------------- -------------
--   Number of Reads                  1             1
--   Number of Bases             106749         95696
--   Coverage                     0.534         0.478
--   Median                      106749         95696
--   Mean                        106749         95696
--   N50                            188            62
--   Minimum                     106749         95696
--   Maximum                     106749         95696
--
--   Maximum Memory          1494907040

[TRIMMING/READS]
--
-- In sequence store './FCGR_reads.seqStore':
--   Found 136 reads.
--   Found 1853223 bases (9.26 times coverage).
--
--    G=1853223                          sum of  ||               length     num
--    NG         length     index       lengths  ||                range    seqs
--    ----- ------------ --------- ------------  ||  ------------------- -------
--    00010        60607         2       203695  ||        997-2613           36|---------------------------------------------------------------
--    00020        46804         6       411116  ||       2614-4230           14|-------------------------
--    00030        41807        10       584030  ||       4231-5847           12|---------------------
--    00040        39673        14       747137  ||       5848-7464            6|-----------
--    00050        31151        20       954807  ||       7465-9081            8|--------------
--    00060        23079        26      1119028  ||       9082-10698           6|-----------
--    00070        15205        36      1309852  ||      10699-12315           8|--------------
--    00080        11117        50      1488904  ||      12316-13932           6|-----------
--    00090         6604        71      1672580  ||      13933-15549           4|-------
--    00100          997       135      1853223  ||      15550-17166           2|----
--    001.000x                 136      1853223  ||      17167-18783           3|------
--                                               ||      18784-20400           0|
--                                               ||      20401-22017           3|------
--                                               ||      22018-23634           2|----
--                                               ||      23635-25251           0|
--                                               ||      25252-26868           2|----
--                                               ||      26869-28485           1|--
--                                               ||      28486-30102           0|
--                                               ||      30103-31719           3|------
--                                               ||      31720-33336           1|--
--                                               ||      33337-34953           1|--
--                                               ||      34954-36570           2|----
--                                               ||      36571-38187           1|--
--                                               ||      38188-39804           1|--
--                                               ||      39805-41421           3|------
--                                               ||      41422-43038           2|----
--                                               ||      43039-44655           1|--
--                                               ||      44656-46272           1|--
--                                               ||      46273-47889           1|--
--                                               ||      47890-49506           0|
--                                               ||      49507-51123           2|----
--                                               ||      51124-52740           0|
--                                               ||      52741-54357           0|
--                                               ||      54358-55974           0|
--                                               ||      55975-57591           0|
--                                               ||      57592-59208           1|--
--                                               ||      59209-60825           1|--
--                                               ||      60826-62442           1|--
--                                               ||      62443-64059           0|
--                                               ||      64060-65676           0|
--                                               ||      65677-67293           0|
--                                               ||      67294-68910           0|
--                                               ||      68911-70527           0|
--                                               ||      70528-72144           0|
--                                               ||      72145-73761           0|
--                                               ||      73762-75378           0|
--                                               ||      75379-76995           0|
--                                               ||      76996-78612           0|
--                                               ||      78613-80229           0|
--                                               ||      80230-81846           1|--
--

[TRIMMING/MERS]
--
--  22-mers                                                                                           Fraction
--    Occurrences   NumMers                                                                         Unique Total
--       1-     1         0                                                                        0.0000 0.0000
--       2-     2     26680 ********************************************************               0.1492 0.0331
--       3-     4     29205 *************************************************************          0.2409 0.0636
--       5-     7     33114 *********************************************************************  0.3725 0.1285
--       8-    11     31313 *****************************************************************      0.5443 0.2587
--      12-    16     33246 ********************************************************************** 0.7015 0.4367
--      17-    22     22956 ************************************************                       0.8929 0.7568
--      23-    29      1897 ***                                                                    0.9899 0.9662
--      30-    37       171                                                                        0.9979 0.9893
--      38-    46        75                                                                        0.9989 0.9927
--      47-    56        70                                                                        0.9992 0.9945
--      57-    67        32                                                                        0.9996 0.9966
--      68-    79        25                                                                        0.9998 0.9978
--      80-    92         6                                                                        0.9999 0.9990
--      93-   106         1                                                                        1.0000 0.9992
--     107-   121         0                                                                        0.0000 0.0000
--     122-   137         5                                                                        1.0000 0.9993
--     138-   154         0                                                                        0.0000 0.0000
--     155-   172         0                                                                        0.0000 0.0000
--     173-   191         2                                                                        1.0000 0.9998
--     192-   211         1                                                                        1.0000 1.0000
--
--           0 (max occurrences)
--     1613346 (total mers, non-unique)
--      178799 (distinct mers, non-unique)
--           0 (unique mers)

[TRIMMING/TRIMMING]
--  PARAMETERS:
--  ----------
--     1000    (reads trimmed below this many bases are deleted)
--   0.1200    (use overlaps at or below this fraction error)
--      500    (break region if overlap is less than this long, for 'largest covered' algorithm)
--        2    (break region if overlap coverage is less than this many reads, for 'largest covered' algorithm)
--
--  INPUT READS:
--  -----------
--     137 reads      1853223 bases (reads processed)
--       0 reads            0 bases (reads not processed, previously deleted)
--       0 reads            0 bases (reads not processed, in a library where trimming isn't allowed)
--
--  OUTPUT READS:
--  ------------
--     129 reads      1750940 bases (trimmed reads output)
--       1 reads        16201 bases (reads with no change, kept as is)
--       2 reads        13593 bases (reads with no overlaps, deleted)
--       5 reads         7193 bases (reads with short trimmed length, deleted)
--
--  TRIMMING DETAILS:
--  ----------------
--     107 reads        52871 bases (bases trimmed from the 5' end of a read)
--     117 reads        12425 bases (bases trimmed from the 3' end of a read)

[TRIMMING/SPLITTING]
--  PARAMETERS:
--  ----------
--     1000    (reads trimmed below this many bases are deleted)
--   0.1200    (use overlaps at or below this fraction error)
--  INPUT READS:
--  -----------
--     130 reads      1832437 bases (reads processed)
--       7 reads        20786 bases (reads not processed, previously deleted)
--       0 reads            0 bases (reads not processed, in a library where trimming isn't allowed)
--
--  PROCESSED:
--  --------
--       0 reads            0 bases (no overlaps)
--       0 reads            0 bases (no coverage after adjusting for trimming done already)
--       0 reads            0 bases (processed for chimera)
--       0 reads            0 bases (processed for spur)
--     130 reads      1832437 bases (processed for subreads)
--
--  READS WITH SIGNALS:
--  ------------------
--       0 reads            0 signals (number of 5' spur signal)
--       0 reads            0 signals (number of 3' spur signal)
--       0 reads            0 signals (number of chimera signal)
--       0 reads            0 signals (number of subread signal)
--
--  SIGNALS:
--  -------
--       0 reads            0 bases (size of 5' spur signal)
--       0 reads            0 bases (size of 3' spur signal)
--       0 reads            0 bases (size of chimera signal)
--       0 reads            0 bases (size of subread signal)
--
--  TRIMMING:
--  --------
--       0 reads            0 bases (trimmed from the 5' end of the read)
--       0 reads            0 bases (trimmed from the 3' end of the read)

[UNITIGGING/READS]
--
-- In sequence store './FCGR_reads.seqStore':
--   Found 130 reads.
--   Found 1767141 bases (8.83 times coverage).
--
--    G=1767141                          sum of  ||               length     num
--    NG         length     index       lengths  ||                range    seqs
--    ----- ------------ --------- ------------  ||  ------------------- -------
--    00010        58865         2       199398  ||       1002-2580           34|---------------------------------------------------------------
--    00020        44130         6       390542  ||       2581-4159           13|-------------------------
--    00030        41201        10       559351  ||       4160-5738           11|---------------------
--    00040        35741        14       713157  ||       5739-7317            5|----------
--    00050        30934        20       911047  ||       7318-8896            8|---------------
--    00060        22908        26      1071036  ||       8897-10475           6|------------
--    00070        16201        35      1245739  ||      10476-12054           8|---------------
--    00080        11087        48      1413934  ||      12055-13633           5|----------
--    00090         6856        68      1593738  ||      13634-15212           4|--------
--    00100         1002       129      1767141  ||      15213-16791           1|--
--    001.000x                 130      1767141  ||      16792-18370           2|----
--                                               ||      18371-19949           2|----
--                                               ||      19950-21528           1|--
--                                               ||      21529-23107           4|--------
--                                               ||      23108-24686           0|
--                                               ||      24687-26265           3|------
--                                               ||      26266-27844           0|
--                                               ||      27845-29423           1|--
--                                               ||      29424-31002           2|----
--                                               ||      31003-32581           1|--
--                                               ||      32582-34160           2|----
--                                               ||      34161-35739           2|----
--                                               ||      35740-37318           2|----
--                                               ||      37319-38897           0|
--                                               ||      38898-40476           1|--
--                                               ||      40477-42055           3|------
--                                               ||      42056-43634           2|----
--                                               ||      43635-45213           1|--
--                                               ||      45214-46792           1|--
--                                               ||      46793-48371           0|
--                                               ||      48372-49950           0|
--                                               ||      49951-51529           2|----
--                                               ||      51530-53108           0|
--                                               ||      53109-54687           0|
--                                               ||      54688-56266           0|
--                                               ||      56267-57845           0|
--                                               ||      57846-59424           1|--
--                                               ||      59425-61003           1|--
--                                               ||      61004-62582           0|
--                                               ||      62583-64161           0|
--                                               ||      64162-65740           0|
--                                               ||      65741-67319           0|
--                                               ||      67320-68898           0|
--                                               ||      68899-70477           0|
--                                               ||      70478-72056           0|
--                                               ||      72057-73635           0|
--                                               ||      73636-75214           0|
--                                               ||      75215-76793           0|
--                                               ||      76794-78372           0|
--                                               ||      78373-79951           1|--
--

[UNITIGGING/MERS]
--
--  22-mers                                                                                           Fraction
--    Occurrences   NumMers                                                                         Unique Total
--       1-     1         0                                                                        0.0000 0.0000
--       2-     2     25476 *****************************************************                  0.1438 0.0318
--       3-     4     29069 ************************************************************           0.2355 0.0622
--       5-     7     33293 *********************************************************************  0.3681 0.1275
--       8-    11     31098 *****************************************************************      0.5410 0.2579
--      12-    16     33375 ********************************************************************** 0.7002 0.4378
--      17-    22     22768 ***********************************************                        0.8952 0.7628
--      23-    29      1761 ***                                                                    0.9900 0.9667
--      30-    37       165                                                                        0.9980 0.9894
--      38-    46        77                                                                        0.9988 0.9927
--      47-    56        69                                                                        0.9992 0.9946
--      57-    67        30                                                                        0.9996 0.9968
--      68-    79        25                                                                        0.9998 0.9978
--      80-    92         6                                                                        0.9999 0.9990
--      93-   106         1                                                                        1.0000 0.9993
--     107-   121         0                                                                        0.0000 0.0000
--     122-   137         5                                                                        1.0000 0.9993
--     138-   154         0                                                                        0.0000 0.0000
--     155-   172         1                                                                        1.0000 0.9998
--     173-   191         1                                                                        1.0000 0.9999
--     192-   211         1                                                                        1.0000 1.0000
--
--           0 (max occurrences)
--     1603518 (total mers, non-unique)
--      177221 (distinct mers, non-unique)
--           0 (unique mers)

[UNITIGGING/OVERLAPS]
--   category            reads     %          read length        feature size or coverage  analysis
--   ----------------  -------  -------  ----------------------  ------------------------  --------------------
--   middle-missing          0    0.00        0.00 +- 0.00             0.00 +- 0.00       (bad trimming)
--   middle-hump             0    0.00        0.00 +- 0.00             0.00 +- 0.00       (bad trimming)
--   no-5-prime              2    1.54    37206.00 +- 5649.78          2.00 +- 2.83       (bad trimming)
--   no-3-prime              0    0.00        0.00 +- 0.00             0.00 +- 0.00       (bad trimming)
--
--   low-coverage            0    0.00        0.00 +- 0.00             0.00 +- 0.00       (easy to assemble, potential for lower quality consensus)
--   unique                 29   22.31     7091.86 +- 7747.27          7.83 +- 2.30       (easy to assemble, perfect, yay)
--   repeat-cont            48   36.92     5735.62 +- 6452.83         17.44 +- 2.13       (potential for consensus errors, no impact on assembly)
--   repeat-dove             0    0.00        0.00 +- 0.00             0.00 +- 0.00       (hard to assemble, likely won't assemble correctly or even at all)
--
--   span-repeat            10    7.69    28521.60 +- 25788.45     18028.20 +- 24306.72   (read spans a large repeat, usually easy to assemble)
--   uniq-repeat-cont       29   22.31    16700.83 +- 12656.01                            (should be uniquely placed, low potential for consensus errors, no impact on assembly)
--   uniq-repeat-dove        3    2.31    49760.67 +- 9567.53                             (will end contigs, potential to misassemble)
--   uniq-anchor             9    6.92    32548.11 +- 12260.16     11788.67 +- 10348.52   (repeat read, with unique section, probable bad read)

[UNITIGGING/ADJUSTMENT]
-- No report available.

[UNITIGGING/CONTIGS]
-- Found, in version 1, after unitig construction:
--   contigs:      1 sequences, total length 140146 bp (including 0 repeats of total length 0 bp).
--   bubbles:      1 sequences, total length 68768 bp.
--   unassembled:  2 sequences, total length 102356 bp.
--
-- Contig sizes based on genome size 200kbp:
--
--            NG (bp)  LG (contigs)    sum (bp)
--         ----------  ------------  ----------
--     10      140146             1      140146
--     20      140146             1      140146
--     30      140146             1      140146
--     40      140146             1      140146
--     50      140146             1      140146
--     60      140146             1      140146
--     70      140146             1      140146
--

[UNITIGGING/CONSENSUS]
-- Found, in version 2, after consensus generation:
--   contigs:      1 sequences, total length 140177 bp (including 0 repeats of total length 0 bp).
--   bubbles:      1 sequences, total length 68791 bp.
--   unassembled:  2 sequences, total length 102356 bp.
--
-- Contig sizes based on genome size 200kbp:
--
--            NG (bp)  LG (contigs)    sum (bp)
--         ----------  ------------  ----------
--     10      140177             1      140177
--     20      140177             1      140177
--     30      140177             1      140177
--     40      140177             1      140177
--     50      140177             1      140177
--     60      140177             1      140177
--     70      140177             1      140177
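The NG table above stops at 70 because the single 140,177 bp contig covers just over 70% of the 200 kbp genome size. The following is a minimal Python sketch of the usual NG definition (the length of the contig at which the running sum of sorted contig lengths reaches x% of the genome size). It is an illustration, not Canu's own code, and `ng_table` is a hypothetical helper name.

```python
def ng_table(contig_lengths, genome_size, steps=range(10, 101, 10)):
    """Return rows of (x, NGx length, LGx contig count, cumulative sum).

    Rows are only produced for thresholds the contigs actually reach,
    so the table ends early when the assembly is shorter than the genome.
    """
    lengths = sorted(contig_lengths, reverse=True)
    rows = []
    total = 0  # running sum of contig lengths consumed so far
    used = 0   # number of contigs consumed so far
    for x in steps:
        target = genome_size * x / 100
        # Consume contigs, longest first, until the target is reached.
        while used < len(lengths) and total < target:
            total += lengths[used]
            used += 1
        if total < target:
            break  # not enough assembled sequence for this threshold
        rows.append((x, lengths[used - 1], used, total))
    return rows


if __name__ == "__main__":
    # One contig of 140177 bp against genomeSize=200k, as in the report.
    for x, ng, lg, cumulative in ng_table([140177], 200_000):
        print(f"{x:>6} {ng:>11} {lg:>13} {cumulative:>11}")
```

With these inputs the rows run from 10 to 70 only, since 80% of 200 kbp is 160,000 bp and the contig is shorter than that.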

There is also a strange canu out -> canu-scripts/canu.12.out with the latter flashing.

Any insight on what has gone wrong would be much appreciated! Thank you

skoren commented 3 years ago

This looks like everything is complete, what is in the assembly folder and unitigging folders?

smf96 commented 3 years ago

Hello,

I realised I hadn't removed all the contents of the folder from when I had tried this out before, so it still contained all the .fasta and .gfa files from a previous attempt with less data. Alternatively, our university's HPC system may have gone offline momentarily. I re-ran my script without changing anything, as I understand Canu can resume incomplete assemblies by examining what's in the assembly directory. This has successfully produced the new output I was expecting, but I am still confused about a couple of things.

Firstly, why is the canu out -> canu-scripts/canu.12.out still flashing? What does that mean if the assembly has now finished? I have attached a video below to show you.

canu_assemblyFolder

Also from that you can see the .report wasn't changed from the 24th after I re-ran on the 26th of October. So that's the same report I copied in at the top of this issue which says 1 contig of 140,177bp was generated, but in the contigs fasta file there are two sequences, 1 of 140,917bp and another of 39,609bp. Where do these differences come from?

Thank you so much for your input

skoren commented 3 years ago
  1. The flashing means the symlink is invalid. At some point canu.12.out was removed, or the link was created incorrectly when you re-ran. Either way it doesn't affect the assembly, as this is just a log file link for convenience.

  2. The report didn't get updated because consensus had already run during the previous run. The last run only wrote the result files from the pre-existing databases, which doesn't change the report. You can see you have 1 contig and 1 bubble. Those are the two sequences you see in the contigs.fasta file; the header lines should identify them as such as well.
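Both points can be checked from the command line. The Python sketch below tests whether a symlink is dangling and lists each FASTA record's header with its sequence length. The helper names are hypothetical, the link name `canu.out` is an assumption based on the description above, and nothing is assumed about the header format, which is printed as-is.

```python
import os


def is_dangling_symlink(path):
    """Return True if `path` is a symlink whose target does not exist."""
    # islink() inspects the link itself; exists() follows it to the target.
    return os.path.islink(path) and not os.path.exists(path)


def fasta_records(path):
    """Yield (header, sequence_length) for each record in a FASTA file."""
    header, length = None, 0
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if header is not None:
                    yield header, length
                header, length = line[1:], 0
            elif header is not None:
                length += len(line)
    if header is not None:
        yield header, length


if __name__ == "__main__":
    # Paths built from the -d and -p options in the original command;
    # the link name canu.out is an assumption.
    assembly_dir = "assemble_FCGR_CRA125_Canu"
    link = os.path.join(assembly_dir, "canu.out")
    contigs = os.path.join(assembly_dir, "FCGR_reads.contigs.fasta")

    print(f"{link} dangling: {is_dangling_symlink(link)}")
    if os.path.exists(contigs):
        for header, length in fasta_records(contigs):
            print(f"{length:>10} bp  {header}")
```

Reading the printed header lines next to the lengths should show which sequence is flagged as the contig and which as the bubble.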

smf96 commented 3 years ago

Brilliant, thanks very much for taking the time to explain!