marbl / canu

A single molecule sequence assembler for genomes large and small.
http://canu.readthedocs.io/

No genome assembly generated #862

Closed zhuzhuo closed 6 years ago

zhuzhuo commented 6 years ago

Hello Sergey,

My canu run doesn't output a genome assembly; it only generated the correction folder. It seems to me that trimming and assembly were aborted, but no error was thrown in the log file.

This is my command: canu -d lbv -p lbv genomeSize=600m -nanopore-raw ./basecalled/workspace/pass/lbv.fastq.gz

I found in the log file that the coverage is only 1.93x:

-- In gatekeeper store './lbv.gkpStore':
--   Found 320296 reads.
--   Found 1161644503 bases (1.93 times coverage).

Is this why canu didn't continue to create the genome assembly?

Thank you for your time.

Best, Zhu

skoren commented 6 years ago

1.93x is not enough to get an assembly, so yes, that's probably the cause. What does the full report file (lbv.report) say, and what did the canu output say before exiting?

zhuzhuo commented 6 years ago

Thanks for the reply Sergey. Here is the report file:

> [CORRECTION/READS]
> --
> -- In gatekeeper store './lbv.gkpStore':
> --   Found 320296 reads.
> --   Found 1161644503 bases (1.93 times coverage).
> --
> --   Read length histogram (one '*' equals 1773.45 reads):
> --        0    999      0 
> --     1000   1999 124142 **********************************************************************
> --     2000   2999  78777 ********************************************
> --     3000   3999   8554 ****
> --     4000   4999  33540 ******************
> --     5000   5999  22503 ************
> --     6000   6999  15460 ********
> --     7000   7999  10528 *****
> --     8000   8999   7385 ****
> --     9000   9999   5227 **
> --    10000  10999   3745 **
> --    11000  11999   2711 *
> --    12000  12999   1951 *
> --    13000  13999   1457 
> --    14000  14999   1182 
> --    15000  15999    802 
> --    16000  16999    591 
> --    17000  17999    464 
> --    18000  18999    310 
> --    19000  19999    239 
> --    20000  20999    179 
> --    21000  21999    129 
> --    22000  22999    101 
> --    23000  23999     80 
> --    24000  24999     55 
> --    25000  25999     42 
> --    26000  26999     36 
> --    27000  27999     27 
> --    28000  28999     19 
> --    29000  29999     11 
> --    30000  30999     12 
> --    31000  31999      4 
> --    32000  32999      7 
> --    33000  33999      6 
> --    34000  34999      2 
> --    35000  35999      3 
> --    36000  36999      3 
> --    37000  37999      3 
> --    38000  38999      0 
> --    39000  39999      0 
> --    40000  40999      1 
> --    41000  41999      0 
> --    42000  42999      2 
> --    43000  43999      0 
> --    44000  44999      0 
> --    45000  45999      0 
> --    46000  46999      2 
> --    47000  47999      0 
> --    48000  48999      0 
> --    49000  49999      0 
> --    50000  50999      0 
> --    51000  51999      1 
> --    52000  52999      0 
> --    53000  53999      0 
> --    54000  54999      0 
> --    55000  55999      2 
> --    56000  56999      0 
> --    57000  57999      1
> 
> [CORRECTION/MERS]
> --
> --  16-mers                                                                                           Fraction
> --    Occurrences   NumMers                                                                         Unique Total
> --       1-     1 317747818 *******************************************************************--> 0.6362 0.2747
> --       2-     2  89613818 ********************************************************************** 0.8156 0.4296
> --       3-     4  54628932 ******************************************                             0.8886 0.5241
> --       5-     7  20944101 ****************                                                       0.9457 0.6316
> --       8-    11   8052438 ******                                                                 0.9729 0.7109
> --      12-    16   3593734 **                                                                     0.9851 0.7647
> --      17-    22   1830377 *                                                                      0.9911 0.8026
> --      23-    29   1014776                                                                        0.9943 0.8305
> --      30-    37    599595                                                                        0.9962 0.8517
> --      38-    46    377946                                                                        0.9973 0.8680
> --      47-    56    250649                                                                        0.9980 0.8812
> --      57-    67    172017                                                                        0.9985 0.8919
> --      68-    79    122693                                                                        0.9988 0.9009
> --      80-    92     90934                                                                        0.9990 0.9084
> --      93-   106     70141                                                                        0.9992 0.9151
> --     107-   121     56334                                                                        0.9994 0.9210
> --     122-   137     45687                                                                        0.9995 0.9265
> --     138-   154     37415                                                                        0.9996 0.9315
> --     155-   172     29634                                                                        0.9996 0.9362
> --     173-   191     24309                                                                        0.9997 0.9403
> --     192-   211     19570                                                                        0.9997 0.9441
> --     212-   232     16145                                                                        0.9998 0.9475
> --     233-   254     13286                                                                        0.9998 0.9506
> --     255-   277     11161                                                                        0.9998 0.9533
> --     278-   301      9689                                                                        0.9999 0.9559
> --     302-   326      7856                                                                        0.9999 0.9583
> --     327-   352      7017                                                                        0.9999 0.9604
> --     353-   379      5825                                                                        0.9999 0.9625
> --     380-   407      5017                                                                        0.9999 0.9643
> --     408-   436      4389                                                                        0.9999 0.9660
> --     437-   466      3895                                                                        0.9999 0.9676
> --     467-   497      3413                                                                        0.9999 0.9691
> --     498-   529      3050                                                                        1.0000 0.9705
> --     530-   562      2680                                                                        1.0000 0.9719
> --     563-   596      2477                                                                        1.0000 0.9731
> --     597-   631      2179                                                                        1.0000 0.9744
> --     632-   667      1784                                                                        1.0000 0.9755
> --     668-   704      1575                                                                        1.0000 0.9765
> --     705-   742      1296                                                                        1.0000 0.9775
> --     743-   781      1039                                                                        1.0000 0.9783
> --     782-   821       838                                                                        1.0000 0.9789
> --
> --      554147 (max occurrences)
> --   839092245 (total mers, non-unique)
> --   181687509 (distinct mers, non-unique)
> --   317747818 (unique mers)

Canu.out:

> -- In 'lbv.gkpStore', found Nanopore reads:
> --   Raw:        320296
> --   Corrected:  0
> --   Trimmed:    0
> --
> -- Generating assembly 'lbv' in '/mnt/maximus/data1/genome_science/minION/lbv'
> --
> -- Parameters:
> --
> --  genomeSize        600000000
> --
> --  Overlap Generation Limits:
> --    corOvlErrorRate 0.3200 ( 32.00%)
> --    obtOvlErrorRate 0.1440 ( 14.40%)
> --    utgOvlErrorRate 0.1440 ( 14.40%)
> --
> --  Overlap Processing Limits:
> --    corErrorRate    0.5000 ( 50.00%)
> --    obtErrorRate    0.1440 ( 14.40%)
> --    utgErrorRate    0.1440 ( 14.40%)
> --    cnsErrorRate    0.1920 ( 19.20%)
> --
> --
> -- BEGIN CORRECTION
> --
> -- All 14 mhap precompute jobs finished successfully.
> --
> -- Running jobs.  First attempt out of 2.
> --
> -- 'mhap.jobSubmit-01.sh' -> job 17576 tasks 1-50.
> --
> ----------------------------------------
> -- Starting command on Fri Apr  6 15:41:57 2018 with 22228.834 GB free disk space
> 
>     cd /mnt/maximus/data1/genome_science/minION/lbv
>     sbatch \
>       --depend=afterany:17576 \
>       --mem-per-cpu=4g \
>       --cpus-per-task=1   \
>       -D `pwd` \
>       -J 'canu_lbv' \
>       -o canu-scripts/canu.03.out canu-scripts/canu.03.sh
> Submitted batch job 17577
> 
> -- Finished on Fri Apr  6 15:41:57 2018 (lickety-split) with 22228.834 GB free disk space

It didn't report any error.

skoren commented 6 years ago

That looks like it's still running; it submitted itself as a job, and you should be able to see those jobs in your queue. So there's no error, it's still running. However, I don't expect very good results given ~2x coverage; at best you'll assemble some repeats or maybe a mitochondrion.

zhuzhuo commented 6 years ago

I don't think it is running; I don't have any jobs in the queue.

skoren commented 6 years ago

Then there should be something in canu-scripts/canu.03.out and correction/1-overlapper/mhap.*.out.
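
A quick way to pull the relevant lines out of those logs (paths taken from this run's layout; the grep pattern is just a guess at common failure strings, not something canu guarantees to print):

```shell
# Show the tail of each canu driver log, if any exist.
for f in canu-scripts/canu.*.out; do
    [ -e "$f" ] && tail -n 20 "$f"
done

# Scan the per-job mhap logs for likely failure messages.
grep -iE 'error|fail|kill|memory' correction/1-overlapper/mhap.*.out 2>/dev/null || true
```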

zhuzhuo commented 6 years ago

I did a correction-only run; now I get this:

 -- Mhap overlap jobs failed, tried 2 times, giving up.
 --   job correction/1-overlapper/results/000016.ovb FAILED.
 --   job correction/1-overlapper/results/000017.ovb FAILED.
 --

 ABORT:
 ABORT: Canu 1.7
 ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
 ABORT: Try restarting.  If that doesn't work, ask for help.
 ABORT:
skoren commented 6 years ago

What's the output in correction/1-overlapper/*16*out?

zhuzhuo commented 6 years ago

I don't have canu.03.out, and canu.02.out is the same as canu.out. In correction/1-overlapper/*16*out, I found this:

slurmstepd-biomix29: error: Exceeded step memory limit at some point.

So maybe a memory problem? The memory I requested was 128000M.

skoren commented 6 years ago

How did you specify memory? Canu auto-requests the correct memory based on your grid, so if you're manually overriding that, the request may not be correct.

zhuzhuo commented 6 years ago

We are using the Slurm system; I did this:

#SBATCH --ntasks=8 #cores or threads
#SBATCH --mem=128000

Yeah, canu did a scan:

-- CONFIGURE CANU
--
-- Detected Java(TM) Runtime Environment '1.8.0_141' (from 'java').
-- Detected gnuplot version '5.0 patchlevel 3' (from 'gnuplot') and image format 'png'.
-- Detected 72 CPUs and 504 gigabytes of memory.
-- Detected Slurm with 'sinfo' binary in /usr/local/bin/sinfo.
-- Detected Slurm with 'MaxArraySize' limited to 1000 jobs.
-- 
-- Found   1 host  with  72 cores and  501 GB memory under Slurm control.
-- Found   1 host  with  48 cores and  376 GB memory under Slurm control.
-- Found   1 host  with  64 cores and  755 GB memory under Slurm control.
-- Found   8 hosts with  32 cores and  250 GB memory under Slurm control.
-- Found   1 host  with  16 cores and  125 GB memory under Slurm control.
-- Found   1 host  with  24 cores and   62 GB memory under Slurm control.
-- Found   3 hosts with  48 cores and  501 GB memory under Slurm control.
-- Found   1 host  with  24 cores and  250 GB memory under Slurm control.
-- Found   1 host  with  32 cores and   99 GB memory under Slurm control.
-- Found   2 hosts with  12 cores and  125 GB memory under Slurm control.
-- Found   1 host  with  80 cores and   99 GB memory under Slurm control.
-- Found   2 hosts with  48 cores and  250 GB memory under Slurm control.
-- Found   1 host  with  40 cores and   37 GB memory under Slurm control.
-- Found   1 host  with  12 cores and  250 GB memory under Slurm control.
-- Found   1 host  with  96 cores and  201 GB memory under Slurm control.
-- Found   1 host  with  16 cores and   62 GB memory under Slurm control.
-- Found   1 host  with  24 cores and  188 GB memory under Slurm control.
--
--                     (tag)Threads
--            (tag)Memory         |
--        (tag)         |         |  algorithm
--        -------  ------  --------  -----------------------------
-- Grid:  meryl     64 GB   16 CPUs  (k-mer counting)
-- Grid:  cormhap   16 GB    8 CPUs  (overlap detection with mhap)
-- Grid:  obtovl    16 GB    8 CPUs  (overlap detection)
-- Grid:  utgovl    16 GB    8 CPUs  (overlap detection)
-- Grid:  ovb        2 GB    1 CPU   (overlap store bucketizer)
-- Grid:  ovs       16 GB    1 CPU   (overlap store sorting)
-- Grid:  red        4 GB    4 CPUs  (read error detection)
-- Grid:  oea        4 GB    1 CPU   (overlap error adjustment)
-- Grid:  bat      256 GB   16 CPUs  (contig construction)
-- Grid:  gfa       16 GB   16 CPUs  (GFA alignment and processing)

I'm not sure if the memory I specified would cause problems, but it seems the memory canu configured didn't exceed what I requested.

skoren commented 6 years ago

Can you upload the mhap.sh and mhap.jobSubmit-01.sh scripts? Those will have the explicit requests.

Also, exceeding the memory limit at some point doesn't mean the job was killed. What is the full log, and what is the job history of that job (sacct -j &lt;JOBID&gt;)?
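
A sketch of that accounting query (job ID 17576 is the mhap array from the log above; MaxRSS shows the peak memory each task actually used, and the field names are standard sacct format fields):

```shell
# Query Slurm accounting for the mhap array tasks, if sacct is available.
if command -v sacct >/dev/null 2>&1; then
    sacct -j 17576 --format=JobID,State,ExitCode,MaxRSS,Elapsed
else
    echo "sacct not available on this host"
fi
```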

zhuzhuo commented 6 years ago

Yes mhap.sh.txt mhap.jobSubmit-01.sh.txt

zhuzhuo commented 6 years ago

Yes, you are right. I don't think it got killed; the job history says "COMPLETED". The full log is the same as canu.out.

skoren commented 6 years ago

Slurm sometimes doesn't report the cause of the exit. It might also be a timeout issue, since the majority of the jobs ran to completion and only tasks 16/17 had issues. I meant the full log of correction/1-overlapper/*16*out. You could try just re-launching Canu with more memory/time for the mhap step with gridOptionscormhap="--time=72:00:00 --mem=20g".
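
Put together with the original command from the top of this issue, the restart would look something like this (canu resumes from the existing lbv directory, and gridOptionscormhap passes the extra sbatch flags only to the cormhap jobs):

```shell
# Sketch of the suggested restart; identical to the original invocation
# except for the extra grid options for the mhap overlap step.
canu -d lbv -p lbv genomeSize=600m \
     gridOptionscormhap="--time=72:00:00 --mem=20g" \
     -nanopore-raw ./basecalled/workspace/pass/lbv.fastq.gz
```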

zhuzhuo commented 6 years ago

I'll try it. Thank you so much for the help! I really appreciate it.

Will update after I get the results.

zhuzhuo commented 6 years ago

Hi Sergey, I added gridOptionscormhap="--time=72:00:00 --mem=20g" to my canu correction run. Now I get "Overlap store sorting jobs failed, tried 2 times, giving up" and multiple *.ovlStore.BUILDING/* FAILED messages.

skoren commented 6 years ago

Probably I/O issues; there will again be more info in the individual log files (e.g. *.ovlStore.BUILDING/logs/*.out).
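
A one-liner to narrow down which of those store-build logs actually failed (glob pattern taken from the message above; the failure strings are only a guess at what the logs might contain):

```shell
# List overlap-store build logs that mention a likely failure.
grep -l -iE 'error|fail|abort|no space' *.ovlStore.BUILDING/logs/*.out 2>/dev/null || true
```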

skoren commented 6 years ago

This issue has drifted quite a bit from the initial question so open a new issue if you want to fix the store error. I am not sure how much effort you want to invest in this assembly given 2x coverage.