Closed: zhuzhuo closed this issue 6 years ago
1.6x is not enough to get an assembly, so yes, that's probably the cause. What does the full report file (lbv.report) say, and what did the canu output say before exiting?
Thanks for the reply Sergey. Here is the report file:
> [CORRECTION/READS]
> --
> -- In gatekeeper store './lbv.gkpStore':
> -- Found 320296 reads.
> -- Found 1161644503 bases (1.93 times coverage).
> --
> -- Read length histogram (one '*' equals 1773.45 reads):
> -- 0 999 0
> -- 1000 1999 124142 **********************************************************************
> -- 2000 2999 78777 ********************************************
> -- 3000 3999 8554 ****
> -- 4000 4999 33540 ******************
> -- 5000 5999 22503 ************
> -- 6000 6999 15460 ********
> -- 7000 7999 10528 *****
> -- 8000 8999 7385 ****
> -- 9000 9999 5227 **
> -- 10000 10999 3745 **
> -- 11000 11999 2711 *
> -- 12000 12999 1951 *
> -- 13000 13999 1457
> -- 14000 14999 1182
> -- 15000 15999 802
> -- 16000 16999 591
> -- 17000 17999 464
> -- 18000 18999 310
> -- 19000 19999 239
> -- 20000 20999 179
> -- 21000 21999 129
> -- 22000 22999 101
> -- 23000 23999 80
> -- 24000 24999 55
> -- 25000 25999 42
> -- 26000 26999 36
> -- 27000 27999 27
> -- 28000 28999 19
> -- 29000 29999 11
> -- 30000 30999 12
> -- 31000 31999 4
> -- 32000 32999 7
> -- 33000 33999 6
> -- 34000 34999 2
> -- 35000 35999 3
> -- 36000 36999 3
> -- 37000 37999 3
> -- 38000 38999 0
> -- 39000 39999 0
> -- 40000 40999 1
> -- 41000 41999 0
> -- 42000 42999 2
> -- 43000 43999 0
> -- 44000 44999 0
> -- 45000 45999 0
> -- 46000 46999 2
> -- 47000 47999 0
> -- 48000 48999 0
> -- 49000 49999 0
> -- 50000 50999 0
> -- 51000 51999 1
> -- 52000 52999 0
> -- 53000 53999 0
> -- 54000 54999 0
> -- 55000 55999 2
> -- 56000 56999 0
> -- 57000 57999 1
>
> [CORRECTION/MERS]
> --
> -- 16-mers Fraction
> -- Occurrences NumMers Unique Total
> -- 1- 1 317747818 *******************************************************************--> 0.6362 0.2747
> -- 2- 2 89613818 ********************************************************************** 0.8156 0.4296
> -- 3- 4 54628932 ****************************************** 0.8886 0.5241
> -- 5- 7 20944101 **************** 0.9457 0.6316
> -- 8- 11 8052438 ****** 0.9729 0.7109
> -- 12- 16 3593734 ** 0.9851 0.7647
> -- 17- 22 1830377 * 0.9911 0.8026
> -- 23- 29 1014776 0.9943 0.8305
> -- 30- 37 599595 0.9962 0.8517
> -- 38- 46 377946 0.9973 0.8680
> -- 47- 56 250649 0.9980 0.8812
> -- 57- 67 172017 0.9985 0.8919
> -- 68- 79 122693 0.9988 0.9009
> -- 80- 92 90934 0.9990 0.9084
> -- 93- 106 70141 0.9992 0.9151
> -- 107- 121 56334 0.9994 0.9210
> -- 122- 137 45687 0.9995 0.9265
> -- 138- 154 37415 0.9996 0.9315
> -- 155- 172 29634 0.9996 0.9362
> -- 173- 191 24309 0.9997 0.9403
> -- 192- 211 19570 0.9997 0.9441
> -- 212- 232 16145 0.9998 0.9475
> -- 233- 254 13286 0.9998 0.9506
> -- 255- 277 11161 0.9998 0.9533
> -- 278- 301 9689 0.9999 0.9559
> -- 302- 326 7856 0.9999 0.9583
> -- 327- 352 7017 0.9999 0.9604
> -- 353- 379 5825 0.9999 0.9625
> -- 380- 407 5017 0.9999 0.9643
> -- 408- 436 4389 0.9999 0.9660
> -- 437- 466 3895 0.9999 0.9676
> -- 467- 497 3413 0.9999 0.9691
> -- 498- 529 3050 1.0000 0.9705
> -- 530- 562 2680 1.0000 0.9719
> -- 563- 596 2477 1.0000 0.9731
> -- 597- 631 2179 1.0000 0.9744
> -- 632- 667 1784 1.0000 0.9755
> -- 668- 704 1575 1.0000 0.9765
> -- 705- 742 1296 1.0000 0.9775
> -- 743- 781 1039 1.0000 0.9783
> -- 782- 821 838 1.0000 0.9789
> --
> -- 554147 (max occurrences)
> -- 839092245 (total mers, non-unique)
> -- 181687509 (distinct mers, non-unique)
> -- 317747818 (unique mers)
Canu.out:
> -- In 'lbv.gkpStore', found Nanopore reads:
> -- Raw: 320296
> -- Corrected: 0
> -- Trimmed: 0
> --
> -- Generating assembly 'lbv' in '/mnt/maximus/data1/genome_science/minION/lbv'
> --
> -- Parameters:
> --
> -- genomeSize 600000000
> --
> -- Overlap Generation Limits:
> -- corOvlErrorRate 0.3200 ( 32.00%)
> -- obtOvlErrorRate 0.1440 ( 14.40%)
> -- utgOvlErrorRate 0.1440 ( 14.40%)
> --
> -- Overlap Processing Limits:
> -- corErrorRate 0.5000 ( 50.00%)
> -- obtErrorRate 0.1440 ( 14.40%)
> -- utgErrorRate 0.1440 ( 14.40%)
> -- cnsErrorRate 0.1920 ( 19.20%)
> --
> --
> -- BEGIN CORRECTION
> --
> -- All 14 mhap precompute jobs finished successfully.
> --
> -- Running jobs. First attempt out of 2.
> --
> -- 'mhap.jobSubmit-01.sh' -> job 17576 tasks 1-50.
> --
> ----------------------------------------
> -- Starting command on Fri Apr 6 15:41:57 2018 with 22228.834 GB free disk space
>
> cd /mnt/maximus/data1/genome_science/minION/lbv
> sbatch \
> --depend=afterany:17576 \
> --mem-per-cpu=4g \
> --cpus-per-task=1 \
> -D `pwd` \
> -J 'canu_lbv' \
> -o canu-scripts/canu.03.out canu-scripts/canu.03.sh
> Submitted batch job 17577
>
> -- Finished on Fri Apr 6 15:41:57 2018 (lickety-split) with 22228.834 GB free disk space
It didn't report any error.
That looks like it's still running; it submitted itself as a job, and you should be able to see those jobs in your queue. So there's no error, it's still running. However, I don't expect very good results given 1x coverage; at best you'll assemble some repeats or maybe a mitochondrion.
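A quick sketch of how to check for canu's self-submitted jobs in the Slurm queue (the job name canu_lbv and the job IDs are taken from the log above; the format string is just one reasonable choice):

```shell
# List your own jobs; canu resubmits itself as canu_<prefix>
# (here 'canu_lbv'), so it should show up by name.
squeue -u "$USER" -o "%.10i %.20j %.8T %.10M"

# Or look for the specific job IDs canu printed (17576/17577 above):
squeue -j 17576,17577
```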
I don't think it is running. I don't have any job running in the queue.
Then there should be something in canu-scripts/canu.03.out and correction/1-overlapper/mhap.*.out.
I did a correction-only run, and now I got this:
-- Mhap overlap jobs failed, tried 2 times, giving up.
-- job correction/1-overlapper/results/000016.ovb FAILED.
-- job correction/1-overlapper/results/000017.ovb FAILED.
--
ABORT:
ABORT: Canu 1.7
ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
ABORT: Try restarting. If that doesn't work, ask for help.
ABORT:
What's the output in correction/1-overlapper/*16*out?
I don't have canu.03.out. canu.02.out is the same as canu.out. In correction/1-overlapper/*16*out, I found this:
slurmstepd-biomix29: error: Exceeded step memory limit at some point.
So maybe it's a memory problem? The memory I requested was 128000M.
How did you specify memory? Canu auto-requests the correct memory based on your grid, so if you're manually overriding that, it may not be correct.
We are using slurm system, I did this:
#SBATCH --ntasks=8 #cores or threads
#SBATCH --mem=128000
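For reference, a hedged sketch of an alternative launch: the #SBATCH header above only constrains the wrapper job that launches canu, while canu submits its own per-step jobs with its own memory requests. Extra scheduler flags can instead be passed through canu's gridOptions parameter (the partition name and time limit here are made-up examples, not from this thread):

```shell
# Sketch: let canu compute per-step memory itself, and pass any
# extra Slurm flags through gridOptions rather than an #SBATCH header.
canu -correct -d lbv -p lbv genomeSize=600m \
     gridOptions="--partition=general --time=48:00:00" \
     -nanopore-raw ./basecalled/workspace/pass/lbv.fastq.gz
```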
Yeah, canu did a scan:
-- CONFIGURE CANU
--
-- Detected Java(TM) Runtime Environment '1.8.0_141' (from 'java').
-- Detected gnuplot version '5.0 patchlevel 3' (from 'gnuplot') and image format 'png'.
-- Detected 72 CPUs and 504 gigabytes of memory.
-- Detected Slurm with 'sinfo' binary in /usr/local/bin/sinfo.
-- Detected Slurm with 'MaxArraySize' limited to 1000 jobs.
--
-- Found 1 host with 72 cores and 501 GB memory under Slurm control.
-- Found 1 host with 48 cores and 376 GB memory under Slurm control.
-- Found 1 host with 64 cores and 755 GB memory under Slurm control.
-- Found 8 hosts with 32 cores and 250 GB memory under Slurm control.
-- Found 1 host with 16 cores and 125 GB memory under Slurm control.
-- Found 1 host with 24 cores and 62 GB memory under Slurm control.
-- Found 3 hosts with 48 cores and 501 GB memory under Slurm control.
-- Found 1 host with 24 cores and 250 GB memory under Slurm control.
-- Found 1 host with 32 cores and 99 GB memory under Slurm control.
-- Found 2 hosts with 12 cores and 125 GB memory under Slurm control.
-- Found 1 host with 80 cores and 99 GB memory under Slurm control.
-- Found 2 hosts with 48 cores and 250 GB memory under Slurm control.
-- Found 1 host with 40 cores and 37 GB memory under Slurm control.
-- Found 1 host with 12 cores and 250 GB memory under Slurm control.
-- Found 1 host with 96 cores and 201 GB memory under Slurm control.
-- Found 1 host with 16 cores and 62 GB memory under Slurm control.
-- Found 1 host with 24 cores and 188 GB memory under Slurm control.
--
-- (tag)Threads
-- (tag)Memory |
-- (tag) | | algorithm
-- ------- ------ -------- -----------------------------
-- Grid: meryl 64 GB 16 CPUs (k-mer counting)
-- Grid: cormhap 16 GB 8 CPUs (overlap detection with mhap)
-- Grid: obtovl 16 GB 8 CPUs (overlap detection)
-- Grid: utgovl 16 GB 8 CPUs (overlap detection)
-- Grid: ovb 2 GB 1 CPU (overlap store bucketizer)
-- Grid: ovs 16 GB 1 CPU (overlap store sorting)
-- Grid: red 4 GB 4 CPUs (read error detection)
-- Grid: oea 4 GB 1 CPU (overlap error adjustment)
-- Grid: bat 256 GB 16 CPUs (contig construction)
-- Grid: gfa 16 GB 16 CPUs (GFA alignment and processing)
I'm not sure if the memory I specified would cause problems, but it seems the memory canu configured didn't exceed the memory I requested.
Can you upload the mhap.sh and mhap.jobSubmit-01.sh scripts? Those will have the explicit requests.
Also, exceeding the memory limit at some point doesn't mean it was killed. What is the full log, and what is the job history of that job (sacct -j <JOBID>)?
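A sketch of pulling the relevant job history fields with sacct (the job ID is the one canu printed above; the field list is one reasonable choice):

```shell
# MaxRSS vs. ReqMem shows how close the step came to its memory
# request; State and ExitCode show whether Slurm actually killed it.
sacct -j 17576 --format=JobID,JobName,State,ExitCode,Elapsed,ReqMem,MaxRSS
```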
Yes, you are right. I don't think it got killed. The job history says "COMPLETED". The full log is the same as canu.out.
Slurm sometimes doesn't report the cause of the exit. It might also be a timeout issue, since the majority of the jobs ran to completion and only jobs 16/17 had issues. I meant the full log of correction/1-overlapper/*16*out. You could try just re-launching Canu with more memory/time for the mhap step with gridOptionscormhap="--time=72:00:00 --mem=20g".
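A sketch of what that re-launch could look like, combining the original command from this thread with the suggested option (canu resumes from the same -d directory, so the correction work already done is reused):

```shell
# Re-run the same canu command from the same -d directory; canu
# picks up where it left off. gridOptionscormhap applies only to
# the mhap (cormhap) overlap jobs.
canu -correct -d lbv -p lbv genomeSize=600m \
     gridOptionscormhap="--time=72:00:00 --mem=20g" \
     -nanopore-raw ./basecalled/workspace/pass/lbv.fastq.gz
```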
I'll try it. Thank you so much for the help! I really appreciate it.
Will update after I get the results.
Hi Sergey, I added gridOptionscormhap="--time=72:00:00 --mem=20g" to my canu correction run.
Now I get "Overlap store sorting jobs failed, tried 2 times, giving up" and multiple *.ovlStore.BUILDING/* FAILED.
Probably I/O issues; there will again be more info in the individual log files (e.g. *.ovlStore.BUILDING/logs/*.out).
This issue has drifted quite a bit from the initial question so open a new issue if you want to fix the store error. I am not sure how much effort you want to invest in this assembly given 2x coverage.
Hello Sergey,
My canu run doesn't output a genome assembly; it only generated the correction folder. It seems to me that trim and assemble were aborted, but no error was thrown in the log file.
This is my command:
canu -d lbv -p lbv genomeSize=600m -nanopore-raw ./basecalled/workspace/pass/lbv.fastq.gz
I found in the log file that the coverage is only 1.93 times.
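That 1.93x figure follows directly from the numbers in lbv.report (1,161,644,503 bases) and the command (genomeSize=600m); a one-line check:

```shell
# Coverage = total bases / genome size, using the numbers from
# the report and the command line above.
awk 'BEGIN { printf "%.2f\n", 1161644503 / 600000000 }'
```

This prints 1.94, consistent with the ~1.93x figure in the report.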
Is this why canu didn't continue to create the genome assembly?
Thank you for your time.
Best, Zhu