marbl / canu

A single molecule sequence assembler for genomes large and small.
http://canu.readthedocs.io/

Mhap precompute jobs failed, tried 2 times, giving up. #2253

Closed Layla-mohd closed 1 year ago

Layla-mohd commented 1 year ago

Hello,

I have finished one assembly on a Linux system, but I am running into a problem with another assembly.

I am using Canu 2.2.

I used the following command:

canu -d /data/Dugong_Project/blasto_ont4/DG14_Analysis/canu -p canu_1_DG14 genomeSize=1.8k minOverlapLength=1000 corOutCoverage=1000000 -nanopore-raw /data/Dugong_Project/blasto_ont4/DG15_Analysis/DG14_filtered.fastq.gz

The error is:

-- canu 2.2
--
-- CITATIONS
--
-- For 'standard' assemblies of PacBio or Nanopore reads:
--   Koren S, Walenz BP, Berlin K, Miller JR, Phillippy AM.
--   Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation.
--   Genome Res. 2017 May;27(5):722-736.
--   http://doi.org/10.1101/gr.215087.116
-- 
-- Read and contig alignments during correction and consensus use:
--   Šošic M, Šikic M.
--   Edlib: a C/C ++ library for fast, exact sequence alignment using edit distance.
--   Bioinformatics. 2017 May 1;33(9):1394-1395.
--   http://doi.org/10.1093/bioinformatics/btw753
-- 
-- Overlaps are generated using:
--   Berlin K, et al.
--   Assembling large genomes with single-molecule sequencing and locality-sensitive hashing.
--   Nat Biotechnol. 2015 Jun;33(6):623-30.
--   http://doi.org/10.1038/nbt.3238
-- 
--   Myers EW, et al.
--   A Whole-Genome Assembly of Drosophila.
--   Science. 2000 Mar 24;287(5461):2196-204.
--   http://doi.org/10.1126/science.287.5461.2196
-- 
-- Corrected read consensus sequences are generated using an algorithm derived from FALCON-sense:
--   Chin CS, et al.
--   Phased diploid genome assembly with single-molecule real-time sequencing.
--   Nat Methods. 2016 Dec;13(12):1050-1054.
--   http://doi.org/10.1038/nmeth.4035
-- 
-- Contig consensus sequences are generated using an algorithm derived from pbdagcon:
--   Chin CS, et al.
--   Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.
--   Nat Methods. 2013 Jun;10(6):563-9
--   http://doi.org/10.1038/nmeth.2474
-- 
-- CONFIGURE CANU
--
-- Detected Java(TM) Runtime Environment '11.0.15-internal' (from 'java') without -d64 support.
-- Detected gnuplot version '5.4 patchlevel 8   ' (from 'gnuplot') and image format 'png'.
--
-- Detected 8 CPUs and 63 gigabytes of memory on the local machine.
--
-- Local machine mode enabled; grid support not detected or not allowed.
--
--                                (tag)Concurrency
--                         (tag)Threads          |
--                (tag)Memory         |          |
--        (tag)             |         |          |       total usage      algorithm
--        -------  ----------  --------   --------  --------------------  -----------------------------
-- Local: meryl     12.000 GB    4 CPUs x   2 jobs    24.000 GB   8 CPUs  (k-mer counting)
-- Local: hap        8.000 GB    4 CPUs x   2 jobs    16.000 GB   8 CPUs  (read-to-haplotype assignment)
-- Local: cormhap    6.000 GB    8 CPUs x   1 job      6.000 GB   8 CPUs  (overlap detection with mhap)
-- Local: obtovl     4.000 GB    8 CPUs x   1 job      4.000 GB   8 CPUs  (overlap detection)
-- Local: utgovl     4.000 GB    8 CPUs x   1 job      4.000 GB   8 CPUs  (overlap detection)
-- Local: cor        -.--- GB    4 CPUs x   - jobs     -.--- GB   - CPUs  (read correction)
-- Local: ovb        4.000 GB    1 CPU  x   8 jobs    32.000 GB   8 CPUs  (overlap store bucketizer)
-- Local: ovs        8.000 GB    1 CPU  x   7 jobs    56.000 GB   7 CPUs  (overlap store sorting)
-- Local: red       16.000 GB    4 CPUs x   2 jobs    32.000 GB   8 CPUs  (read error detection)
-- Local: oea        8.000 GB    1 CPU  x   7 jobs    56.000 GB   7 CPUs  (overlap error adjustment)
-- Local: bat       16.000 GB    4 CPUs x   1 job     16.000 GB   4 CPUs  (contig construction with bogart)
-- Local: cns        -.--- GB    4 CPUs x   - jobs     -.--- GB   - CPUs  (consensus)
--
-- Found Nanopore reads in 'canu_1_DG14.seqStore':
--   Libraries:
--     Nanopore:              1
--   Reads:
--     Raw:                   360438
--
--
-- Generating assembly 'canu_1_DG14' in '/data/Dugong_Project/blasto_ont4/DG14_Analysis/Layla/canu':
--   genomeSize:
--     1800
--
--   Overlap Generation Limits:
--     corOvlErrorRate 0.3200 ( 32.00%)
--     obtOvlErrorRate 0.1200 ( 12.00%)
--     utgOvlErrorRate 0.1200 ( 12.00%)
--
--   Overlap Processing Limits:
--     corErrorRate    0.3000 ( 30.00%)
--     obtErrorRate    0.1200 ( 12.00%)
--     utgErrorRate    0.1200 ( 12.00%)
--     cnsErrorRate    0.2000 ( 20.00%)
--
--   Stages to run:
--     correct raw reads.
--     trim corrected reads.
--     assemble corrected and trimmed reads.
--
--
-- BEGIN CORRECTION
--
-- OVERLAPPER (mhap) (correction) complete, not rewriting scripts.
--
--
-- Running jobs.  First attempt out of 2.
----------------------------------------
-- Starting 'cormhap' concurrent execution on Sun Aug 20 10:23:16 2023 with 468.767 GB free disk space (5 processes; 1 concurrently)

    cd correction/1-overlapper
    ./precompute.sh 6 > ./precompute.000006.out 2>&1
    ./precompute.sh 36 > ./precompute.000036.out 2>&1
    ./precompute.sh 51 > ./precompute.000051.out 2>&1
    ./precompute.sh 80 > ./precompute.000080.out 2>&1
    ./precompute.sh 87 > ./precompute.000087.out 2>&1

-- Finished on Sun Aug 20 10:23:17 2023 (one second) with 468.767 GB free disk space
----------------------------------------
--
-- Mhap precompute jobs failed, retry.
--   job correction/1-overlapper/blocks/000006.dat FAILED.
--   job correction/1-overlapper/blocks/000036.dat FAILED.
--   job correction/1-overlapper/blocks/000051.dat FAILED.
--   job correction/1-overlapper/blocks/000080.dat FAILED.
--   job correction/1-overlapper/blocks/000087.dat FAILED.
--
--
-- Running jobs.  Second attempt out of 2.
----------------------------------------
-- Starting 'cormhap' concurrent execution on Sun Aug 20 10:23:17 2023 with 468.767 GB free disk space (5 processes; 1 concurrently)

    cd correction/1-overlapper
    ./precompute.sh 6 > ./precompute.000006.out 2>&1
    ./precompute.sh 36 > ./precompute.000036.out 2>&1
    ./precompute.sh 51 > ./precompute.000051.out 2>&1
    ./precompute.sh 80 > ./precompute.000080.out 2>&1
    ./precompute.sh 87 > ./precompute.000087.out 2>&1

-- Finished on Sun Aug 20 10:23:18 2023 (one second) with 468.767 GB free disk space
----------------------------------------
--
-- Mhap precompute jobs failed, tried 2 times, giving up.
--   job correction/1-overlapper/blocks/000006.dat FAILED.
--   job correction/1-overlapper/blocks/000036.dat FAILED.
--   job correction/1-overlapper/blocks/000051.dat FAILED.
--   job correction/1-overlapper/blocks/000080.dat FAILED.
--   job correction/1-overlapper/blocks/000087.dat FAILED.
--

ABORT:
ABORT: canu 2.2
ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
ABORT: Try restarting.  If that doesn't work, ask for help.
ABORT:

Could you please help with this problem? Thank you in advance.

skoren commented 1 year ago

Given that most jobs succeeded and only a few failed, I suspect a disk space problem or something similar. What are the contents of the output file for one of the failed jobs (e.g. correction/1-overlapper/precompute.000006.out)?
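As a rough sketch, the output of all five failed jobs can be collected in one pass. The helper name show_failed_logs is hypothetical; the job IDs and the precompute.NNNNNN.out naming are taken from the log above, and the script is assumed to be run from the assembly directory.

```shell
#!/bin/sh
# Hypothetical helper: print the tail of each failed precompute log.
# Usage: show_failed_logs <overlapper-dir> <job-id>...
show_failed_logs() {
    dir=$1
    shift
    for id in "$@"; do
        # Job IDs are zero-padded to six digits in the log names shown above.
        log=$(printf '%s/precompute.%06d.out' "$dir" "$id")
        echo "== $log"
        if [ -f "$log" ]; then
            tail -n 5 "$log"
        else
            echo "(log not found)"
        fi
    done
}

# Only runs when started from the assembly directory; adjust the path if yours differs.
if [ -d correction/1-overlapper ]; then
    show_failed_logs correction/1-overlapper 6 36 51 80 87
    # Free space on the filesystem that holds the assembly.
    df -h .
fi
```

If every log ends with the same message, the failure is probably systematic rather than tied to one block of reads.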

Layla-mohd commented 1 year ago

Hi Sergey, the contents of correction/1-overlapper/precompute.000006.out are below:

Found perl:
   /home/grid/miniconda3/bin/perl
   This is perl 5, version 32, subversion 1 (v5.32.1) built for x86_64-linux-thread-multi

Found java:
   /home/grid/miniconda3/bin/java
   openjdk version "11.0.15-internal" 2022-04-19

Found canu:
   /home/grid/miniconda3/bin/canu
   canu 2.2

Running job 6 based on command line options.
/data/Dugong_Project/blasto_ont4/DG14_Analysis/Layla/canu/correction/1-overlapper
Opened seqStore '../../canu_1_DG14.seqStore' for 'raw' reads.
Dumping raw reads from 81001 to 97200 (inclusive).
Failed to extract fasta.

Thanks.

skoren commented 1 year ago

I've seen this error before but have not been able to reproduce it locally. Usually it is caused by insufficient space or some other disk issue. What message do you get if you manually run the extract command from the correction/1-overlapper/precompute.000006.sh file, something like:

sqStoreDumpFASTQ -S correction/canu_1_DG14.seqStore -r 81001-97200 -nolibname -noreadname -fasta -o tmp.input

Layla-mohd commented 1 year ago

I ran the command as you advised and got this:

sqStore()--  failed to open '/data/Dugong_Project/blasto_ont4/DG14_Analysis/Layla/canu/correction/canu_1_DG14.seqStore' for read-only access: store doesn't exist.

However, when I ran it with a different path to the seqStore, I got this message:

sqStoreDumpFASTQ -S /data/Dugong_Project/blasto_ont4/DG14_Analysis/Layla/canu/canu_1_DG14.seqStore -r 81001-97200 -nolibname -noreadname -fasta -o tmp.input

Opened seqStore '/data/Dugong_Project/blasto_ont4/DG14_Analysis/Layla/canu/canu_1_DG14.seqStore' for 'raw' reads.
Dumping raw reads from 81001 to 97200 (inclusive).

skoren commented 1 year ago

The fact that the dump works by hand but fails as part of the run suggests some kind of intermittent filesystem issue. What happens if you run the precompute.sh script by hand from inside the correction/1-overlapper folder (sh precompute.sh 6)?
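A minimal sketch of one way to do that while also capturing a command trace and the script's exit status. The helper name run_job and the manual.N.out log name are hypothetical, and sh -x is an added suggestion for seeing which step fails; it is not something precompute.sh requires.

```shell
#!/bin/sh
# Hypothetical helper: run one job script by hand and report its exit status.
# Usage: run_job <dir> <script> <job-id>
run_job() {
    (
        cd "$1" || exit 127
        # -x prints each command before it runs, which shows the failing step.
        sh -x "./$2" "$3" > "manual.$3.out" 2>&1
    )
    status=$?
    echo "exit status: $status"
    return "$status"
}

# Only runs when started from the assembly directory.
if [ -f correction/1-overlapper/precompute.sh ]; then
    run_job correction/1-overlapper precompute.sh 6
    tail -n 20 correction/1-overlapper/manual.6.out
fi
```

The last traced command before the "Failed to extract fasta." message should show what the script actually ran and with which paths.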

Layla-mohd commented 1 year ago

Hi, I received the following message:

Running job 6 based on command line options.
/data/Dugong_Project/blasto_ont4/DG14_Analysis/Layla/canu/correction/1-overlapper
Opened seqStore '../../canu_1_DG14.seqStore' for 'raw' reads.
Dumping raw reads from 81001 to 97200 (inclusive).
Failed to extract fasta.

skoren commented 1 year ago

It's strange that the script runs the dump command without reporting an error and then fails. Can you rerun the dump you ran by hand and then run echo $? immediately afterwards? That reports the exit status of the dump, so we can see whether it is returning a failure. Also, check whether the dumped fasta output is valid.
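Both checks can be sketched roughly as follows. The helper name check_fasta is hypothetical, the seqStore path assumes the command is run from the assembly directory, and the output name tmp.input.fasta is an assumption about how the -o prefix is expanded, so adjust it to whatever file the dump actually writes.

```shell
#!/bin/sh
# Hypothetical helper: sanity-check a FASTA file and print its record count.
# Returns non-zero if the file is missing, empty, or does not start with '>'.
check_fasta() {
    [ -s "$1" ] || { echo "missing or empty: $1" >&2; return 1; }
    first=$(head -n 1 "$1" | cut -c 1)
    [ "$first" = ">" ] || { echo "no FASTA header at start: $1" >&2; return 1; }
    # Count header lines, i.e. the number of sequences in the file.
    grep -c '^>' "$1"
}

# Only runs where Canu's tools are on the PATH.
if command -v sqStoreDumpFASTQ >/dev/null 2>&1; then
    sqStoreDumpFASTQ -S canu_1_DG14.seqStore -r 81001-97200 \
        -nolibname -noreadname -fasta -o tmp.input
    echo "dump exit status: $?"
    # Assumption: the dump writes <prefix>.fasta; change the name if it does not.
    check_fasta tmp.input.fasta
fi
```

The requested range 81001-97200 spans 16,200 reads, so if every read in the range is emitted, the record count should match that number.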

skoren commented 1 year ago

Idle