marbl / canu

A single molecule sequence assembler for genomes large and small.
http://canu.readthedocs.io/

disk space problem #1363

Closed apoosakkannu closed 5 years ago

apoosakkannu commented 5 years ago

Hi, I am new to metagenomic analysis and am using Canu for the first time. I am working on a department server. I have attached the command and the error in a file. As I am new to this, it would be nice to get some detailed information on how to overcome this problem.

Thanks in advance. CANU_error_1605209.pdf

skoren commented 5 years ago

Not much to add on top of the error message: you ran out of disk space quite early in the assembly process. You have about 12 GB of data and less than 100 GB of disk to assemble it in. That isn't going to be enough, so you'll need to find a server or partition with more space. The FAQ also has some suggestions for reducing disk usage, but they will probably hurt the continuity of your metagenomic assembly.

apoosakkannu commented 5 years ago

Thanks for your kind reply. Could you give me an idea of how much disk space I need for 12 GB of data?

skoren commented 5 years ago

It's hard to predict for a metagenome, since the disk space needed depends on the metagenome's complexity. A well-behaved human genome assembly takes on the order of 2 TB, so I'd start with that and see how it does.
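As a rough pre-flight check before launching a run, the free space on the target filesystem can be compared against the 2 TB starting figure mentioned above. This is a minimal Python sketch and not part of Canu; the helper name `has_enough_space`, the decimal definition of a terabyte, and the work directory are illustrative assumptions.

```python
import shutil

TB = 10**12  # assumption: decimal terabytes; adjust if your site reports TiB


def has_enough_space(path, required_bytes):
    """Return True if the filesystem holding `path` has at least
    `required_bytes` free for the current user."""
    return shutil.disk_usage(path).free >= required_bytes


if __name__ == "__main__":
    work_dir = "."  # hypothetical: replace with the directory Canu will write to
    target = 2 * TB  # the starting estimate suggested in this thread
    free_tb = shutil.disk_usage(work_dir).free / TB
    verdict = "ok" if has_enough_space(work_dir, target) else "too small"
    print(f"{work_dir}: {free_tb:.2f} TB free, target 2 TB: {verdict}")
```

The check only looks at free space at one moment. On a shared server other jobs can fill the same partition while the assembly runs, so it is a sanity check rather than a guarantee.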

apoosakkannu commented 5 years ago

Hi, I got the disk space problem again. I am not sure what to do in this case. I have attached the error file (canuerror_08062019.docx) for your reference. Could you give me some idea of how to overcome this problem?

skoren commented 5 years ago

I'm not sure whether this is disk space or some other issue with your system; either way, it seems to be coming from outside Canu. The last free space reported by the run was about 60 TB, so I doubt all of that got filled up. Somewhere in the middle of the run, your system reported:

Key has expired

I've personally never seen this error, but from searching for it online it seems to be a mount point error: your filesystem disconnected you, so you had no space to write to. You'll have to work out why this happened with your system support.

You can confirm how much space Canu is actually using by running du on the assembly folder (du --max-depth 1 -h /auto/plzen1/home/apoosakkannu/polyplax_results_canu). I expect it will be much less than the available space. I'd also suggest removing everything in /auto/plzen1/home/apoosakkannu/polyplax_results_canu/correction/1-overlapper/results/*, since I don't trust that the intermediate output there was left uncorrupted by the disk failure.
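If du is unavailable, or the numbers need to be collected from a script, the same per-subdirectory breakdown can be approximated in Python. This is a sketch under stated assumptions: the function names are hypothetical, and it sums apparent file sizes rather than allocated disk blocks, so its totals can differ somewhat from what du reports.

```python
import os


def dir_size(path):
    """Sum the apparent sizes (in bytes) of all files below `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                # lstat: count a symlink itself, not the file it points to
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


def usage_by_child(path):
    """Map each immediate child of `path` to its total size in bytes,
    roughly what `du --max-depth 1` lists."""
    sizes = {}
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes[entry.name] = dir_size(entry.path)
            else:
                sizes[entry.name] = entry.stat(follow_symlinks=False).st_size
    return sizes


if __name__ == "__main__":
    assembly_dir = "."  # hypothetical: point this at the Canu output folder
    for name, size in sorted(usage_by_child(assembly_dir).items(),
                             key=lambda kv: kv[1], reverse=True):
        print(f"{size / 1e9:10.2f} GB  {name}")
```

Sorting by size puts the largest stage directory first, which makes it easy to see which part of the run is consuming the space.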

apoosakkannu commented 5 years ago

I got the following error:

-- /afs/ics.muni.cz/software/canu/1.8+git/bin/sqStoreCreate: /usr/local/lib64/libstdc++.so.6: version `CXXABI_1.3.9' not found (required by /afs/ics.muni.cz/software/canu/1.8+git/bin/sqStoreCreate)
--
-- CITATIONS
--
-- Koren S, Walenz BP, Berlin K, Miller JR, Phillippy AM.
-- Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation.
-- Genome Res. 2017 May;27(5):722-736.
-- http://doi.org/10.1101/gr.215087.116
-- 
-- Koren S, Rhie A, Walenz BP, Dilthey AT, Bickhart DM, Kingan SB, Hiendleder S, Williams JL, Smith TPL, Phillippy AM.
-- De novo assembly of haplotype-resolved genomes with trio binning.
-- Nat Biotechnol. 2018
-- https//doi.org/10.1038/nbt.4277
-- 
-- Read and contig alignments during correction, consensus and GFA building use:
--   Šošic M, Šikic M.
--   Edlib: a C/C ++ library for fast, exact sequence alignment using edit distance.
--   Bioinformatics. 2017 May 1;33(9):1394-1395.
--   http://doi.org/10.1093/bioinformatics/btw753
-- 
-- Overlaps are generated using:
--   Berlin K, et al.
--   Assembling large genomes with single-molecule sequencing and locality-sensitive hashing.
--   Nat Biotechnol. 2015 Jun;33(6):623-30.
--   http://doi.org/10.1038/nbt.3238
-- 
--   Myers EW, et al.
--   A Whole-Genome Assembly of Drosophila.
--   Science. 2000 Mar 24;287(5461):2196-204.
--   http://doi.org/10.1126/science.287.5461.2196
-- 
-- Corrected read consensus sequences are generated using an algorithm derived from FALCON-sense:
--   Chin CS, et al.
--   Phased diploid genome assembly with single-molecule real-time sequencing.
--   Nat Methods. 2016 Dec;13(12):1050-1054.
--   http://doi.org/10.1038/nmeth.4035
-- 
-- Contig consensus sequences are generated using an algorithm derived from pbdagcon:
--   Chin CS, et al.
--   Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.
--   Nat Methods. 2013 Jun;10(6):563-9
--   http://doi.org/10.1038/nmeth.2474
-- 
-- CONFIGURE CANU
--
-- Detected Java(TM) Runtime Environment '1.8.0_60' (from '/packages/run/jdk-8/current/bin/java') with -d64 support.
-- Detected gnuplot version '4.6 patchlevel 4   ' (from 'gnuplot') and image format 'png'.
-- Detected 276 CPUs and 5614 gigabytes of memory.
-- Detected PBSPro '19.0.0' with 'pbsnodes' binary in /opt/pbs/bin/pbsnodes.
-- Grid engine disabled per useGrid=false option.
--
--                            (tag)Concurrency
--                     (tag)Threads          |
--            (tag)Memory         |          |
--        (tag)         |         |          |     total usage     algorithm
--        -------  ------  --------   --------  -----------------  -----------------------------
-- Local: meryl     24 GB    6 CPUs x  46 jobs  1104 GB  276 CPUs  (k-mer counting)
-- Local: hap       12 GB   23 CPUs x  12 jobs   144 GB  276 CPUs  (read-to-haplotype assignment)
-- Local: cormhap   13 GB   12 CPUs x  23 jobs   299 GB  276 CPUs  (overlap detection with mhap)
-- Local: obtovl     8 GB    6 CPUs x  46 jobs   368 GB  276 CPUs  (overlap detection)
-- Local: utgovl     8 GB    6 CPUs x  46 jobs   368 GB  276 CPUs  (overlap detection)
-- Local: cor       16 GB    4 CPUs x  69 jobs  1104 GB  276 CPUs  (read correction)
-- Local: ovb        4 GB    1 CPU  x 276 jobs  1104 GB  276 CPUs  (overlap store bucketizer)
-- Local: ovs        8 GB    1 CPU  x 276 jobs  2208 GB  276 CPUs  (overlap store sorting)
-- Local: red       10 GB    6 CPUs x  46 jobs   460 GB  276 CPUs  (read error detection)
-- Local: oea        4 GB    1 CPU  x 276 jobs  1104 GB  276 CPUs  (overlap error adjustment)
-- Local: bat       64 GB    8 CPUs x   1 job     64 GB    8 CPUs  (contig construction with bogart)
-- Local: cns      --- GB    8 CPUs x --- jobs   --- GB  --- CPUs  (consensus)
-- Local: gfa       16 GB    8 CPUs x   1 job     16 GB    8 CPUs  (GFA alignment and processing)
--
-- Found Nanopore uncorrected reads in the input files.
--
-- Generating assembly 'assembly' in '/scratch/apoosakkannu/job_2774628.wagap-pro.cerit-sc.cz/results'
--
-- Parameters:
--
--  genomeSize        110000000
--
--  Overlap Generation Limits:
--    corOvlErrorRate 0.3200 ( 32.00%)
--    obtOvlErrorRate 0.1200 ( 12.00%)
--    utgOvlErrorRate 0.1200 ( 12.00%)
--
--  Overlap Processing Limits:
--    corErrorRate    0.5000 ( 50.00%)
--    obtErrorRate    0.1200 ( 12.00%)
--    utgErrorRate    0.1200 ( 12.00%)
--    cnsErrorRate    0.2000 ( 20.00%)
--
--
-- BEGIN CORRECTION
--
----------------------------------------
-- Starting command on Thu Jun 13 10:53:12 2019 with 41921.916 GB free disk space

    cd .
    /afs/ics.muni.cz/software/canu/1.8+git/bin/sqStoreCreate \
      -o ./assembly.seqStore.BUILDING \
      -minlength 1000 \
      ./assembly.seqStore.ssi \
    > ./assembly.seqStore.BUILDING.err 2>&1

-- Finished on Thu Jun 13 10:53:12 2019 (lickety-split) with 41921.916 GB free disk space
----------------------------------------

ERROR:
ERROR:  Failed with exit code 1.  (rc=256)
ERROR:

ABORT:
ABORT: /afs/ics.muni.cz/software/canu/1.8+git/bin/sqStoreCreate: /usr/local/lib64/libstdc++.so.6: version `CXXABI_1.3.9' not found (required by /afs/ics.muni.cz/software/canu/1.8+git/bin/sqStoreCreate)
ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
ABORT: Try restarting.  If that doesn't work, ask for help.
ABORT:
ABORT:   sqStoreCreate failed.
ABORT:
ABORT: Disk space available:  41921.916 GB
ABORT:
ABORT: Last 50 lines of the relevant log file (./assembly.seqStore.BUILDING.err):
ABORT:
ABORT:   /afs/ics.muni.cz/software/canu/1.8+git/bin/sqStoreCreate: /usr/local/lib64/libstdc++.so.6: version `CXXABI_1.3.9' not found (required by /afs/ics.muni.cz/software/canu/1.8+git/bin/sqStoreCreate)
ABORT:

What could be done to get rid of it?

skoren commented 5 years ago

This is an issue with how you compiled Canu: the environment on the machine where you compiled it doesn't match the environment where you're trying to run it. Either use the released binary, which won't have this issue, or make sure the two environments match.
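One way to see the mismatch is to list which CXXABI version tags the libstdc++ on the run machine actually contains and check whether CXXABI_1.3.9 is among them. The sketch below is a heuristic that scans the library file for version strings, much like piping `strings` into `grep`; the function names are hypothetical, and the library path is the one quoted in the error message above.

```python
import os
import re


def cxxabi_versions(lib_path):
    """Return the numeric CXXABI_x.y.z tags found as byte strings in a library file."""
    with open(lib_path, "rb") as fh:
        data = fh.read()
    found = set(re.findall(rb"CXXABI_[0-9]+(?:\.[0-9]+)*", data))
    # Lexicographic order, which is good enough for eyeballing the list
    return sorted(tag.decode("ascii") for tag in found)


def provides(lib_path, tag):
    """True if `tag` (e.g. 'CXXABI_1.3.9') appears in the library file."""
    return tag in cxxabi_versions(lib_path)


if __name__ == "__main__":
    lib = "/usr/local/lib64/libstdc++.so.6"  # path taken from the error message
    needed = "CXXABI_1.3.9"
    if os.path.exists(lib):
        print("available:", ", ".join(cxxabi_versions(lib)))
        print(needed, "present" if provides(lib, needed) else "MISSING")
    else:
        print(f"{lib} not found on this machine")
```

If the tag is missing, the library on the run machine is older than the one Canu was built against, which is consistent with the explanation above.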