marbl / canu

A single molecule sequence assembler for genomes large and small.
http://canu.readthedocs.io/

Not enough memory to load the minimum number of overlaps; increase -M. #1650

Closed: bioramg closed this issue 4 years ago

bioramg commented 4 years ago

Hi, I am trying to assemble the mitochondrial genome from whole-genome sequencing reads. The ONT nanopore read file is ~8.8 GB, but I don't know my mitochondrial genome size. You already suggested giving the whole genome size, but I don't know that either. How can I calculate it?

Also, I tried to run this assembly and got the error below. Could you please suggest a fix?

Thank you.

-- Running jobs.  Second attempt out of 2.
----------------------------------------
-- Starting 'bat' concurrent execution on Sat Mar 21 03:01:23 2020 with 512.498 GB free disk space (1 processes; 1 concurrently)

    cd unitigging/4-unitigger
    ./unitigger.sh 1 > ./unitigger.000001.out 2>&1

-- Finished on Sat Mar 21 03:01:23 2020 (fast as lightning) with 512.498 GB free disk space
----------------------------------------
--
-- Bogart failed, tried 2 times, giving up.
--

ABORT:
ABORT: Canu 1.9
ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
ABORT: Try restarting.  If that doesn't work, ask for help.
ABORT:
ABORT: Disk space available:  512.498 GB
ABORT:
ABORT: Last 50 lines of the relevant log file (unitigging/4-unitigger/unitigger.err):
ABORT:
ABORT:
ABORT:   Lengths:
ABORT:     Minimum read          0 bases
ABORT:     Minimum overlap       500 bases
ABORT:
ABORT:   Overlap Error Rates:
ABORT:     Graph                 0.120 (12.000%)
ABORT:     Max                   0.120 (12.000%)
ABORT:
ABORT:   Deviations:
ABORT:     Graph                 12.000
ABORT:     Bubble                12.000
ABORT:     Repeat                6.000
ABORT:
ABORT:   Edge Confusion:
ABORT:     Absolute              2100
ABORT:     Percent               200.0000
ABORT:
ABORT:   Unitig Construction:
ABORT:     Minimum intersection  500 bases
ABORT:     Maxiumum placements   2 positions
ABORT:
ABORT:   Debugging Enabled:
ABORT:     (none)
ABORT:
ABORT:   ==> LOADING AND FILTERING OVERLAPS.
ABORT:
ABORT:   ReadInfo()-- Using 816662 reads, no minimum read length used.
ABORT:
ABORT:   OverlapCache()-- limited to 16384MB memory (user supplied).
ABORT:
ABORT:   OverlapCache()--       6MB for read data.
ABORT:   OverlapCache()--      31MB for best edges.
ABORT:   OverlapCache()--      80MB for tigs.
ABORT:   OverlapCache()--      21MB for tigs - read layouts.
ABORT:   OverlapCache()--      31MB for tigs - error profiles.
ABORT:   OverlapCache()--    4096MB for tigs - error profile overlaps.
ABORT:   OverlapCache()--       0MB for other processes.
ABORT:   OverlapCache()-- ---------
ABORT:   OverlapCache()--    4282MB for data structures (sum of above).
ABORT:   OverlapCache()-- ---------
ABORT:   OverlapCache()--      15MB for overlap store structure.
ABORT:   OverlapCache()--   12085MB for overlap data.
ABORT:   OverlapCache()-- ---------
ABORT:   OverlapCache()--   16384MB allowed.
ABORT:   OverlapCache()--
ABORT:   OverlapCache()-- Retain at least 1010 overlaps/read, based on 505.19x coverage.
ABORT:   OverlapCache()-- Initial guess at 969 overlaps/read.
ABORT:   OverlapCache()--
ABORT:   OverlapCache()-- Not enough memory to load the minimum number of overlaps; increase -M.
ABORT:

skoren commented 4 years ago

You don't need an exact genome size; a guesstimate is fine.

The error is the same as #1645: you have really high coverage of your genome, which requires more memory for enough overlaps to be loaded. You can increase either batMemory or the genome size (make it 100mb). First remove the unitigging/4-unitigger folder, then re-run the canu command, adding either batMemory=128 (or whatever you have available) or genomeSize=100m.
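The memory shortfall can be checked with some back-of-the-envelope arithmetic on the numbers in the log above. This is a rough sketch, not Canu's actual accounting: the bytes-per-overlap figure is inferred from the log's "Initial guess" line, the "2 x coverage" rule is inferred from the "Retain at least" line, 1 MB is assumed to be 2^20 bytes, and the 1.2 Mb genome size is only the approximate value mentioned later in this thread.

```python
# Figures copied from the unitigger.err log in this thread.
READS = 816_662            # "Using 816662 reads"
OVERLAP_DATA_MB = 12_085   # "12085MB for overlap data"
INITIAL_GUESS = 969        # "Initial guess at 969 overlaps/read"
COVERAGE = 505.19          # "based on 505.19x coverage"
MIN_OVERLAPS = 1010        # "Retain at least 1010 overlaps/read"

MB = 1024 * 1024           # assumption: the log reports MB as 2**20 bytes


def bytes_per_overlap(overlap_mb, reads, overlaps_per_read):
    """Infer the storage cost of one overlap from the log's own budget."""
    return overlap_mb * MB / (reads * overlaps_per_read)


def overlap_memory_mb(reads, overlaps_per_read, per_overlap_bytes):
    """Memory needed to hold a given number of overlaps for every read."""
    return reads * overlaps_per_read * per_overlap_bytes / MB


def scaled_coverage(coverage, old_genome_size, new_genome_size):
    """Coverage is total bases / genome size, so it scales inversely."""
    return coverage * old_genome_size / new_genome_size


per_overlap = round(bytes_per_overlap(OVERLAP_DATA_MB, READS, INITIAL_GUESS))
needed_mb = overlap_memory_mb(READS, MIN_OVERLAPS, per_overlap)

print(f"inferred bytes per overlap: {per_overlap}")
print(f"needed for {MIN_OVERLAPS} overlaps/read: {needed_mb:.0f} MB")
print(f"available for overlap data: {OVERLAP_DATA_MB} MB")

# Hypothetical: the same reads judged against genomeSize=100m instead of ~1.2 Mb.
new_cov = scaled_coverage(COVERAGE, 1.2e6, 100e6)
print(f"coverage at 100 Mb: {new_cov:.2f}x, minimum overlaps/read: {int(2 * new_cov)}")
```

Under these assumptions, the minimum of 1010 overlaps per read needs roughly 500 MB more than the 12085 MB left for overlap data, which matches the error. Raising batMemory enlarges the budget, while raising genomeSize lowers the computed coverage and with it the minimum number of overlaps that must be loaded.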

bioramg commented 4 years ago

Thanks for your valuable suggestions. I will try again and let you know. Thank you once again.

bioramg commented 4 years ago

Hi. As per your suggestion, I deleted the unitigging folder, because I could not find a 4-unitigger folder.

I ran the following command: bin$ ./canu -p ONT -d ONT genomeSize=100m batMemory=188 -nanopore-raw ONT.fastq.gz

Is batMemory=188 correct? (That is the available RAM.)

But I got an error again:

----------------------------------------
-- Starting command on Sat Mar 21 13:01:28 2020 with 507.797 GB free disk space

    cd correction
    /home/pmslab/Desktop/Raman/bin/bin/canu-1.9/Linux-amd64/bin/ovStoreConfig \
     -S ../Convallaria_ONT.seqStore \
     -M 4-8 \
     -L ./1-overlapper/ovljob.files \
     -create ./Convallaria_ONT.ovlStore.config \
     > ./Convallaria_ONT.ovlStore.config.txt \
    2> ./Convallaria_ONT.ovlStore.config.err

-- Finished on Sat Mar 21 13:01:31 2020 (3 seconds) with 507.796 GB free disk space
----------------------------------------

ERROR:
ERROR:  Failed with exit code 139.  (rc=35584)
ERROR:

ABORT:
ABORT: Canu 1.9
ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
ABORT: Try restarting.  If that doesn't work, ask for help.
ABORT:
ABORT:   failed to configure the overlap store.
ABORT:
ABORT: Disk space available:  507.796 GB
ABORT:
ABORT: Last 50 lines of the relevant log file (correction/Convallaria_ONT.ovlStore.config.err):
ABORT:
ABORT:
ABORT:   Finding number of overlaps per read and per file.
ABORT:
ABORT:      Moverlaps
ABORT:   ------------ ----------------------------------------
ABORT:
ABORT:   Failed with 'Segmentation fault'; backtrace (libbacktrace):
ABORT:   utility/system-stackTrace.C::89 in _Z17AS_UTL_catchCrashiP7siginfoPv()
ABORT:   (null)::0 in (null)()
ABORT:   stores/ovStoreFile.H::191 in _ZN9ovFileOCR11numOverlapsEj()
ABORT:   stores/ovStoreConfig.C::72 in _ZN13ovStoreConfig19assignReadsToSlicesEP7sqStoremm()
ABORT:   stores/ovStoreConfig.C::441 in main()
ABORT:   (null)::0 in (null)()
ABORT:   (null)::0 in (null)()
ABORT:   Segmentation fault (core dumped)
ABORT:
skoren commented 4 years ago

That's a very different error, and it comes from a step that shouldn't need to re-run if you had removed only the unitigging/4-unitigger folder. The files you removed corrupted the run, so I'd suggest removing the full folder and re-running from scratch with the command:

canu -p ONT -d ONT genomeSize=2m batMemory=188 -nanopore-raw ONT.fastq.gz

bioramg commented 4 years ago

Thank you for your suggestion. But isn't a genome size of 2mb very small? Is it enough? The ONT reads contain the chloroplast, mitochondrial, and nuclear genomes, and I need to assemble the mitochondrial genome only. Please advise.

Thank you.

skoren commented 4 years ago

Well, as I said in your previous issue, there is no way to assemble only the mitochondrial genome if you're providing whole-genome data. You'd have to either recruit the mitochondrial reads somehow or run using the full size of your genome.

My command was just reproducing your previous attempt, where you said you were using an approximately 1.2mb genome size. I'd certainly say run with a large genome size, but it will take longer to run.
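One possible way to "recruit" mitochondrial reads, offered only as a sketch and not as a method endorsed in this thread, is to map the reads against a related mitochondrial reference (for example with minimap2, which writes PAF output) and keep only the reads that align. The helper names and the `min_aligned` threshold below are hypothetical, and the code assumes plain 4-line FASTQ records.

```python
def mapped_read_names(paf_lines, min_aligned=1000):
    """Collect query names from PAF records with enough aligned bases.

    min_aligned is an arbitrary illustrative threshold, not a recommendation.
    """
    names = set()
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip malformed or truncated records
        # PAF column 1 is the query name; column 11 is the alignment block length.
        if int(fields[10]) >= min_aligned:
            names.add(fields[0])
    return names


def filter_fastq(fastq_lines, keep):
    """Yield the lines of 4-line FASTQ records whose read ID is in `keep`."""
    record = []
    for line in fastq_lines:
        record.append(line)
        if len(record) == 4:
            # The read ID is the header up to the first whitespace, minus '@'.
            read_id = record[0][1:].split()[0]
            if read_id in keep:
                yield from record
            record = []


# Tiny in-memory example with made-up reads and alignments.
paf = [
    "r1\t5000\t0\t4000\t+\tmito\t16000\t0\t4000\t3500\t4000\t60\n",
    "r2\t3000\t0\t200\t+\tmito\t16000\t0\t200\t180\t200\t10\n",
]
fastq = [
    "@r1 desc\n", "ACGT\n", "+\n", "IIII\n",
    "@r2\n", "GGCC\n", "+\n", "IIII\n",
]
recruited = list(filter_fastq(fastq, mapped_read_names(paf)))
print("".join(recruited), end="")
```

The recruited reads could then be written to a new FASTQ file and assembled with a genomeSize guess appropriate for an organelle rather than the whole genome.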

bioramg commented 4 years ago

Ok thank you.