Whoops. It's an easy fix, but do NOT update your code from git; the on-disk data structures have changed since you started this assembly.
Instead, edit src/bogart/AS_BAT_BestOverlapGraph.C and delete line 416, the middle line (the writeLog(...) call) in the block below:
  if (fi < nc) {                               //  If we're smaller, we're a
#pragma omp critical (suspInsert)              //  Zombie Master!
    writeLog("read %u is a zombie.\n", fi);
    _zombie.insert(fi);
  }
Recompile and then restart canu.
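For what it's worth, a minimal sketch of applying that one-line deletion and rebuilding; $CANU_SRC is a placeholder for your Canu source tree, and the build step assumes the usual 'cd src && make' procedure for Canu 1.x snapshots. Check that the sed pattern matches only the writeLog line before running:

  # Delete the writeLog(...) call, the only line containing 'is a zombie',
  # then rebuild the binaries in place.
  sed -i '/is a zombie/d' "$CANU_SRC"/src/bogart/AS_BAT_BestOverlapGraph.C
  cd "$CANU_SRC"/src && make -j 16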
Thank you for your fast answer. However, canu fails again.
cat unitigging/4-unitigger/unitigger.err
now outputs:
==> PARAMETERS.

Resources:
  Memory                  189 GB
  Compute Threads         16 (command line)

Lengths:
  Minimum read            0 bases
  Minimum overlap         500 bases

Overlap Error Rates:
  Graph                   0.105 (10.500%)
  Max                     0.105 (10.500%)

Deviations:
  Graph                   6.000
  Bubble                  6.000
  Repeat                  3.000

Edge Confusion:
  Absolute                2100
  Percent                 200.0000

Unitig Construction:
  Minimum intersection    500 bases
  Maximum placements      2 positions

Debugging Enabled:
  (none)
==> LOADING AND FILTERING OVERLAPS.
ReadInfo()-- Using 2138563 reads, no minimum read length used.
OverlapCache()-- limited to 193536MB memory (user supplied).
OverlapCache()-- 16MB for read data.
OverlapCache()-- 81MB for best edges.
OverlapCache()-- 212MB for tigs.
OverlapCache()-- 57MB for tigs - read layouts.
OverlapCache()-- 81MB for tigs - error profiles.
OverlapCache()-- 48384MB for tigs - error profile overlaps.
OverlapCache()-- 0MB for other processes.
OverlapCache()-- ---------
OverlapCache()-- 48873MB for data structures (sum of above).
OverlapCache()-- ---------
OverlapCache()-- 40MB for overlap store structure.
OverlapCache()-- 144621MB for overlap data.
OverlapCache()-- ---------
OverlapCache()-- 193536MB allowed.
OverlapCache()--
OverlapCache()-- Retain at least 22 overlaps/read, based on 11.28x coverage.
OverlapCache()-- Initial guess at 4431 overlaps/read.
OverlapCache()--
OverlapCache()-- Adjusting for sparse overlaps.
OverlapCache()--
OverlapCache()--                 reads loading olaps               olaps      memory
OverlapCache()--   olaps/read        all       some         loaded             free
OverlapCache()--   ----------   -------    -------    ----------- -------   --------
OverlapCache()--         4431   2129462       9101     140283711  89.10%   142481 MB
OverlapCache()--      1030433   2138563          0     157439780 100.00%   142219 MB
OverlapCache()--
OverlapCache()-- Loading overlaps.
OverlapCache()--
OverlapCache()--          read from store            saved in cache
OverlapCache()--   ------------ ---------     ------------ ---------
OverlapCache()--       29347933 (018.64%)         28982507 (018.41%)
OverlapCache()--       58686018 (037.28%)         57964060 (036.82%)
OverlapCache()--       86967419 (055.24%)         85908451 (054.57%)
OverlapCache()--      114979131 (073.03%)        113584331 (072.14%)
OverlapCache()--      143507013 (091.15%)        141800857 (090.07%)
OverlapCache()--   ------------ ---------     ------------ ---------
OverlapCache()--      157439780 (100.00%)        155583954 (098.82%)
OverlapCache()--
OverlapCache()-- Ignored 1191848 duplicate overlaps.
OverlapCache()--
OverlapCache()-- Symmetrizing overlaps.
OverlapCache()-- Finding missing twins.
OverlapCache()-- Found 119302 missing twins in 155583954 overlaps, 1620 are strong.
OverlapCache()-- Dropping weak non-twin overlaps; allocated 0 MB scratch space.
OverlapCache()-- Dropped 3422 overlaps; scratch space released.
OverlapCache()-- Adding 115880 missing twin overlaps.
OverlapCache()-- Finished.
BestOverlapGraph()-- allocating best edges (65MB)
BestOverlapGraph()-- finding initial best edges.
BestOverlapGraph()-- filtering suspicious reads.
BestOverlapGraph()-- marked 1626591 reads as suspicious.
BestOverlapGraph()-- filtering high error edges.
BestOverlapGraph()-- filtering reads with lopsided best edges.
BestOverlapGraph()-- filtering spur reads.
BestOverlapGraph()-- detected 276777 spur reads and 1704959 singleton reads.
BestOverlapGraph()-- detected 68625 zombie reads.
BestOverlapGraph()-- removing best edges for contained reads.
==> BUILDING GREEDY TIGS.
breakSingletonTigs()-- Removed 298601 singleton tigs; reads are now unplaced.
optimizePositions()-- Optimizing read positions for 2138564 reads in 370120 tigs, with 16 threads.
optimizePositions()-- Allocating scratch space for 2138564 reads (133660 KB).
optimizePositions()-- Initializing positions with 16 threads.
optimizePositions()-- Recomputing positions, iteration 1, with 16 threads.
optimizePositions()-- Reset zero.
optimizePositions()-- Checking convergence.
optimizePositions()-- converged: 2135017 reads
optimizePositions()-- changed: 3547 reads
optimizePositions()-- Recomputing positions, iteration 2, with 16 threads.
optimizePositions()-- Reset zero.
optimizePositions()-- Checking convergence.
optimizePositions()-- converged: 2135354 reads
optimizePositions()-- changed: 3210 reads
optimizePositions()-- Recomputing positions, iteration 3, with 16 threads.
optimizePositions()-- Reset zero.
optimizePositions()-- Checking convergence.
optimizePositions()-- converged: 2135601 reads
optimizePositions()-- changed: 2963 reads
optimizePositions()-- Recomputing positions, iteration 4, with 16 threads.
optimizePositions()-- Reset zero.
optimizePositions()-- Checking convergence.
optimizePositions()-- converged: 2135695 reads
optimizePositions()-- changed: 2869 reads
optimizePositions()-- Recomputing positions, iteration 5, with 16 threads.
optimizePositions()-- Reset zero.
optimizePositions()-- Checking convergence.
optimizePositions()-- converged: 2135731 reads
optimizePositions()-- changed: 2833 reads
optimizePositions()-- Expanding short reads with 16 threads.
optimizePositions()-- Updating positions.
optimizePositions()-- Finished.
==> PLACE CONTAINED READS.
computeErrorProfiles()-- Computing error profiles for 370120 tigs, with 16 threads.
computeErrorProfiles()-- Finished.
placeContains()-- placing 119195 contained and 1943926 unplaced reads, with 16 threads.
placeContains()-- Placed 78262 contained reads and 98 unplaced reads.
placeContains()-- Failed to place 40933 contained reads (too high error suspected) and 1943828 unplaced reads (lack of overlaps suspected).
optimizePositions()-- Optimizing read positions for 2138564 reads in 370120 tigs, with 16 threads.
optimizePositions()-- Allocating scratch space for 2138564 reads (133660 KB).
optimizePositions()-- Initializing positions with 16 threads.
bogart: bogart/AS_BAT_OptimizePositions.C:142: void Unitig::optimize_initPlace(uint32, optPos*, optPos*, bool, std::set<unsigned int>&, bool): Assertion `cnt > 0' failed.
Failed with 'Aborted'; backtrace (libbacktrace):
AS_UTL/AS_UTL_stackTrace.C::97 in _Z17AS_UTL_catchCrashiP9siginfo_tPv()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
bogart/AS_BAT_OptimizePositions.C::142 in _ZN6Unitig18optimize_initPlaceEjP6optPosS1_bRSt3setIjSt4lessIjESaIjEEb()
bogart/AS_BAT_OptimizePositions.C::393 in _ZN9TigVector17optimizePositionsEPKcS1_._omp_fn.0()
../../../libgomp/team.c::116 in gomp_thread_start()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
and the log file:
-- Canu snapshot v1.7 +23 changes (r8715 967fcea3c70699eaccc92ff5bfe36d9d10e65a55)
--
-- CITATIONS
--
-- Koren S, Walenz BP, Berlin K, Miller JR, Phillippy AM.
-- Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation.
-- Genome Res. 2017 May;27(5):722-736.
-- http://doi.org/10.1101/gr.215087.116
--
-- Read and contig alignments during correction, consensus and GFA building use:
-- Šošić M, Šikić M.
-- Edlib: a C/C++ library for fast, exact sequence alignment using edit distance.
-- Bioinformatics. 2017 May 1;33(9):1394-1395.
-- http://doi.org/10.1093/bioinformatics/btw753
--
-- Overlaps are generated using:
-- Berlin K, et al.
-- Assembling large genomes with single-molecule sequencing and locality-sensitive hashing.
-- Nat Biotechnol. 2015 Jun;33(6):623-30.
-- http://doi.org/10.1038/nbt.3238
--
-- Myers EW, et al.
-- A Whole-Genome Assembly of Drosophila.
-- Science. 2000 Mar 24;287(5461):2196-204.
-- http://doi.org/10.1126/science.287.5461.2196
--
-- Li H.
-- Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences.
-- Bioinformatics. 2016 Jul 15;32(14):2103-10.
-- http://doi.org/10.1093/bioinformatics/btw152
--
-- Corrected read consensus sequences are generated using an algorithm derived from FALCON-sense:
-- Chin CS, et al.
-- Phased diploid genome assembly with single-molecule real-time sequencing.
-- Nat Methods. 2016 Dec;13(12):1050-1054.
-- http://doi.org/10.1038/nmeth.4035
--
-- Contig consensus sequences are generated using an algorithm derived from pbdagcon:
-- Chin CS, et al.
-- Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.
-- Nat Methods. 2013 Jun;10(6):563-9
-- http://doi.org/10.1038/nmeth.2474
--
-- CONFIGURE CANU
--
-- Detected Java(TM) Runtime Environment '1.8.0_92' (from '/net/gmi.oeaw.ac.at/software/mendel/intel-x86_64-sandybridge-avx/software/Java/1.8.0_92/bin/java').
-- Detected gnuplot version '4.6 patchlevel 0' (from 'gnuplot') and image format 'svg'.
-- Detected 48 CPUs and 220 gigabytes of memory.
-- No grid engine detected, grid disabled.
--
--                            (tag)Concurrency
--                     (tag)Threads          |
--            (tag)Memory         |          |
--        (tag)         |         |          |      total usage     algorithm
--        -------  ------  --------   --------  -----------------  -----------------------------
-- Local: meryl     220 GB   32 CPUs x   1 job    220 GB  32 CPUs  (k-mer counting)
-- Local: cormhap    32 GB   16 CPUs x   3 jobs    96 GB  48 CPUs  (overlap detection with mhap)
-- Local: obtovl     16 GB   16 CPUs x   3 jobs    48 GB  48 CPUs  (overlap detection)
-- Local: utgovl     16 GB   16 CPUs x   3 jobs    48 GB  48 CPUs  (overlap detection)
-- Local: ovb         4 GB    1 CPU  x  48 jobs   192 GB  48 CPUs  (overlap store bucketizer)
-- Local: ovs        32 GB    1 CPU  x   6 jobs   192 GB   6 CPUs  (overlap store sorting)
-- Local: red         8 GB    4 CPUs x  12 jobs    96 GB  48 CPUs  (read error detection)
-- Local: oea         4 GB    1 CPU  x  48 jobs   192 GB  48 CPUs  (overlap error adjustment)
-- Local: bat       220 GB   16 CPUs x   1 job    220 GB  16 CPUs  (contig construction)
-- Local: gfa        16 GB   16 CPUs x   1 job     16 GB  16 CPUs  (GFA alignment and processing)
--
-- In 'HEcanuCorrLo.gkpStore', found PacBio reads:
-- Raw: 0
-- Corrected: 2138563
-- Trimmed: 2138563
--
-- Generating assembly 'HEcanuCorrLo' in '/lustre/scratch/users/ovidiu.paun/PacBio'
--
-- Parameters:
--
-- genomeSize 1300000000
--
-- Overlap Generation Limits:
-- corOvlErrorRate 0.2400 ( 24.00%)
-- obtOvlErrorRate 0.1050 ( 10.50%)
-- utgOvlErrorRate 0.1050 ( 10.50%)
--
-- Overlap Processing Limits:
-- corErrorRate 0.3000 ( 30.00%)
-- obtErrorRate 0.1050 ( 10.50%)
-- utgErrorRate 0.1050 ( 10.50%)
-- cnsErrorRate 0.1050 ( 10.50%)
--
--
-- BEGIN ASSEMBLY
--
--
-- Running jobs. First attempt out of 2.
----------------------------------------
-- Starting 'bat' concurrent execution on Thu Mar 29 15:24:31 2018 with 12708.892 GB free disk space (1 processes; 1 concurrently)
cd unitigging/4-unitigger
./unitigger.sh 1 > ./unitigger.000001.out 2>&1
-- Finished on Thu Mar 29 15:25:27 2018 (56 seconds) with 12703.78 GB free disk space
----------------------------------------
--
-- Bogart failed, retry
--
--
-- Running jobs. Second attempt out of 2.
----------------------------------------
-- Starting 'bat' concurrent execution on Thu Mar 29 15:25:27 2018 with 12703.78 GB free disk space (1 processes; 1 concurrently)
cd unitigging/4-unitigger
./unitigger.sh 1 > ./unitigger.000001.out 2>&1
-- Finished on Thu Mar 29 15:26:18 2018 (51 seconds) with 12696.698 GB free disk space
----------------------------------------
--
-- Bogart failed, tried 2 times, giving up.
--
ABORT:
ABORT: Canu snapshot v1.7 +23 changes (r8715 967fcea3c70699eaccc92ff5bfe36d9d10e65a55)
ABORT: Don't panic, but a mostly harmless error occurred and Canu stopped.
ABORT: Try restarting. If that doesn't work, ask for help.
ABORT:
Thank you again for your help!
That one is a bit harder, and I'm distressed it's still failing. I thought I fixed it (see #718 and #546 for other crashes).
Any chance you can upload unitigging/*.gkpStore and unitigging/*.ovlStore (unitigging/4-unitigger would be helpful, but not strictly necessary) so I can debug?
I also just noticed you've only got 11x of reads - is that 11x of raw uncorrected reads, or 11x of corrected reads? How were these corrected? Some hints are under 'low coverage' in http://canu.readthedocs.io/en/latest/faq.html.
Looking at the overlap report (search for "Overlap store 'unitigging/HEcanuCorrLo.ovlStore' contains") there isn't much of anything to assemble here. Worse, it's also reporting:
-- Overlap store 'unitigging/HEcanuCorrLo.ovlStore' successfully constructed.
-- Found 157439780 overlaps for 550391 reads; 1588172 reads have no overlaps.
so most of your reads aren't getting used at all. I think all you've got here are the repeats. :-(
I'd still be interested in debugging the crash, if you're able to upload the data.
Thank you again for responding so fast. While I am uploading the files, I wanted to make sure: do you mean the .ovlStore and .gkpStore folders? I am currently uploading them to ftp://ftp.cbcb.umd.edu/incoming/sergek, but they are quite large files even as tar.gz archives. Or did you mean the gkpStore.err instead?
To answer your other questions: yes, I have only 11x coverage of the genome with PacBio, but I also have 120x Illumina reads. The assembly I am trying here uses lordec to correct the PacBio reads with the Illumina data, then assembles the corrected long reads with canu. Separately, I am also trying to assemble the lordec-corrected PacBio reads by declaring them as uncorrected. I know the assembly will not be great, but it will be used to apply for funds to get more data.
If I'm understanding correctly, you have 11x of lordec corrected reads, and are running two assemblies with those reads, one using -pacbio-corrected (which crashed) and one pretending they're raw reads using -pacbio-raw. Great!
You can also try an assembly without trimming the lordec reads: "-assemble -pacbio-corrected reads.fasta". It could result in a better assembly, as Canu will trim (or, more likely, completely ignore) reads that have overlaps only on their ends.
Yes, the gkpStore (read info) and ovlStore (overlaps) are all I need to run the unitigger (bogart) over here. With that, I can poke around in the gory details and find the problem. I probably won't be able to do anything until Wednesday.
Hi. The crashed assembly was actually started the way you suggest now, as "-assemble -pacbio-corrected lordec_corrected_reads.fasta". I uploaded the files. Thanks again!
Dear Brian,
Can you please let me know what I should do next regarding my segmentation fault problem? Shall I try to install a newer version of canu and rerun the entire analysis? Or are you still going to debug? Were you able to find the data I uploaded?
Thanks a lot.
I'm finishing up the fix right now.
The fix will be a pair of files in src/bogart/. Unfortunately, you can't easily upgrade to the latest version of Canu, since on-disk data changed. Once I give it a couple more tests I'll post the files here.
It seems possible to upgrade your on-disk data to the current version.
Using your current binaries:
ovStoreDump -G HEcanuCorrLo.gkpStore -O HEcanuCorrLo.ovlStore -d -binary dump1
overlapConvert -G HEcanuCorrLo.gkpStore -raw dump1.ovb > dump1.raw
And then with the latest binaries:
overlapImport -G HEcanuCorrLo.gkpStore -raw -O new.ovlStore dump1.raw
This rewrites the overlaps from the old format into the new format. The output is a new ovlStore (creatively called 'new.ovlStore'). The gkpStore data format didn't change.
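Putting those steps together, a sketch of the whole conversion as a single script; $OLD_BIN and $NEW_BIN are placeholders for the bin/ directories of your current and latest Canu installations, and it assumes you run it from the directory holding both stores:

  # Dump and convert with the binaries that built the store...
  "$OLD_BIN"/ovStoreDump    -G HEcanuCorrLo.gkpStore -O HEcanuCorrLo.ovlStore -d -binary dump1
  "$OLD_BIN"/overlapConvert -G HEcanuCorrLo.gkpStore -raw dump1.ovb > dump1.raw
  # ...then rebuild the store in the new format with the latest binaries.
  "$NEW_BIN"/overlapImport  -G HEcanuCorrLo.gkpStore -raw -O new.ovlStore dump1.raw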
I was debugging a similar crash to yours, and thought I had the problem fixed. But your example still fails. There's no point in moving to the tip code yet; 'bogart' is still the same.
It might be possible to get around the problem by decreasing the allowed overlap error rate: decrease both the -eg and -eM values (by 0.01?) in unitigger.sh and run that script (./unitigger.sh 1).
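To make that concrete, a hypothetical before/after for the bogart invocation inside unitigging/4-unitigger/unitigger.sh; the surrounding options are elided, the current values are inferred from the 'Graph 0.105 / Max 0.105' error rates reported above, and the exact amount to decrease may need tuning:

  #  before:  bogart ... -eg 0.105 -eM 0.105 ...
  #  after:   bogart ... -eg 0.095 -eM 0.095 ...
  ./unitigger.sh 1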
Your data seems very noisy; the *.001.filterOverlaps.thr000.num000.log reports
ERROR RATES (658571 samples)
-----------
mean    0.08657178   stddev  0.01503193  ->  0.17676335 fraction error = 17.676335% error
median  0.08950000   mad     0.01020000  ->  0.18023512 fraction error = 18.023512% error
where the mean is usually around 0.01 or 0.02 and the final error is around 3% to 8%.
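(For reference, the reported cutoff appears to be mean + 6 x stddev, i.e. the 'Graph 6.000' deviation parameter shown earlier: 0.08657178 + 6 x 0.01503193 ≈ 0.17676335. The median row works the same way, with the mad scaled by about 1.4826 to approximate a standard deviation: 0.08950000 + 6 x 1.4826 x 0.01020000 ≈ 0.18023512.)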
Well, I got it to run, but it didn't assemble. Only 968 contigs, with a total size of 15 Mbp, were output, along with about 15 Gbp of 'unassembled' pieces, most of them singleton reads.
No patch to the code yet; I'm working out other issues still.
Here's a histogram of the error rates in overlaps. It looks like it's maybe truncated at the high end. It's also much higher than I'm comfortable assembling - any genome duplication(s) cannot be distinguished, repeats will get smashed together, etc, etc.
Dear Brian,
Thank you very much for proceeding with this. I guess the idea of correcting PacBio reads with Illumina via lordec and then assembling the long reads directly was a bad one. I have two other assemblies running with canu: one declaring the lordec-corrected reads as raw PacBio reads, and one starting directly from the raw reads (not taking the Illumina reads into account at all). I hope those assemblies turn out better. I am not sure whether the segmentation fault was introduced in any way by lordec. Anyway, thank you very much again. Best wishes, Ovidiu
Thanks for sharing the data!
The algorithm that fails seems to be getting confused by repetitiveness of this data. This could be caused by lordec homogenizing repeats, or the high divergence in overlaps, or it could just be a property of your genome. I thought I had a fix, but am now back to rethinking the whole algorithm.
That was ugly, but I think I (finally) got it fixed.
Your data has been removed. It was on a disk that isn't backed up.
Hi, I am running a canu (1.7) assembly of a plant genome (1.3 Gb) based on 11x PacBio reads. It all goes well until unitigging, when bogart fails with a segmentation fault. I tried restarting, and then started the entire process fresh, but I get the same error. I would greatly appreciate some help.
I am running canu in a Linux cluster. The canu command used: '/canu/Linux-amd64/bin/canu -assemble -d /PacBio/ useGrid=false correctedErrorRate=0.105 -pacbio-corrected /PacBio/all_corr_pacbio.fasta genomeSize=1.3g -p HEcanuCorrLo'
The contents of unitigging/4-unitigger/unitigger.err and of the canu log file are as shown above.