marbl / canu

A single molecule sequence assembler for genomes large and small.
http://canu.readthedocs.io/

Bogart BestOverlapGraph::removeLopsidedEdges #2292

Closed by alpapan 8 months ago

alpapan commented 9 months ago

Greetings

Any suggestions on how to troubleshoot this?

(Using minimap2):

canu -pacbio-hifi canu3-haplotypeGRACE.trimmedReads.fasta.gz -p GRACE -d canu3_GRACE minReadLength=700 minOverlapLength=400 overlapper=minimap useGrid=false genomeSize=380m minInputCoverage=8.5 stopOnLowCoverage=8.5
BestOverlapGraph()-- Filtering reads with lopsided best edges (more than 25.00% different).                                                     

WARNING: read 1625311 5' has overlap to spur read 935042 3'!            

bogart: bogart/AS_BAT_BestOverlapGraph.C:554: void BestOverlapGraph::removeLopsidedEdges(const char*, const char*, double): Assertion `(this5->isUnset() == true) || (back5->isValid() == true)' failed.

Failed with 'Aborted'; backtrace (libbacktrace):
utility/src/utility/system-stackTrace.C::82 in _Z17AS_UTL_catchCrashiP9siginfo_tPv()                                                            
(null)::0 in (null)()                                                   
(null)::0 in (null)()                                                   
(null)::0 in (null)()                                                   
(null)::0 in (null)()                                                   
(null)::0 in (null)()
bogart/AS_BAT_BestOverlapGraph.C::554 in _ZN16BestOverlapGraph19removeLopsidedEdgesEPKcS1_d._omp_fn.0()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
Aborted (core dumped)

skoren commented 9 months ago

The parameters you've set would only compute approximate overlaps, which is not supported. You can either use the -fast option or set both overlapper=minimap and utgReAlign=true. In general, we don't recommend switching overlappers from the default, as non-default overlappers have undergone minimal testing.

I'm not sure what kind of input data you have, but I also wouldn't use minimap for HiFi data; it hasn't been optimized or tested for that use case in canu. You also usually don't need to trim HiFi data before assembling it.
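
For readers who still want to try the second option from the reply above (overlapper=minimap together with utgReAlign=true), here is a minimal Python sketch that checks a canu command line for that pairing and appends utgReAlign=true when it is missing. The helper name `add_realign_for_minimap` is hypothetical and not part of canu; it only rewrites the command string and does not run canu. It encodes nothing beyond the rule stated in the reply, and the reply's advice against using minimap for HiFi data still applies.

```python
import shlex


def add_realign_for_minimap(command: str) -> str:
    """Hypothetical helper (not part of canu): append utgReAlign=true
    when overlapper=minimap is set without it."""
    args = shlex.split(command)
    # Collect key=value tokens; if a key repeats, this sketch keeps the last one.
    params = dict(a.split("=", 1) for a in args if "=" in a and not a.startswith("-"))
    uses_minimap = params.get("overlapper") == "minimap"
    realigns = params.get("utgReAlign") == "true"
    if uses_minimap and not realigns:
        args.append("utgReAlign=true")
    return shlex.join(args)


# The command from the original report, with whitespace normalized.
original = (
    "canu -pacbio-hifi canu3-haplotypeGRACE.trimmedReads.fasta.gz -p GRACE "
    "-d canu3_GRACE minReadLength=700 minOverlapLength=400 overlapper=minimap "
    "useGrid=false genomeSize=380m minInputCoverage=8.5 stopOnLowCoverage=8.5"
)

if __name__ == "__main__":
    print(add_realign_for_minimap(original))
```

Running the script prints the reported command with utgReAlign=true appended; a command that already sets it, or that does not set overlapper=minimap, comes back unchanged.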