marbl / canu

A single molecule sequence assembler for genomes large and small.
http://canu.readthedocs.io/

Assertion `tigend > tigbgn' failed in consensus calling step #1878

Closed milanvanhoek closed 3 years ago

milanvanhoek commented 3 years ago

Hi!

I am performing a large assembly, but I get the following assert error in the consensus step:

alignEdLib()-- WARNING: updated tigbgn 0 > tigend -418 - tiglen 74691 utgpos -1920--418 padding 76
alignEdLib()-- WARNING: tigbgn 0 > tigend -413 - tiglen 74691 utgpos -2079--493 padding 80
alignEdLib()-- WARNING: updated tigbgn 0 > tigend -493 - tiglen 74691 utgpos -2079--493 padding 80
utgcns: utgcns/unitigConsensus.C:677: bool alignEdLib(dagAlignment&, tgPosition&, char*, uint32, char*, uint32, double, bool): Assertion `tigend > tigbgn' failed.
utgcns: utgcns/unitigConsensus.C:677: bool alignEdLib(dagAlignment&, tgPosition&, char*, uint32, char*, uint32, double, bool): Assertion `tigend > tigbgn' failed.

Failed with 'Aborted
Failed with ''; backtrace (libbacktrace):
Aborted'; backtrace (libbacktrace):

Failed with 'Segmentation fault'; backtrace (libbacktrace):
utility/src/utility/system-stackTrace.C::83 in _Z17AS_UTL_catchCrashiP9siginfo_tPv()
(null)::0 in (null)()
utility/src/utility/system-stackTrace.C::83 in _Z17AS_UTL_catchCrashiP9siginfo_tPv()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
(null)::0 in (null)()
/opt/sge/default/spool/execd/cl4node018/job_scripts/4564204: line 93:  4698 Segmentation fault      $bin/utgcns -R ../test.ctgStore/partition.$jobid -T ../test.ctgStore 1 -P $jobid -O ./ctgcns/$jobid.cns.WORKING -maxcoverage 40 -e 0.2 -pbdagcon -edlib -threads 8

Found perl:
   /usr/bin/perl
   This is perl 5, version 26, subversion 1 (v5.26.1) built for x86_64-linux-gnu-thread-multi

Found java:
   /home/user/tools/bin/java
   java version "1.8.0_181"

Found canu:
   canu
   canu snapshot v2.2-development +82 changes (r10191 af771ef2c17d7d8e639d9c9d4897a7f609b5eb55)

As you can see in the output above, I am running the latest development snapshot. I switched to it because I had noticed a recent fix in the consensus calling step, but unfortunately that didn't solve my issue.

I ran canu like this:

canu -p test -d 104x genomeSize=3500M -nanopore-raw raw/*.fastq.gz -useGrid=true gridEngineResourceOption="-pe smp THREADS -l mem_total=MEMORY"

on a Linux cluster. Any ideas about what I could do to get around this?

skoren commented 3 years ago

Interesting, it seems the read in question landed completely outside the consensus, which is a bit weird. Try changing tigend = utgpos.max(); to tigend = (utgpos.max() < 0 ? tigend : utgpos.max()); and see if it runs. There should be two places to change, at lines 673 and 717.

milanvanhoek commented 3 years ago

Thanks for your quick response, I'll try it out and let you know

milanvanhoek commented 3 years ago

I reran canu with your suggested changes, but now I get this:

alignEdLib()-- WARNING: tigbgn 0 > tigend -413 - tiglen 74691 utgpos -2079--493 padding 80
alignEdLib()-- WARNING: updated tigbgn 0 > tigend -413 - tiglen 74691 utgpos -2079--493 padding 80
alignEdLib()-- WARNING: tigbgn 0 > tigend -342 - tiglen 74691 utgpos -1920--418 padding 76
alignEdLib()-- WARNING: updated tigbgn 0 > tigend -342 - tiglen 74691 utgpos -1920--418 padding 76
utgcns: utgcns/unitigConsensus.C:677: bool alignEdLib(dagAlignment&, tgPosition&, char*, uint32, char*, uint32, double, bool): Assertion `tigend > tigbgn' failed.
utgcns: utgcns/unitigConsensus.C:677: bool alignEdLib(dagAlignment&, tgPosition&, char*, uint32, char*, uint32, double, bool): Assertion `tigend > tigbgn' failed.

Failed with 'Aborted'; backtrace (libbacktrace):

Now that I think about it, this makes sense: tigend was negative to start with and is no longer updated (because utgpos.max() is also negative), so it remains negative. Any other ideas? Thanks for your help!
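
To make that reasoning concrete, here is a minimal sketch that replays the first read from the log above through the first suggested patch. The function names are hypothetical and this is not the actual code in unitigConsensus.C; only the ternary expression and the logged values (tigbgn 0, tigend -413, utgpos max -493) come from this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the first suggested patch: keep the old tigend
// whenever the read's maximum layout position (utgpos.max()) is negative.
constexpr int32_t firstPatchTigEnd(int32_t tigend, int32_t utgposMax) {
  return (utgposMax < 0) ? tigend : utgposMax;
}

// Replays the logged read: tigbgn 0, tigend -413, utgpos -2079 to -493.
inline void replayFirstPatch() {
  const int32_t tigbgn = 0;
  const int32_t tigend = firstPatchTigEnd(-413, -493);

  assert(tigend == -413);      // unchanged, because utgpos.max() is negative
  assert(!(tigend > tigbgn));  // so a `tigend > tigbgn' check still cannot hold
}
```

Under these assumptions the patched expression returns the same negative tigend it was given, which matches the "updated tigbgn 0 > tigend -413" warning in the log.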

brianwalenz commented 3 years ago

Any chance you can export the data and send it to us?

utgcns -S ../../test.seqStore -T ../test.ctgStore 1 -tig <tigID> -export badTig.tig

Also check that the input layout is sane:

tgStoreDump -S ../../test.seqStore -T ../test.ctgStore 1 -tig <tigID> -layout

milanvanhoek commented 3 years ago

I am afraid that won't be possible because of confidentiality issues. I could send you the output of the second command, though I think it is too long to post here. Would it be possible to skip this contig in consensus calling? It is pretty small, and I could always use other consensus callers to fix it.

skoren commented 3 years ago

It doesn't matter if tigend is negative; it gets updated anyway. It looks like my patch had a mistake. It should be tigend = (utgpos.max() < 0 ? tiglen : utgpos.max()). This essentially means the read can go anywhere in the contig, because its given coordinates were inaccurate.
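
As a rough illustration of the corrected expression, the sketch below applies it to the values reported in the log (tiglen 74691, utgpos max -493). The function names are hypothetical and this is not the actual utgcns source; it only shows that falling back to tiglen widens the range to the whole contig, so the tigend > tigbgn condition holds.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the corrected patch: when the read's maximum
// layout position is negative, fall back to the full tig length.
constexpr int32_t correctedTigEnd(int32_t tiglen, int32_t utgposMax) {
  return (utgposMax < 0) ? tiglen : utgposMax;
}

// Replays the logged read: tigbgn 0, tiglen 74691, utgpos -2079 to -493.
inline void replayCorrectedPatch() {
  const int32_t tigbgn = 0;
  const int32_t tigend = correctedTigEnd(74691, -493);

  assert(tigend == 74691);  // the read may now align anywhere in the contig
  assert(tigend > tigbgn);  // the condition from the failing assertion holds
}
```

A read whose maximum position is non-negative is unaffected: correctedTigEnd(74691, 1200) simply returns 1200.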

milanvanhoek commented 3 years ago

That last fix seems to solve my issue. Now I only get these warnings:

alignEdLib()-- WARNING: tigbgn 0 > tigend -413 - tiglen 74691 utgpos -2079--493 padding 80
alignEdLib()-- WARNING: updated tigbgn 0 > tigend 74691 - tiglen 74691 utgpos -2079--493 padding 80
alignEdLib()-- WARNING: tigbgn 0 > tigend -19 - tiglen 74691 utgpos -1312--81 padding 62
alignEdLib()-- WARNING: updated tigbgn 0 > tigend 74691 - tiglen 74691 utgpos -1312--81 padding 62
alignEdLib()-- WARNING: tigbgn 0 > tigend -342 - tiglen 74691 utgpos -1920--418 padding 76
alignEdLib()-- WARNING: updated tigbgn 0 > tigend 74691 - tiglen 74691 utgpos -1920--418 padding 76
alignEdLib()-- WARNING: tigbgn 0 > tigend -19 - tiglen 74691 utgpos -1312--81 padding 62
alignEdLib()-- WARNING: updated tigbgn 0 > tigend 74691 - tiglen 74691 utgpos -1312--81 padding 62
alignEdLib()-- WARNING: tigbgn 0 > tigend -413 - tiglen 74691 utgpos -2079--493 padding 80
alignEdLib()-- WARNING: updated tigbgn 0 > tigend 74691 - tiglen 74691 utgpos -2079--493 padding 80
alignEdLib()-- WARNING: tigbgn 0 > tigend -342 - tiglen 74691 utgpos -1920--418 padding 76
alignEdLib()-- WARNING: updated tigbgn 0 > tigend 74691 - tiglen 74691 utgpos -1920--418 padding 76
alignEdLib()-- WARNING: tigbgn 0 > tigend -413 - tiglen 74691 utgpos -2079--493 padding 80
alignEdLib()-- WARNING: updated tigbgn 0 > tigend 74691 - tiglen 74691 utgpos -2079--493 padding 80

but it runs to completion, thanks!