ComparativeGenomicsToolkit / cactus

Official home of genome aligner based upon notion of Cactus graphs

cactus-hal2maf issue at taffy norm ? #1522

Closed jeramiahsmith closed 3 days ago

jeramiahsmith commented 1 week ago

I am using cactus-hal2maf and the conversion seems to be failing at the stage where it is running taffy norm commands. I included a snippet of the stderr below, but please let me know if you need more.

    [2024-11-12T16:00:53-0500] [MainThread] [I] [toil-rt] Successfully ran hal2maf sal10.hal stdout --refGenome AM --refSequence Mex_chr10 --start 1712000000 --length 500000 --maxBlockLen 10000 --noAncestors in time: 0.20
    [2024-11-12T16:00:53-0500] [MainThread] [I] [toil-rt] Successfully ran hal2maf sal10.hal stdout --refGenome AM --refSequence Mex_chr10 --start 1712500000 --length 375843 --maxBlockLen 10000 --noAncestors in time: 0.10
    [2024-11-12T16:00:53-0500] [MainThread] [I] [toil-rt] First of 3426 commands in parallel batch: set -eo pipefail && cat salj2_chunk_0.maf | (time -p  mafRowOrderer -m - --order-file genome.list) 2> 0.sort.time | (time -p  taffy view ) 2> 0.m2t.time | (time -p  taffy norm -a sal10.hal -k ) 2> 0.tn.time > salj2_chunk_0.norm.maf && mv salj2_chunk_0.norm.maf salj2_chunk_0.maf
    [2024-11-12T16:00:53-0500] [MainThread] [I] [toil-rt] 2024-11-12 16:00:53.843772: Running the command: "bash -c set -eo pipefail && cat /scratch/jjsmit3/compar/cactus2/tmpmsk2/toilwf-2d27f925d82e5a9c8caca3fb95810aed/3b7c/job/tmp7bt5mnts/taf_cmds.txt | parallel -j 12 '{}'"
    [2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] Parallel taffy command failed, dumping all stderr
    [2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] taffy: taffy/impl/maf.c:58: maf_read_block: Assertion `alignment->column_number == strlen(row->bases)' failed.
    [2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] /usr/bin/bash: line 1: 4025844 Aborted                 (core dumped) taffy view
    [2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] real 0.59
    [2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] user 0.05
    [2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] sys 0.00
    [2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] /usr/bin/bash: line 1: 4025843 Segmentation fault      (core dumped) mafRowOrderer -m - --order-file genome.list
    [2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] real 0.44
    [2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] user 0.00
    [2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] sys 0.00
    [2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] taffy: taffy/impl/taf.c:128: get_bases: Assertion `strlen(column) == column_length' failed.
    [2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] /usr/bin/bash: line 1: 4025845 Aborted                 (core dumped) taffy norm -a sal10.hal -k
    [2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] real 0.73
    [2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] user 0.17
    [2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] sys 0.01
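
For readers following the log: the `--start`/`--length` pairs suggest the reference sequence is tiled into fixed-size windows, with a shorter final window. The sketch below is only an illustration of that arithmetic, not code from cactus. It assumes a window size of 500000 (read off `--length`) and a sequence length of 1712500000 + 375843 (the last window's start plus its length); the function name `chunk_windows` is hypothetical.

```python
from typing import Iterator, Tuple


def chunk_windows(seq_length: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (start, length) windows that tile the interval [0, seq_length)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, seq_length, chunk_size):
        # The final window is truncated to whatever remains of the sequence.
        yield start, min(chunk_size, seq_length - start)


# Assumed values, inferred from the log lines above (not taken from cactus itself).
SEQ_LENGTH = 1712500000 + 375843
CHUNK_SIZE = 500000

windows = list(chunk_windows(SEQ_LENGTH, CHUNK_SIZE))
print(len(windows))               # number of windows
print(windows[-2], windows[-1])   # the last two windows
```

Under these assumptions the last two windows line up with the two `hal2maf` calls shown in the log, and the window count equals the 3426 commands in the parallel batch, which would be consistent with a single reference sequence being chunked.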

glennhickey commented 1 week ago

Which version of cactus are you using?

jeramiahsmith commented 1 week ago

I'm using cactus-v2.9.0

glennhickey commented 1 week ago

OK, I asked because I think I've seen this issue before in an older Cactus version, but the cause that time has since been fixed.

Are you able to share the .hal file with me so I can reproduce it?

jeramiahsmith commented 1 week ago

Let me know if you are able to get it from this gdrive link. sal10.hal https://drive.google.com/open?id=1eb3T8QlNWAKGiCPDUhaQGxdkOyu4Vx0H

-- Jeramiah Smith Professor Department of Biology University of Kentucky Lexington, KY 40506

jeramiahsmith commented 1 week ago

I was able to run the conversion to MAF, as well as taffy normalization and add-gap-bases, by running the commands on the unsplit file outside the cactus pipeline.

glennhickey commented 6 days ago

I've got your file. What's the cactus-hal2maf command that leads to the crash?

jeramiahsmith commented 6 days ago

I ran it several ways, but this one should do:

cactus-hal2maf --batchCores 1 --chunkSize 100000 --refGenome AM --workDir tmp --coordinationDir tmp jshm sal10.hal salj.maf

glennhickey commented 6 days ago

Thanks. I can reproduce. At first glance it seems to be due to empty chunks leading to 0-length files which crash some commands. This would explain why you could run successfully without splitting - using a larger --chunkSize with cactus-hal2maf would presumably also work.
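
To make the empty-chunk hypothesis concrete, here is a minimal sketch of how one might separate zero-length chunk files from the rest before handing them to a downstream command. This is not the fix that went into cactus; the helper `split_empty_chunks` and the glob pattern `*_chunk_*.maf` are assumptions, the latter modeled on the `salj2_chunk_0.maf` name in the log.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def split_empty_chunks(
    chunk_dir: Path, pattern: str = "*_chunk_*.maf"
) -> Tuple[List[Path], List[Path]]:
    """Return (non_empty, empty) lists of chunk files found in chunk_dir."""
    non_empty: List[Path] = []
    empty: List[Path] = []
    for path in sorted(chunk_dir.glob(pattern)):
        # A zero-byte file carries no MAF header or blocks at all.
        if path.stat().st_size == 0:
            empty.append(path)
        else:
            non_empty.append(path)
    return non_empty, empty


# Small demonstration with made-up files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    (tmp_dir / "salj2_chunk_0.maf").write_text("##maf version=1\n")
    (tmp_dir / "salj2_chunk_1.maf").write_text("")  # simulated empty chunk
    non_empty, empty = split_empty_chunks(tmp_dir)
    non_empty_names = [p.name for p in non_empty]
    empty_names = [p.name for p in empty]

print("non-empty:", non_empty_names)
print("empty:", empty_names)
```

A check like this only identifies which chunks are empty; whether such chunks should be skipped or passed through with a header is a decision for the pipeline itself.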

This seems like the type of thing that ought to have come up before, or perhaps a regression has only recently started triggering it. In any case, I'll try to get a fix pushed shortly.