Closed jeramiahsmith closed 3 days ago
Which version of cactus are you using?
On Tue, Nov 12, 2024 at 4:40 PM jeramiahsmith @.***> wrote:
I am using cactus-hal2maf and the conversion seems to be failing at the stage where it is running taffy norm commands. I included a snippet of the stderr below, but please let me know if you need more.
[2024-11-12T16:00:53-0500] [MainThread] [I] [toil-rt] Successfully ran hal2maf sal10.hal stdout --refGenome AM --refSequence Mex_chr10 --start 1712000000 --length 500000 --maxBlockLen 10000 --noAncestors in time: 0.20
[2024-11-12T16:00:53-0500] [MainThread] [I] [toil-rt] Successfully ran hal2maf sal10.hal stdout --refGenome AM --refSequence Mex_chr10 --start 1712500000 --length 375843 --maxBlockLen 10000 --noAncestors in time: 0.10
[2024-11-12T16:00:53-0500] [MainThread] [I] [toil-rt] First of 3426 commands in parallel batch: set -eo pipefail && cat salj2_chunk_0.maf | (time -p mafRowOrderer -m - --order-file genome.list) 2> 0.sort.time | (time -p taffy view ) 2> 0.m2t.time | (time -p taffy norm -a sal10.hal -k ) 2> 0.tn.time > salj2_chunk_0.norm.maf && mv salj2_chunk_0.norm.maf salj2_chunk_0.maf
[2024-11-12T16:00:53-0500] [MainThread] [I] [toil-rt] 2024-11-12 16:00:53.843772: Running the command: "bash -c set -eo pipefail && cat /scratch/jjsmit3/compar/cactus2/tmpmsk2/toilwf-2d27f925d82e5a9c8caca3fb95810aed/3b7c/job/tmp7bt5mnts/taf_cmds.txt | parallel -j 12 '{}'"
[2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] Parallel taffy command failed, dumping all stderr
[2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] taffy: taffy/impl/maf.c:58: maf_read_block: Assertion `alignment->column_number == strlen(row->bases)' failed.
[2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] /usr/bin/bash: line 1: 4025844 Aborted (core dumped) taffy view
[2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] real 0.59
[2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] user 0.05
[2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] sys 0.00
[2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] /usr/bin/bash: line 1: 4025843 Segmentation fault (core dumped) mafRowOrderer -m - --order-file genome.list
[2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] real 0.44
[2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] user 0.00
[2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] sys 0.00
[2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] taffy: taffy/impl/taf.c:128: get_bases: Assertion `strlen(column) == column_length' failed.
[2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] /usr/bin/bash: line 1: 4025845 Aborted (core dumped) taffy norm -a sal10.hal -k
[2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] real 0.73
[2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] user 0.17
[2024-11-12T16:03:31-0500] [MainThread] [E] [toil.statsAndLogging] sys 0.01
— Reply to this email directly, view it on GitHub https://github.com/ComparativeGenomicsToolkit/cactus/issues/1522, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAG373V56CJNBDKJKJG3ULD2AJYVDAVCNFSM6AAAAABRVATVJKVHI2DSMVQWIX3LMV43ASLTON2WKOZSGY2TGNBQGMZTIMQ . You are receiving this because you are subscribed to this thread. Message ID: @.***>
I'm using cactus-v2.9.0
OK, I asked because I think I've seen this issue before in an older Cactus version, but the cause that time has since been fixed.
Are you able to share the .hal file with me so I can reproduce it?
Let me know if you are able to get it from this gdrive link. sal10.hal https://drive.google.com/open?id=1eb3T8QlNWAKGiCPDUhaQGxdkOyu4Vx0H
-- Jeramiah Smith Professor Department of Biology University of Kentucky Lexington, KY 40506
I was able to run conversion to maf as well as taffy normalization and add-gap-bases by running the commands on the unsplit file external to the cactus pipeline.
I've got your file. What's the cactus-hal2maf command that leads to the crash?
I ran it several ways, but this one should do:
cactus-hal2maf --batchCores 1 --chunkSize 100000 --refGenome AM --workDir tmp --coordinationDir tmp jshm sal10.hal salj.maf
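For a sense of how many chunks a given --chunkSize produces, here is a minimal sketch of fixed-size windowing. It assumes, based only on the --start/--length values in the log excerpt above, that chunks tile the reference sequence from position 0 in windows of the chunk size, and that the last window in the log (start 1712500000, length 375843) ends the sequence. The function name `chunk_windows` is hypothetical and is not part of Cactus.

```python
def chunk_windows(seq_length, chunk_size):
    """Yield (start, length) windows that tile [0, seq_length).

    Hypothetical helper for illustration; not the Cactus implementation.
    """
    for start in range(0, seq_length, chunk_size):
        yield start, min(chunk_size, seq_length - start)


# Assumption: the last window reported in the log marks the end of Mex_chr10.
SEQ_LENGTH = 1712500000 + 375843

# 500000-base windows, as seen in the log excerpt.
windows = list(chunk_windows(SEQ_LENGTH, 500000))
print(len(windows), windows[-2], windows[-1])

# 100000-base windows, as in the command above.
print(len(list(chunk_windows(SEQ_LENGTH, 100000))))
```

Under these assumptions, the 500000-base tiling gives 3426 windows, which matches the "First of 3426 commands in parallel batch" line in the log. A smaller chunk size gives proportionally more windows, each covering less of the reference.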
Thanks. I can reproduce. At first glance it seems to be due to empty chunks leading to 0-length files which crash some commands. This would explain why you could run successfully without splitting. Using a larger --chunkSize with cactus-hal2maf would presumably also work.
This seems like the type of thing that ought to have come up before. Or perhaps there was a regression, so it has only started happening recently. In any case, I'll try to get a fix pushed shortly.
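Until a fix lands, the empty-chunk hypothesis can be checked by hand. The following is a rough sketch, not the fix itself: it assumes the per-chunk MAF files (named like salj2_chunk_0.maf in the log) are still available in some directory, and it treats a chunk as empty when it is 0 bytes or contains no alignment block line (a line whose first field is "a", per the MAF format). The names `has_alignment_block` and `find_empty_chunks` are hypothetical.

```python
from pathlib import Path


def has_alignment_block(maf_path):
    """Return True if the MAF file contains at least one 'a' (block) line."""
    with open(maf_path) as handle:
        for line in handle:
            # Block lines look like "a" or "a score=...". Header and comment
            # lines start with "#", sequence rows with "s".
            if line.split(None, 1)[:1] == ["a"]:
                return True
    return False


def find_empty_chunks(chunk_dir, pattern="*.maf"):
    """List chunk file names that are 0-length or have no alignment blocks."""
    return sorted(
        path.name
        for path in Path(chunk_dir).glob(pattern)
        if not has_alignment_block(path)
    )
```

Calling `find_empty_chunks("path/to/chunks")` on the directory holding the chunk files would list the candidates. If that list is non-empty, it supports the idea that the downstream commands are crashing on input with no blocks.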