ComparativeGenomicsToolkit / cactus

Official home of genome aligner based upon notion of Cactus graphs

long time in the segalign_repeat of the segalign step #1218

Closed ld9866 closed 1 year ago

ld9866 commented 1 year ago

Dear developer: We are comparing three Sus genomes using the RTX 4090, which performs very well and completed the segalign repeat-masking step for the first genome in 10 minutes. For the second genome, however, the log keeps printing "[MainThread] [I] [toil.leader] 1 jobs are running, 1 jobs are issued and waiting to run". It is now 10 hours later and the error is still there. How can this be fixed? I have uploaded our log file here; please help check what the problem is. Best wishes.

Attachment: Sus.log

glennhickey commented 1 year ago

From the end of your log:

[2023-11-02T01:45:05+0000] [MainThread] [I] [toil-rt] 2023-11-02 01:45:05.961989: Running the command: "segalign_repeat_masker /tmp/02ea00512cef52d0a2f6d5d09fa4b68f/56a8/d853/tmpzz32by18/Sscrofa_0_0.tgt --lastz_interval=10000000 --markend --neighbor_proportion 0.2 --M 10 --step=3 --ambiguous=iupac,100,100 --num_gpu 2"
[2023-11-02T02:33:35+0000] [MainThread] [I] [toil.leader] 1 jobs are running, 1 jobs are issued and waiting to run
[2023-11-02T03:33:36+0000] [MainThread] [I] [toil.leader] 1 jobs are running, 1 jobs are issued and waiting to run
[2023-11-02T04:33:37+0000] [MainThread] [I] [toil.leader] 1 jobs are running, 1 jobs are issued and waiting to run
[2023-11-02T05:33:37+0000] [MainThread] [I] [toil.leader] 1 jobs are running, 1 jobs are issued and waiting to run
[2023-11-02T06:33:38+0000] [MainThread] [I] [toil.leader] 1 jobs are running, 1 jobs are issued and waiting to run

There's a SegAlign job that's been going for 5 (not 10) hours. The runtime of segalign_repeat_masker varies wildly with the input data. I've recently seen it take 12 hours on a lemur genome with 8 GPUs. So I don't see any errors or problems; the only thing you can do is wait for it to finish.
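
The "5 hours" figure can be read off the log timestamps: the segalign_repeat_masker command was launched at 01:45:05 and the last toil.leader status line is stamped 06:33:38. A minimal Python sketch of that arithmetic follows. The helper names `parse_timestamp` and `elapsed_since_command` are hypothetical (they are not part of Cactus or Toil), and the command line in the excerpt is abbreviated with "..." for brevity.

```python
import re
from datetime import datetime

# Toil prefixes each log line with a timestamp such as [2023-11-02T01:45:05+0000].
TIMESTAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[+-]\d{4})\]")


def parse_timestamp(line):
    """Return the leading timestamp of a log line, or None if there is none."""
    match = TIMESTAMP.match(line)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S%z")


def elapsed_since_command(lines, marker="Running the command"):
    """Time between the most recent command launch and the last stamped line."""
    start = None
    last = None
    for line in lines:
        stamp = parse_timestamp(line)
        if stamp is None:
            continue  # skip lines without a timestamp prefix
        if marker in line:
            start = stamp  # remember the latest launch seen so far
        last = stamp
    if start is None or last is None:
        return None
    return last - start


# Two lines taken from the log above (command arguments abbreviated).
log_excerpt = [
    '[2023-11-02T01:45:05+0000] [MainThread] [I] [toil-rt] '
    '2023-11-02 01:45:05.961989: Running the command: "segalign_repeat_masker ..."',
    '[2023-11-02T06:33:38+0000] [MainThread] [I] [toil.leader] '
    '1 jobs are running, 1 jobs are issued and waiting to run',
]

print(elapsed_since_command(log_excerpt))  # 4:48:33
```

Running the same function over the lines of a full log file (for example `open("Sus.log")`) should give the elapsed time of whichever command was launched last, assuming the log uses the same timestamp prefix throughout.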