ComparativeGenomicsToolkit / cactus

Official home of genome aligner based upon notion of Cactus graphs

Why does lastz seem to use only one core? #1392

Closed: yuanhao0626 closed this issue 1 month ago

yuanhao0626 commented 1 month ago

Hi developers,

I'm using Cactus v2.6.13 for multiple genome alignment. However, it runs slowly, and lastz appears to use only one core. My command is:

cactus ./js ./seqFile.txt ./hal --defaultCores 64 --defaultMemory 256G --maxCores 96 --maxMemory 1024G --lastzCores 64 --lastzMemory 256G --workDir /PATH/tmp

The stderr output for lastz is:

[2024-05-20T20:53:23+0800] [MainThread] [I] [toil-rt] 2024-05-20 20:53:23.121408: Running the command: "lastz BBY_40.fa[multiple][nameparse=darkspace] BBY_40_frag[nameparse=darkspace] --step=3 --ambiguous=iupac,100,100 --ungapped --queryhsplimit=keep,nowarn:1500 --querydepth=keep,nowarn:13 --format=general:name1,zstart1,end1,name2,zstart2+,end2+ --markend"
[2024-05-20T21:24:04+0800] [MainThread] [I] [toil-rt] 2024-05-20 21:24:04.348083: Successfully ran: "lastz BBY_40.fa[multiple][nameparse=darkspace] BBY_40_frag[nameparse=darkspace] --step=3 --ambiguous=iupac,100,100 --ungapped --queryhsplimit=keep,nowarn:1500 --querydepth=keep,nowarn:13 --format=general:name1,zstart1,end1,name2,zstart2+,end2+ --markend" in 1841.1576 seconds with job-memory 238.5 Gi
[2024-05-20T21:24:04+0800] [MainThread] [I] [toil-rt] 2024-05-20 21:24:04.349489: Running the command: "cactus_covered_intervals --origin=one M=20 --queryoffsets"
[2024-05-20T21:24:34+0800] [MainThread] [I] [toil-rt] 2024-05-20 21:24:34.895003: Successfully ran: "cactus_covered_intervals --origin=one M=20 --queryoffsets" in 30.4679 seconds with job-memory 238.5 Gi
[2024-05-20T21:24:34+0800] [MainThread] [I] [toil-rt] 2024-05-20 21:24:34.896446: Running the command: "cactus_fasta_softmask_intervals.py --origin=one BBY_40.maskinfo"
[2024-05-20T21:24:35+0800] [MainThread] [I] [toil-rt] 2024-05-20 21:24:35.342890: Successfully ran: "cactus_fasta_softmask_intervals.py --origin=one BBY_40.maskinfo" in 0.4209 seconds
[2024-05-20T21:24:37+0800] [MainThread] [I] [toil.worker] Redirecting logging to /public3/home/wangzn/yuanhao/penguin/WGA-cactus/penguin.24May20/tmp/6d3e00adf7cb5a3f888b6eb730b786b6/f8bf/worker_log.txt
[2024-05-20T21:24:38+0800] [MainThread] [I] [toil-rt] 2024-05-20 21:24:38.140460: Running the command: "cactus_fasta_fragments.py --fragment=200 --step=100 --origin=zero"
[2024-05-20T21:24:38+0800] [MainThread] [I] [toil-rt] 2024-05-20 21:24:38.685857: Successfully ran: "cactus_fasta_fragments.py --fragment=200 --step=100 --origin=zero" in 0.5327 seconds
[2024-05-20T21:24:39+0800] [MainThread] [I] [toil-rt] 2024-05-20 21:24:39.505794: Running the command: "lastz BBY_106.fa[multiple][nameparse=darkspace] BBY_106_frag[nameparse=darkspace] --step=3 --ambiguous=iupac,100,100 --ungapped --queryhsplimit=keep,nowarn:1500 --querydepth=keep,nowarn:13 --format=general:name1,zstart1,end1,name2,zstart2+,end2+ --markend"
[2024-05-20T21:43:22+0800] [MainThread] [I] [toil.leader] 1 jobs are running, 138 jobs are issued and waiting to run
[2024-05-20T22:26:06+0800] [MainThread] [I] [toil-rt] 2024-05-20 22:26:06.338959: Successfully ran: "lastz BBY_106.fa[multiple][nameparse=darkspace] BBY_106_frag[nameparse=darkspace] --step=3 --ambiguous=iupac,100,100 --ungapped --queryhsplimit=keep,nowarn:1500 --querydepth=keep,nowarn:13 --format=general:name1,zstart1,end1,name2,zstart2+,end2+ --markend" in 3686.8202 seconds with job-memory 238.5 Gi
[2024-05-20T22:26:06+0800] [MainThread] [I] [toil-rt] 2024-05-20 22:26:06.340422: Running the command: "cactus_covered_intervals --origin=one M=20 --queryoffsets"
[2024-05-20T22:27:19+0800] [MainThread] [I] [toil-rt] 2024-05-20 22:27:19.503612: Successfully ran: "cactus_covered_intervals --origin=one M=20 --queryoffsets" in 73.1517 seconds with job-memory 238.5 Gi
[2024-05-20T22:27:19+0800] [MainThread] [I] [toil-rt] 2024-05-20 22:27:19.505672: Running the command: "cactus_fasta_softmask_intervals.py --origin=one BBY_106.maskinfo"
[2024-05-20T22:27:20+0800] [MainThread] [I] [toil-rt] 2024-05-20 22:27:20.168411: Successfully ran: "cactus_fasta_softmask_intervals.py --origin=one BBY_106.maskinfo" in 0.6521 seconds
[2024-05-20T22:27:23+0800] [MainThread] [I] [toil.worker] Redirecting logging to /public3/home/wangzn/yuanhao/penguin/WGA-cactus/penguin.24May20/tmp/6d3e00adf7cb5a3f888b6eb730b786b6/fe05/worker_log.txt
[2024-05-20T22:27:23+0800] [MainThread] [I] [toil-rt] 2024-05-20 22:27:23.778643: Running the command: "cactus_fasta_fragments.py --fragment=200 --step=100 --origin=zero"
[2024-05-20T22:27:24+0800] [MainThread] [I] [toil-rt] 2024-05-20 22:27:24.137882: Successfully ran: "cactus_fasta_fragments.py --fragment=200 --step=100 --origin=zero" in 0.3492 seconds
[2024-05-20T22:27:25+0800] [MainThread] [I] [toil-rt] 2024-05-20 22:27:25.502423: Running the command: "lastz BBY_75.fa[multiple][nameparse=darkspace] BBY_75_frag[nameparse=darkspace] --step=3 --ambiguous=iupac,100,100 --ungapped --queryhsplimit=keep,nowarn:1500 --querydepth=keep,nowarn:13 --format=general:name1,zstart1,end1,name2,zstart2+,end2+ --markend"
[2024-05-20T22:43:24+0800] [MainThread] [I] [toil.leader] 1 jobs are running, 137 jobs are issued and waiting to run

It seems that only one job is running at a time, even though I set the lastz parameters. I am running Cactus on a PBS cluster, and I could not find any other parameters that control this.
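For anyone who wants to check the same thing on their own run, here is a rough sketch of how the log above could be summarized. It is not part of Cactus or Toil. The function name `summarize_toil_log` and the two regular expressions are hypothetical and assume only the line formats visible in the log excerpt above ("N jobs are running, M jobs are issued" and 'Successfully ran: "..." in S seconds'). The sample command in the demo is abbreviated for readability.

```python
import re

# Assumed format of the Toil leader status line, as seen in the log above:
# "[toil.leader] 1 jobs are running, 138 jobs are issued and waiting to run"
LEADER_RE = re.compile(r"\[toil\.leader\] (\d+) jobs are running, (\d+) jobs are issued")

# Assumed format of a completed command, as seen in the log above:
# 'Successfully ran: "lastz ..." in 1841.1576 seconds'
# Group 1 is the first word of the command (the tool name), group 2 the seconds.
RAN_RE = re.compile(r'Successfully ran: "([^\s"]+)[^"]*" in ([\d.]+) seconds')


def summarize_toil_log(text):
    """Return (leader_status, seconds_per_tool) extracted from Toil log text.

    leader_status is a list of (running, waiting) tuples, one per leader line.
    seconds_per_tool maps each tool name to its total reported wall time.
    """
    leader_status = [(int(r), int(w)) for r, w in LEADER_RE.findall(text)]
    seconds_per_tool = {}
    for tool, secs in RAN_RE.findall(text):
        seconds_per_tool[tool] = seconds_per_tool.get(tool, 0.0) + float(secs)
    return leader_status, seconds_per_tool


if __name__ == "__main__":
    # Two lines in the format of the log above; the lastz command is shortened.
    sample = (
        '[2024-05-20T21:24:04+0800] [MainThread] [I] [toil-rt] '
        '2024-05-20 21:24:04.348083: Successfully ran: '
        '"lastz BBY_40.fa[multiple][nameparse=darkspace] --markend" '
        'in 1841.1576 seconds with job-memory 238.5 Gi\n'
        '[2024-05-20T21:43:22+0800] [MainThread] [I] [toil.leader] '
        '1 jobs are running, 138 jobs are issued and waiting to run\n'
    )
    status, seconds = summarize_toil_log(sample)
    print(status)   # (running, waiting) pairs reported by the leader
    print(seconds)  # total seconds per tool
```

If every leader entry reports 1 running job while many are waiting, as in the excerpt above, the jobs are being executed one at a time.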

Thanks in advance for your help.

Best regards, H.Y.

glennhickey commented 1 month ago

You can resolve this issue by updating to the latest Cactus release.

yuanhao0626 commented 1 month ago

The latest release (v2.8.2) works as expected. Thank you for your kind reply, @glennhickey!