ComparativeGenomicsToolkit / cactus

Official home of genome aligner based upon notion of Cactus graphs

cactus-hal2chains taking a long time #1388

Closed by lpnunez 1 month ago

lpnunez commented 1 month ago

Hello, I am trying to extract chain files from a HAL of 60 genomes (65 GB) using cactus-hal2chains. I am running this in SLURM through a Singularity container with cactus_v2.6.11-gpu. These jobs are taking an incredibly long time to finish, anywhere from 3 to 12 hours each. When I ran the same code on a smaller test HAL of 4 genomes, I was able to generate the chain files in less than 30 minutes, so I am not sure why it takes hours to generate a single chain file, even with this larger HAL file. I am in a time crunch, so I was wondering if there is a more efficient way to run this that I am not seeing? Here is the code that I am running for these jobs:

cactus-hal2chains ./"js_${genome}" ${INPUTHAL} ${OUTPUTDIR} --refGenome ${ref} --targetGenomes ${genome} --defaultCores 20
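For reference, the per-genome pattern above (one job per target genome, all against the same reference) can be sketched as a small helper that only prints the commands. This is a minimal sketch, not something from the Cactus docs: the function name `print_hal2chains_cmds` and the file and genome names in the usage line are hypothetical, and it uses only the flags that already appear in the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) one cactus-hal2chains command
# per target genome. Arguments: HAL file, output dir, reference genome,
# then the list of genomes. Only flags from the command above are used.
print_hal2chains_cmds() {
    local hal="$1" outdir="$2" ref="$3"
    shift 3
    local genome
    for genome in "$@"; do
        # The reference is not chained against itself, so skip it.
        if [ "$genome" = "$ref" ]; then
            continue
        fi
        # Each target gets its own jobstore directory (js_<genome>).
        printf 'cactus-hal2chains ./js_%s %s %s --refGenome %s --targetGenomes %s --defaultCores 20\n' \
            "$genome" "$hal" "$outdir" "$ref" "$genome"
    done
}
```

Calling it as `print_hal2chains_cmds in.hal out refA refA g1 g2` (placeholder names) prints two lines, one for `g1` and one for `g2`. Each printed line is an independent job that could be pasted into its own SLURM submission.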

glennhickey commented 1 month ago

Yeah, it's pretty slow, unfortunately, and it only gets slower the more genomes you have in your HAL file.