Closed lpnunez closed 1 month ago
Yeah, it's pretty slow unfortunately, and it only gets slower the more genomes you have in your HAL file. Dropping `--defaultCores 20` might help speed things up: each job is single-threaded, so by assigning 20 cores to each you could be wasting 95% of your CPUs. The `--inMemory` option will speed up each job, but is probably not worth it unless you have tons of memory per core on your system.
Hello, I am trying to extract chain files from a HAL of 60 genomes (65 GB) using cactus-hal2chains. I am running this under SLURM through a Singularity container with cactus_v2.6.11-gpu. As I run these jobs, I notice they are taking an incredibly long time to finish, sometimes between 3 and 12 hours. When I ran this code with a smaller test HAL of 4 genomes, I was able to generate the chain files in less than 30 minutes, so I am not sure why it takes hours to generate a single chain file with this larger HAL. I am in a time crunch, so I was wondering if there is a way to speed up this process in a more efficient manner that I am not seeing. Here is the code that I am running for these jobs:
cactus-hal2chains ./"js_${genome}" ${INPUTHAL} ${OUTPUTDIR} --refGenome ${ref} --targetGenomes ${genome} --defaultCores 20
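Not part of the thread itself, but one way to act on the single-threaded point above is to submit one single-core job per target genome instead of reserving 20 cores for each. The sketch below is a dry run: it only prints the `sbatch` commands it would submit, and the genome names, paths, and reference name are placeholders, not values from the original post.

```shell
#!/bin/sh
# Hypothetical dry-run sketch: all values below are placeholders.
INPUTHAL=alignment.hal   # placeholder HAL path
OUTPUTDIR=chains         # placeholder output directory
ref=refGenome            # placeholder reference genome name

submit_chain_jobs() {
  for genome in "$@"; do
    # --defaultCores is dropped and --cpus-per-task=1 used instead:
    # each hal2chains job is single-threaded, so reserving 20 cores
    # per job would leave 19 of them idle.
    echo sbatch --cpus-per-task=1 --wrap \
      "\"cactus-hal2chains ./js_${genome} ${INPUTHAL} ${OUTPUTDIR} --refGenome ${ref} --targetGenomes ${genome}\""
  done
}

# Replace echo with a real sbatch call once the printed commands look right.
submit_chain_jobs genomeA genomeB genomeC
```

Per-genome jobs like this run concurrently under SLURM, so total wall time is bounded by the slowest single chain rather than the sum of all of them.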