hillerlab / make_lastz_chains

Portable solution to generate genome alignment chains using lastz
MIT License

Speed up task processing #59

Open molinfzl opened 5 months ago

molinfzl commented 5 months ago

Hi, I'm hoping you could provide some guidance or suggestions.

My command: ./make_chains.py hg38 Ea /data/groups/home/make_lastz_chains/genome/hg38.2bit /data/groups/home/make_lastz_chains/genome/Ea.2bit --pd /data/groups/public/home/Chains/Ea -f --chaining_memory 16

I use --executor local instead of --executor slurm. Is there any parameter that would speed it up? It is too slow.

Current output:

1.
N E X T F L O W ~ version 21.10.6
Launching /data/groups/g1600002/home/make_lastz_chains/parallelization/execute_joblist.nf [intergalactic_khorana] - revision: 0483b29723
executor > local (5)
[5f/78dfbd] process > execute_jobs (2) [ 0%] 4 of 105150

2.
N E X T F L O W ~ version 21.10.6
Launching /home/make_lastz_chain/make_lastz_chains/parallelization/execute_joblist.nf [wise_austin] - revision: 0483b29723
executor > local (207)
[78/011ee5] process > execute_jobs (223) [ 9%] 196 of 2010

In addition, I noticed in previous Q&As that there was a parameter, "--executor_queuesize 8", that can make the run multi-threaded, but it no longer seems to exist in the current version. Has it been replaced by "--cluster_queue 8"? At my current speed, about 5% of the jobs finish per day. Do you have any suggestions for this? Thanks a lot.
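To put the numbers above in perspective, here is a minimal Python sketch that reads the "done of total" counts from a Nextflow progress line like the ones quoted and estimates the remaining run time. The helper names `parse_progress` and `days_remaining` are hypothetical and not part of make_lastz_chains or Nextflow. The estimate assumes the rate stays constant at the roughly 5% of all jobs per day reported here, which may not hold if job sizes vary.

```python
import re

# Matches the "[ 9%] 196 of 2010" part of a Nextflow progress line.
PROGRESS_RE = re.compile(r"\[\s*(\d+)%\]\s+(\d+)\s+of\s+(\d+)")


def parse_progress(line):
    """Return (done, total) from a Nextflow progress line, or None if absent."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    return int(match.group(2)), int(match.group(3))


def days_remaining(done, total, fraction_per_day):
    """Estimate days left, assuming a constant fraction of all jobs finishes per day."""
    if fraction_per_day <= 0:
        raise ValueError("fraction_per_day must be positive")
    return (total - done) / total / fraction_per_day


# Progress line copied from the second run quoted above.
line = "[78/011ee5] process > execute_jobs (223) [ 9%] 196 of 2010"
done, total = parse_progress(line)

# Assumption: 5% of all jobs finish per day, as reported in this comment.
print(f"{done} of {total} jobs done, "
      f"about {days_remaining(done, total, 0.05):.0f} days remaining")
```

For the second run this gives roughly 18 more days at the reported rate.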

I later learned that MAF files can be generated with lastal, and that chain files can then be produced from them in two steps, using mafToAxt and axtChain. Would such chains be suitable as input for TOGA? This would greatly improve efficiency.

Thank you for taking the time to read this! I'm looking forward to hearing your thoughts on this matter.

laristide commented 1 month ago

Hi, I'm interested in this question too. If I have a MAF file with an alignment, can I somehow use it to create the chains and save some computation? The final aim is to run TOGA.

Thank you very much, Leandro

MichaelHiller commented 1 month ago

Hi laristide,

let me ping @kirilenkobm who can hopefully help with the nextflow issue.

Wrt maf to chain: I wouldn't recommend this, as the result will not be close to the chains this pipeline produces. We have a supplementary figure showing that chains produced with less sensitive parameters and without RepeatFiller + chainCleaner produce worse results with TOGA.

laristide commented 1 month ago

Thank you! I'll check those supplementary figures.