Harmonisation pipeline is running slow

jiyue1214 commented 2 months ago

Based on the previous test, the harmonisation pipeline can finish 400 studies per day. However, when we ran the pipeline using the spotbot account, 400 studies took more than 1 week and the run was killed by the walltime limit.

This ticket aims to investigate why the actual speed is so far below what we expected, and to identify potential ways to improve it.
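To put a number on the gap, here is a rough back-of-the-envelope sketch using only the figures quoted above. The 7-day value is a lower bound (the run took "more than 1 week" and was killed before finishing), so the real slowdown is at least what this prints. The variable and function names are made up for illustration.

```python
# Figures quoted in this ticket. The observed duration is a lower bound,
# because the run was killed by the walltime limit before it finished.
STUDIES = 400
EXPECTED_DAYS = 1.0      # previous test: 400 studies per day
OBSERVED_DAYS_MIN = 7.0  # spotbot run: more than 1 week


def throughput(studies: int, days: float) -> float:
    """Return the number of studies completed per day."""
    return studies / days


expected = throughput(STUDIES, EXPECTED_DAYS)
observed_max = throughput(STUDIES, OBSERVED_DAYS_MIN)  # upper bound on real throughput
slowdown_min = expected / observed_max                 # lower bound on real slowdown

print(f"expected: {expected:.0f} studies/day")
print(f"observed: under {observed_max:.1f} studies/day")
print(f"slowdown: at least {slowdown_min:.0f}x")
```

The same function can be reused for each test run once its total time is known, so the runs can be compared on one scale.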

To-do list:

Release the Slurm pipeline
Select 400 studies that have already been harmonised (so we can be confident that all of the data will run through the pipeline)
Run the Slurm pipeline for 400 studies using the spotbot account
Run the Slurm pipeline for 400 studies using yue's account
Run the Slurm pipeline for 400 studies using the spotbot account with --exclusive (do not share the node with other users, and run all jobs on that node)
Use Nextflow Tower to monitor these three runs

Last: Summarise the test results and plan a meeting with TSC to get suggestions on improving the speed.

jiyue1214 commented 2 months ago

I ran 400 studies through harmonisation using the gwas_lsf account on Slurm, monitored with nf.tower (workid: 4p1vpx9kkebn9q).
I messaged ITSC for suggestions on how to allocate jobs (#REQ0019119).
I suggest putting this ticket in the icebox and reporting to ITSC again once we have more data on the total time spent on Slurm.

jiyue1214 commented 2 months ago

Strategy 1 (current strategy): Using the gwas_lsf account, I can harmonise 400 studies in 2 days.

Strategy 2: Reserve 48-96 CPUs and memory to run the whole job as a local job (queueing then depends only on the requested resources). I reserved 48 CPUs and 300G, and it is still running.
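For reference, a minimal sketch of how a single-allocation submission like the one described above could be assembled. This is not the exact command that was run: `build_sbatch_command` is a hypothetical helper, `<pipeline>` is a placeholder for the real pipeline name, and the sketch assumes the Nextflow config uses the local executor inside the allocation. The `sbatch` options used (`--cpus-per-task`, `--mem`, `--exclusive`, `--wrap`) and the Nextflow `-with-tower` option are standard.

```python
import shlex


def build_sbatch_command(pipeline_cmd, cpus=48, mem="300G", exclusive=False):
    """Build an sbatch argument list that reserves resources for one big job.

    pipeline_cmd is the command to run inside the allocation (hypothetical here).
    """
    args = ["sbatch", f"--cpus-per-task={cpus}", f"--mem={mem}"]
    if exclusive:
        # Ask Slurm not to share the allocated node with other jobs.
        args.append("--exclusive")
    # --wrap lets sbatch submit a command string without a separate script file.
    args.append(f"--wrap={pipeline_cmd}")
    return args


# "<pipeline>" is a placeholder, not a real pipeline name.
cmd = build_sbatch_command("nextflow run <pipeline> -with-tower")
print(shlex.join(cmd))  # shell-quoted form, safe to paste into a terminal
```

Passing `exclusive=True` gives the variant from the to-do list where the node is not shared with other users.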

EBISPOT / goci

Harmonisation pipeline is running slow #1297