KosinskiLab / AlphaPulldownSnakemake

GNU General Public License v3.0

Add clustering and padding #12

Open dingquanyu opened 3 months ago

dingquanyu commented 3 months ago
  1. First, group the jobs into clusters based on the number of residues and the number of MSAs, so that each cluster can be padded to a common size.
  2. Model each group of jobs together.

Combining the 2 steps into one workflow file poses a challenge to snakemake, as the exact number and the names of the files are unknown a priori. Thus I used a checkpoint to dynamically regulate the workflow. However, using a checkpoint means all job clusters will be run on one GPU, one cluster after another.
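As a rough illustration of the grouping step, here is a minimal Python sketch. It is not the clustering code from this PR: the function names, the bucket sizes (`residue_step`, `msa_step`), and the job names are all hypothetical. It assumes each job is described by its residue count and MSA count, and that jobs whose counts round up to the same padded size can share a cluster.

```python
from collections import defaultdict
from math import ceil


def round_up(value, step):
    """Round value up to the nearest multiple of step (the padded size)."""
    return ceil(value / step) * step


def group_jobs(jobs, residue_step=256, msa_step=512):
    """Group jobs that share the same padded (residues, MSAs) shape.

    jobs maps a job name to a (num_residues, num_msas) tuple.
    The step sizes are illustrative assumptions, not values from the PR.
    """
    clusters = defaultdict(list)
    for name, (n_res, n_msa) in jobs.items():
        key = (round_up(n_res, residue_step), round_up(n_msa, msa_step))
        clusters[key].append(name)
    return dict(clusters)


# Hypothetical jobs: name -> (number of residues, number of MSAs)
jobs = {
    "A_and_B": (480, 900),
    "A_and_C": (500, 1000),
    "D_and_E": (1200, 3000),
}
clusters = group_jobs(jobs)
for shape, names in clusters.items():
    print(shape, names)
```

Because the number of clusters depends on the input jobs, the set of per-cluster output files is only known after this step has run, which is the reason a checkpoint is needed when both steps live in one workflow file.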

This is why I developed an alternative version, in PR #13, in which the 2 steps are split. In that version, one has to run group_jobs.smk first and then group_jobs_and_predict.smk. In return, each job cluster in PR #13 will be dispatched to one GPU, and the clusters run in parallel.