Add clustering and padding

This PR adds two steps:

1. Group the jobs into clusters based on the number of residues and the number of MSAs, so that the jobs in a cluster can be padded.
2. Model each group of jobs together.

Combining the two steps into one workflow file poses a challenge for Snakemake, because the exact number and names of the files are unknown a priori. I therefore used a checkpoint to regulate the workflow dynamically. However, using a checkpoint means that all job clusters run on one GPU, one cluster after another.
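The grouping step can be pictured with a minimal Python sketch. It is not the code in this PR: the function name `cluster_jobs`, the bin widths `residue_bin` and `msa_bin`, and the job names are hypothetical, and binning by rounded-up size is just one plausible way to cluster. Each cluster records the largest residue count and MSA count among its members, which is the size the other members would be padded to.

```python
from collections import defaultdict


def cluster_jobs(jobs, residue_bin=100, msa_bin=512):
    """Group jobs whose sizes fall into the same (residue, MSA) bin.

    jobs maps a job name to (n_residues, n_msas). Bin widths are
    illustrative assumptions, not values taken from the PR.
    """
    members = defaultdict(list)
    for name, (n_res, n_msa) in jobs.items():
        # Ceiling division: jobs that round up to the same bin share a cluster.
        key = (-(-n_res // residue_bin), -(-n_msa // msa_bin))
        members[key].append(name)

    clusters = {}
    for key, names in members.items():
        clusters[key] = {
            "jobs": sorted(names),
            # Pad every member up to the largest member of the cluster.
            "pad_residues": max(jobs[n][0] for n in names),
            "pad_msas": max(jobs[n][1] for n in names),
        }
    return clusters


# Hypothetical jobs: name -> (number of residues, number of MSAs)
jobs = {"A_B": (250, 1000), "A_C": (290, 900), "D_E": (820, 4000)}
clusters = cluster_jobs(jobs)
```

The number of clusters and their keys depend on the input jobs, which is why the names of the per-cluster files cannot be known before the grouping step has run.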

This is why I developed an alternative version in PR #13, in which the two steps are split. In that version, one has to run group_jobs.smk first and then group_jobs_and_predict.smk. In return, each job cluster in PR #13 is dispatched to one GPU, and the clusters run in parallel.
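The difference between the two versions can be illustrated with a small Python sketch. It is only an illustration of the scheduling outcome, not how Snakemake or either PR assigns work: `assign_clusters_to_gpus`, the cluster names, and the round-robin rule are all hypothetical.

```python
def assign_clusters_to_gpus(cluster_ids, n_gpus):
    """Distribute cluster ids over GPU indices in round-robin order."""
    assignment = {gpu: [] for gpu in range(n_gpus)}
    for i, cluster_id in enumerate(cluster_ids):
        assignment[i % n_gpus].append(cluster_id)
    return assignment


cluster_ids = ["cluster_0", "cluster_1", "cluster_2"]

# Checkpoint version: every cluster queues up on the same GPU.
sequential = assign_clusters_to_gpus(cluster_ids, n_gpus=1)

# Split version (PR #13): one cluster per GPU, so they can run at the same time.
parallel = assign_clusters_to_gpus(cluster_ids, n_gpus=3)
```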

KosinskiLab / AlphaPulldownSnakemake

Add clustering and padding #12