BSDExabio / PSP

protein structure prediction

Develop framework for distributed, parallel AlphaFold processes #5

Closed markcoletti closed 3 years ago

markcoletti commented 3 years ago

Implement a framework that uses dask to manage distributing the AlphaFold work. This is in contrast to Mu's work, which uses MPI to manage distributed loads.

markcoletti commented 3 years ago

Yesterday I got a simple distributed dask framework working locally and on Summit. It essentially takes a text file of proteins and distributes them to dask workers.
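
For anyone following along, here is a minimal sketch of that pattern, not the actual code from this repo. It assumes the text file holds one protein identifier per line. The names `read_proteins`, `process_protein`, and `run` are hypothetical, and `process_protein` is only a stand-in for the real per-protein work.

```python
import sys
from pathlib import Path


def read_proteins(path):
    """Return one protein identifier per non-blank line of a text file."""
    lines = Path(path).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


def process_protein(protein):
    """Hypothetical stand-in for the real per-protein work."""
    return f"processed {protein}"


def run(protein_file, scheduler_file=None):
    """Submit one task per protein to dask workers and collect the results."""
    # Imported here so the helpers above can be used without dask installed.
    from dask.distributed import Client

    # With a scheduler file, connect to an already running scheduler
    # (for example one started inside a batch job); otherwise start a
    # local cluster on this machine.
    client = Client(scheduler_file=scheduler_file) if scheduler_file else Client()
    try:
        futures = client.map(process_protein, read_proteins(protein_file))
        return client.gather(futures)
    finally:
        client.close()


if __name__ == "__main__" and len(sys.argv) > 1:
    for result in run(sys.argv[1]):
        print(result)
```

Running locally needs only the protein file as an argument. On a cluster, the same `run` function could be pointed at a scheduler file instead, assuming the scheduler and workers were started separately.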

markcoletti commented 3 years ago

So, today was focused on final integration with AlphaFold. First up: just getting the example to run for me. After some back and forth, I got that to work.

markcoletti commented 3 years ago

So, next is having a dask worker call alphafold for a single protein. I think for now I'll go with the inelegant solution of calling the existing run_alphafold_stage2a.py script but with a single protein to see if I can get that to work. We can worry about cleaning it up later.
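
A rough sketch of what that inelegant approach might look like: a function that a dask worker could run, which shells out to the script for one protein. How `run_alphafold_stage2a.py` takes its input is not stated in this thread, so the sketch does not guess at any flags. Instead the caller passes an argument list containing a `{protein}` placeholder. The names `build_command`, `run_single_protein`, and `arg_template` are all hypothetical.

```python
import subprocess
import sys


def build_command(script, protein, arg_template):
    """Fill the '{protein}' placeholder in a caller-supplied argument list."""
    # arg_template is an assumption: the script's real arguments are
    # not given here, so the caller must supply them.
    args = [arg.replace("{protein}", protein) for arg in arg_template]
    return [sys.executable, script, *args]


def run_single_protein(protein, script, arg_template):
    """Run the script for one protein and return (protein, exit code)."""
    completed = subprocess.run(
        build_command(script, protein, arg_template),
        capture_output=True,  # keep stdout/stderr out of the worker's output
        text=True,
    )
    return protein, completed.returncode
```

With a dask `Client`, something like `client.map(run_single_protein, proteins, script=..., arg_template=...)` should fan this out, since `Client.map` forwards extra keyword arguments to the mapped function. Returning the exit code rather than raising keeps one failed protein from hiding the results for the rest.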

markcoletti commented 3 years ago

Job 1381691 output is here: /gpfs/alpine/bip198/scratch/mcoletti/runs/issue-5/1381691. This was just for two proteins, which was enough to confirm that this works.