jmborr / idpflex

Analysis of intrinsically disordered proteins by comparing MD simulations to Small Angle Scattering experiments
http://idpflex.readthedocs.io/en/latest/
MIT License

parallelize calls to an external executable #39

Open jmborr opened 6 years ago

jmborr commented 6 years ago

Some property classes, such as SecondaryStructureProperty and SaxsProperty, can be instantiated by running an external executable through a method call. For instance:

saxs_prop = SaxsProperty().from_crysol_pdb(pdb_file)

will create the X-ray property saxs_prop by running the executable crysol on the PDB file pdb_file. Analogously,

ss_prop = SecondaryStructureProperty().from_dssp_pdb(pdb_file)

will create the secondary structure property ss_prop by running the executable dssp on the PDB file pdb_file.

The task is to create a utility function that, given (1) a property class, (2) one of its methods, and (3) a list of files, creates one property for each file, running the calls in parallel.

For instance,

import multiprocessing
from tqdm import tqdm
with multiprocessing.Pool(processes=multiprocessing.cpu_count()) as pool:
    profiles = list(tqdm(pool.imap(SaxsProperty().from_crysol_pdb, pdb_names),
                         total=len(pdb_names)))

will create the list profiles of SAXS properties, one per PDB file, using all available cores.

The utility function should be implemented in the file utils.py.
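
A minimal sketch of what such a utility could look like, under a few assumptions. The names properties_in_parallel, _create_property and DemoProperty are hypothetical and not part of idpflex. Passing the method as a string (for example "from_crysol_pdb") is only one possible design. The sketch also assumes, as the snippet above already does, that the method returns the property and that the property can be pickled. It leaves out the tqdm progress bar so that it needs only the standard library.

```python
import multiprocessing


def _create_property(task):
    """Worker: build one property from one file.

    Defined at module level so that multiprocessing can pickle it.
    """
    property_class, method_name, file_name = task
    # Instantiate an empty property, then call e.g. from_crysol_pdb(file_name)
    return getattr(property_class(), method_name)(file_name)


def properties_in_parallel(property_class, method_name, file_names,
                           processes=None):
    """Return one property per file, in the same order as file_names.

    processes=None lets multiprocessing.Pool choose the number of workers
    from the CPU count.
    """
    tasks = [(property_class, method_name, name) for name in file_names]
    with multiprocessing.Pool(processes=processes) as pool:
        return pool.map(_create_property, tasks)


class DemoProperty:
    """Hypothetical stand-in for SaxsProperty, used only to try the sketch."""

    def __init__(self):
        self.source = None

    def from_file(self, file_name):
        # A real method would run crysol or dssp on file_name here
        self.source = file_name
        return self


if __name__ == "__main__":
    props = properties_in_parallel(DemoProperty, "from_file",
                                   ["a.pdb", "b.pdb"], processes=2)
    print([p.source for p in props])
```

With the real classes, the call would presumably read properties_in_parallel(SaxsProperty, "from_crysol_pdb", pdb_names). Worker processes are used here rather than threads because it is not known from this issue whether the from_* methods are thread-safe.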