prody / ProDy

A Python Package for Protein Dynamics Analysis
http://prody.csb.pitt.edu

Segmentation fault on ANM calculations on large systems, despite tons of available RAM #220

Closed rmendez closed 4 months ago

rmendez commented 9 years ago

Dear all,

I am trying to perform ANM calculations on a tetrameric membrane protein (PDB ID 3J8H, chains A, C, E, G) using only CA atoms, which amount to a total of 14,640 atoms. At our university we have a computer cluster with up to 1 TB of RAM, so for testing purposes I am reserving up to 800 GB, but the job crashes with a segmentation fault (and core dump). Here is the script:


from prody import *
import matplotlib as mpl
mpl.use('Agg')
import matplotlib.pylab as plab
import matplotlib.pyplot as plt
import numpy as np

RyR_closed = parsePDB('3J8H_ACEG.pdb')
RyR_closed = RyR_closed.select('protein and name CA')

# compute ANM modes
anm = ANM('RyR closed ANM analysis')
anm.buildHessian(RyR_closed)
anm.calcModes()

# write modes to be represented by VMD
# (to be read by the parseNMD subroutine)
writeNMD('RyR_closed.ANM.nmd', anm, RyR_closed)
quit()


And here is the error:

@> 107796 atoms and 1 coordinate set(s) were parsed in 1.23s.
@> Hessian was built in 65.85s.
/netscr/EX/lsf/killdevil/lsbatch/1420483745.214510: line 8: 12861 Segmentation fault (core dumped) ./prody_anm.CA.py

However, I know that the maximum memory used was only about 29 GB:

Exited with exit code 139.

Resource usage summary:

CPU time :                                   5526.03 sec.
Max Memory :                                 29 GB
Average Memory :                             28.32 GB
Total Requested Memory :                     800.00 GB
Delta Memory :                               771.00 GB
Max Swap :                                   31 GB
Max Processes :                              3
Max Threads :                                12
Run time :                                   4616 sec.
Turnaround time :                            4683 sec.

I see two possible problems (not being a Python programmer):

1) Python's memory allocation for arrays is limited to a size below what the diagonalization step needs (the Hessian computation itself finishes extremely fast).

2) Or perhaps it is known that the LAPACK routines used as the diagonalizer (the compiled C/Fortran ones) cannot handle arrays of this size.

Can anyone help me perform ANM on large systems?

Thank you so much,

Raul

cihankayacihan commented 8 years ago

LAPACK handles this through SciPy, so I guess the problem is in the diagonalizer. We currently have an effort to build a GPU-based eigenvalue decomposition for large molecules using MAGMA. It will be available with version 1.9.

jamesmkrieger commented 5 years ago

This is still in progress. There's a GPU implementation pretty much complete but not integrated yet.

jamesmkrieger commented 3 years ago

You could maybe try using Dask. This package handles much larger arrays and splits the computations into parallel units. It has a singular value decomposition (https://docs.dask.org/en/latest/array-api.html?highlight=linalg#dask.array.linalg.svd) that you could perhaps use for the matrix decomposition. I have also thought about incorporating it, but never got a chance.
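As an illustration only (not ProDy code), here is a minimal sketch of what the Dask route could look like, using a small random symmetric matrix as a stand-in for the Hessian. Note that ANM modes correspond to the smallest nonzero eigenvalues, so this only demonstrates the Dask decomposition API, not a full ANM workflow:

import numpy as np
import dask.array as da

# Stand-in for a Hessian: a small random symmetric matrix.
# (A real Hessian for the system in this issue would be ~44k x 44k.)
n = 3000
m = np.random.rand(n, n)
m = m + m.T

# Chunk along rows only; Dask's SVD needs the array chunked in a single dimension.
h = da.from_array(m, chunks=(500, n))

# Lazy singular value decomposition; nothing is computed until .compute() is called.
u, s, v = da.linalg.svd(h)
print(s.compute()[:5])   # largest singular values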

Alternatively, you could try coarse-graining further. One option could be to take every second or third residue, or even larger gaps (see https://pubmed.ncbi.nlm.nih.gov/11913377/ where this seems to work up to 1/40 of the residues using a shifting scheme), or to take the average position of every few residues. Another option is more rigorous Markov methods for hierarchical coarse-graining (https://pubmed.ncbi.nlm.nih.gov/17691893/), although this might be more challenging.
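As a rough sketch of the every-nth-residue idea (using the file name from the original post, and assuming buildHessian is given a plain coordinate array; the cutoff and number of modes below are placeholder choices, not recommendations):

from prody import parsePDB, ANM

structure = parsePDB('3J8H_ACEG.pdb')
calphas = structure.select('protein and name CA')

# Keep every third CA position; buildHessian can work directly on a coordinate array.
coords = calphas.getCoords()[::3]

anm = ANM('coarse-grained ANM')
anm.buildHessian(coords, cutoff=20)   # a larger cutoff may be needed when beads are farther apart
anm.calcModes(20)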

Another alternative could be to use our new interface CryoDy to obtain a coarse-grained elastic network directly from a cryo-EM map (see http://prody.csb.pitt.edu/tutorials/cryoem_tutorial/).
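For reference, a very rough sketch of that route based on the parseEMD function used in the cryo-EM tutorial; the map file name and all parameter values below are placeholders, so please follow the tutorial for the actual settings:

from prody import parseEMD, ANM

# Placeholder map file and parameters; parseEMD builds a pseudoatom (bead) model
# from the density map, which can then be treated like any other atom group.
beads = parseEMD('emd_xxxx.map', cutoff=1.2, n_nodes=3000, num_iter=200)

anm = ANM('map-based ANM')
anm.buildHessian(beads, cutoff=15)
anm.calcModes(20)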

Best wishes,
James

jamesmkrieger commented 1 year ago

I was just trying this yesterday. I think the best option is to use buildHessian with sparse=True and calcModes with turbo=False. It may also help to limit the number of CPUs.

What I did was the following:

In [1]: from prody import *

In [2]: two_cct = parsePDB("4v8r")
@> Connecting wwPDB FTP server RCSB PDB (USA).
@> pdb4v8r.ent.gz download failed. 4v8r does not exist on ftp.wwpdb.org.
@> PDB download via FTP completed (0 downloaded, 1 failed).
@> Downloading PDB files via FTP failed, trying HTTP.
@> WARNING 4v8r download failed ('https://files.rcsb.org/download/4V8R.pdb.gz' could not be opened for reading, invalid URL or no internet connection).
@> PDB download via HTTP completed (0 downloaded, 1 failed).
@> WARNING Trying to parse mmCIF file instead
@> WARNING Could not find _pdbx_branch_scheme in lines.
@> WARNING Could not find _pdbx_struct_mod_residue in lines.
@> WARNING Could not find _atom_site_anisotrop in lines.
@> WARNING No anisotropic B factors found
@> 128780 atoms and 1 coordinate set(s) were parsed in 2.64s.
@> Secondary structures were assigned to 10973 residues.

In [3]: anm = ANM("all calphas")

In [4]: from threadpoolctl import threadpool_limits

In [5]: with threadpool_limits(limits=6, user_api="blas"):
   ...:     anm.buildHessian(two_cct.ca, cutoff=20, gamma=1, sparse=True, kdtree=False)
   ...:     anm.calcModes(5, zeros=False, turbo=False)

I've found that limiting the number of threads can also speed up the calculations sometimes.