Albert-Lau-Lab / tactics_protein_analysis

GNU General Public License v3.0

TACTICS Pocket Finder Code

This code finds the locations of possible cryptic pockets within MD trajectories.

Installing TACTICS

One of TACTICS's dependencies cannot be installed on macOS Catalina or later. (See here for details.) We therefore recommend installing TACTICS on Linux.

TACTICS is strict about which versions of Python packages are used. We recommend installing in a Python virtual environment so the rest of the computer is unaffected.

Standard Installation Instructions

Install pyenv using these instructions.

Download TACTICS with the command git clone https://github.com/Albert-Lau-Lab/tactics_protein_analysis.git. Use cd to enter the directory tactics_protein_analysis. Within this directory, run the following commands:

pyenv install 3.8.16
pyenv local 3.8.16

Check that when you type the command python within the TACTICS directory, you get version 3.8.16. (You should still get the default Python when outside the TACTICS directory.)
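The same check can be made from inside the interpreter. This is a minimal sketch; the helper name python_matches is hypothetical and not part of TACTICS.

```python
import sys


def python_matches(expected=(3, 8, 16), actual=None):
    """Return True if the interpreter version equals `expected` (major, minor, micro)."""
    if actual is None:
        actual = tuple(sys.version_info[:3])
    return tuple(actual) == tuple(expected)


if __name__ == "__main__":
    # Run this once inside and once outside the TACTICS directory.
    status = "OK" if python_matches() else "unexpected version"
    print(sys.version.split()[0], status)
```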

While in the TACTICS directory, run the following:

sudo apt-get install openbabel
sudo apt-get install autodock-vina
sudo apt-get install concavity
sudo apt-get install pymol

pip install cython
pip install numpy==1.19.5
pip install scikit-learn==0.21.2
pip install mdanalysis==2.2.0
pip install pandas==1.4.4
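Because TACTICS is strict about package versions, it can help to confirm that the pins took effect. The sketch below uses the standard-library importlib.metadata module; the helper name version_mismatches is hypothetical and not part of TACTICS.

```python
from importlib import metadata

# Pins copied from the pip commands above (cython is installed unpinned).
PINS = {
    "numpy": "1.19.5",
    "scikit-learn": "0.21.2",
    "mdanalysis": "2.2.0",
    "pandas": "1.4.4",
}


def version_mismatches(pins=PINS, lookup=metadata.version):
    """Return {package: (expected, found)} for every pin that is not satisfied.

    `found` is None when the package is not installed.
    """
    problems = {}
    for name, expected in pins.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    for name, (expected, found) in version_mismatches().items():
        print(f"{name}: expected {expected}, found {found}")
```

An empty result means every pinned package is installed at the expected version.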

Download the latest version of MGLTools from here. After downloading MGLTools, run ./install.sh to install it. See MGLTools README for more info.

Install VMD so that it can be run by typing vmd into the terminal.
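To confirm that the command-line tools are reachable, one option is to look them up on PATH. This is a sketch under assumptions: only vmd is named in these instructions, and the other executable names (obabel, vina, concavity, pymol) are guesses at what the apt packages install. The helper name missing_tools is hypothetical.

```python
import shutil

# Executable names are assumptions, except "vmd", which the instructions name.
TOOLS = ["obabel", "vina", "concavity", "pymol", "vmd"]


def missing_tools(names=TOOLS, which=shutil.which):
    """Return the names that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    print("missing:", ", ".join(missing) if missing else "none")
```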

Modify the TACTICS file get_dock_score.py so that mgltools_loc, pythonsh_loc, and prepare_receptor_loc store the locations of the MGLTools software.
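Before editing get_dock_score.py, it may be worth checking that the three paths exist. The sketch below is illustrative only: the install location and the sub-paths under it are assumptions about a typical MGLTools layout, and the helper name missing_paths is hypothetical.

```python
import os


def missing_paths(paths):
    """Return the names whose path does not exist on disk."""
    return [name for name, path in paths.items() if not os.path.exists(path)]


if __name__ == "__main__":
    # Hypothetical install location; replace with where install.sh put MGLTools.
    mgltools_loc = os.path.expanduser("~/mgltools")
    paths = {
        "mgltools_loc": mgltools_loc,
        # Assumed sub-paths; check them against your own installation.
        "pythonsh_loc": os.path.join(mgltools_loc, "bin", "pythonsh"),
        "prepare_receptor_loc": os.path.join(
            mgltools_loc, "MGLToolsPckgs", "AutoDockTools",
            "Utilities24", "prepare_receptor4.py"),
    }
    print(missing_paths(paths) or "all MGLTools paths found")
```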

Alternative Installation Using Docker

See the appendix.

Usage

Location of the Code

Run TACTICS with run_model as the working directory.

How to Run the Code

Warning: the MD trajectory should be aligned to itself, so that the center of mass remains constant. This matters because TACTICS finds the change in residue positions; motion of the entire protein would bias this.
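One way to prepare such a trajectory is MDAnalysis's AlignTraj, which fits every frame onto a reference structure. The sketch below fits the trajectory onto its own first frame. The function name align_trajectory and the selection string "protein and name CA" are assumptions, not part of TACTICS.

```python
def align_trajectory(psf_loc, dcd_loc, out_dcd_loc,
                     select="protein and name CA"):
    """Write a copy of the trajectory with every frame fitted onto frame 0."""
    # Imported here so this file can be loaded without MDAnalysis installed.
    import MDAnalysis as mda
    from MDAnalysis.analysis import align

    mobile = mda.Universe(psf_loc, dcd_loc)
    # A second Universe over the same files, held at the first frame,
    # serves as the fixed reference.
    reference = mda.Universe(psf_loc, dcd_loc)
    reference.trajectory[0]

    # AlignTraj superimposes each frame on the reference using `select`
    # and writes the fitted frames to `out_dcd_loc`.
    align.AlignTraj(mobile, reference, select=select,
                    filename=out_dcd_loc).run()
    return out_dcd_loc
```

The aligned DCD file can then be passed to TACTICS in place of the original trajectory.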

TACTICS is run by calling a Python function. Running TACTICS will usually require the following:

Example Usage
import MDAnalysis as mda
from tactics import tactics

output_dir = "test_output"
apo_pdb_loc = "/data/sample_first_frame.pdb"
psf_loc = "/data/sample.psf"
dcd_locs = ["/data/sample1.dcd", "/data/sample2.dcd"]
u = mda.Universe(psf_loc, dcd_locs)

tactics(output_dir, apo_pdb_loc, universe=u, num_clusters=8,
        ml_score_thresh=0.8, ml_std_thresh=0.25)

Additional options are discussed in the appendix.

What Files Are Created?

output_dir contains many files. Most of them are created by intermediate steps of the algorithm and aren't useful. Here are the files that are expected to be useful:

Debugging

If the computer runs out of RAM, it may stop running the code and give the error message Killed. If this happens, reduce the size of the input file or free up more RAM.
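One possible way to reduce the input size is to keep only every n-th frame of the trajectory. The sketch below does this with MDAnalysis. The function name write_strided and the default stride are hypothetical, and whether subsampling is acceptable depends on the analysis.

```python
def write_strided(psf_loc, dcd_loc, out_dcd_loc, stride=10):
    """Write every `stride`-th frame of a trajectory to a new DCD file."""
    # Imported here so this file can be loaded without MDAnalysis installed.
    import MDAnalysis as mda

    u = mda.Universe(psf_loc, dcd_loc)
    with mda.Writer(out_dcd_loc, n_atoms=u.atoms.n_atoms) as writer:
        # Slicing the trajectory visits only the selected frames.
        for _ in u.trajectory[::stride]:
            writer.write(u.atoms)
    return out_dcd_loc
```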

If the code predicts numerous pockets but each pocket only has one residue, then the segids of the input may be wrong. They must be of the form PROA, PROB, etc.
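The segids can be checked before running TACTICS. In the sketch below, the pattern (PRO followed by one uppercase letter) is an assumption based on the examples PROA and PROB, and the helper name bad_segids is hypothetical.

```python
import re

# Assumed pattern: "PRO" followed by one uppercase letter (PROA, PROB, ...).
SEGID_PATTERN = re.compile(r"PRO[A-Z]")


def bad_segids(segids):
    """Return the segids that do not look like PROA, PROB, etc."""
    return [s for s in segids if not SEGID_PATTERN.fullmatch(s)]
```

With an MDAnalysis Universe u, calling bad_segids(u.segments.segids) lists any segids that may need renaming.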

If the code predicts no pockets, then ml_score_thresh and ml_std_thresh may be too high.

If the code gives the error message KeyError: '1:A', then the apo structure's residue and chain nomenclature may not match the nomenclature used in the trajectory.

Appendix

Alternative Installation Using Docker

Additional Options For TACTICS

Here is a full list of possible arguments to TACTICS:

tactics(output_dir, apo_pdb_loc, psf_loc=None, dcd_loc=None, universe=None,
        num_clusters=None, alt_clustering_method=None, ml_score_thresh=0.8,
        ml_std_thresh=0.25, dock_extra_space=8, clust_max_dist=11)

Here is an explanation of the arguments not discussed above:

Example Using PSF/DCD Instead of Universe
from tactics import tactics

output_dir = "test_output"
apo_pdb_loc = "/data/sample_first_frame.pdb"
psf_loc = "/data/sample.psf"
dcd_loc = "/data/sample.dcd"

tactics(output_dir, apo_pdb_loc, psf_loc=psf_loc, dcd_loc=dcd_loc,
        num_clusters=8, ml_score_thresh=0.8, ml_std_thresh=0.25)