proteusPy is a Python package specializing in the modeling and analysis of proteins of known structure, with an emphasis on Disulfide bonds. This package reprises my molecular modeling program Proteus, a structure-based program developed as part of my graduate thesis. The package relies on the Turtle3D class to create and manipulate local coordinate systems. It does this by implementing the functions Move, Roll, Yaw, Pitch and Turn for movement in three-dimensional space. The initial implementation focuses on the Disulfide class. The class implements methods to analyze the protein structure stabilizing element known as a Disulfide Bond. This class and its underlying methods are being used to perform a structural analysis of over 36,900 disulfide-bond-containing proteins in the RCSB Protein Data Bank (https://www.rcsb.org).
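The idea of a turtle that carries its own local coordinate frame can be illustrated with a minimal sketch. The ToyTurtle class below is hypothetical and is not proteusPy's Turtle3D; its method names, axis conventions and rotation signs are assumptions chosen only for illustration. It keeps a position plus three orthogonal unit vectors (heading, left, up) and rotates them about one another.

```python
import math


def _rotate(v, axis, degrees):
    """Rotate vector v about a unit-length axis (Rodrigues' rotation formula)."""
    theta = math.radians(degrees)
    c, s = math.cos(theta), math.sin(theta)
    dot = sum(a * b for a, b in zip(axis, v))
    cross = (
        axis[1] * v[2] - axis[2] * v[1],
        axis[2] * v[0] - axis[0] * v[2],
        axis[0] * v[1] - axis[1] * v[0],
    )
    return tuple(v[i] * c + cross[i] * s + axis[i] * dot * (1 - c) for i in range(3))


class ToyTurtle:
    """Hypothetical 3D turtle with a local frame; NOT the proteusPy Turtle3D API."""

    def __init__(self):
        self.position = (0.0, 0.0, 0.0)
        self.heading = (1.0, 0.0, 0.0)  # local x axis: direction of travel
        self.left = (0.0, 1.0, 0.0)     # local y axis
        self.up = (0.0, 0.0, 1.0)       # local z axis

    def move(self, distance):
        """Translate along the current heading."""
        self.position = tuple(
            p + distance * h for p, h in zip(self.position, self.heading)
        )

    def turn(self, degrees):
        """Rotate the frame about the local up axis (assumed convention)."""
        self.heading = _rotate(self.heading, self.up, degrees)
        self.left = _rotate(self.left, self.up, degrees)

    def pitch(self, degrees):
        """Rotate the frame about the local left axis (assumed convention)."""
        self.heading = _rotate(self.heading, self.left, degrees)
        self.up = _rotate(self.up, self.left, degrees)

    def roll(self, degrees):
        """Rotate the frame about the local heading axis (assumed convention)."""
        self.left = _rotate(self.left, self.heading, degrees)
        self.up = _rotate(self.up, self.heading, degrees)


# Walk one unit forward, turn a quarter circle, then walk two more units.
turtle = ToyTurtle()
turtle.move(1.0)
turtle.turn(90.0)
turtle.move(2.0)
print(tuple(round(x, 6) for x in turtle.position))
```

Because every rotation is expressed relative to the turtle's own axes, a sequence of moves and turns builds geometry without ever referring to global angles. That is the general appeal of a turtle-style frame for placing atoms relative to one another.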
See API Reference for the API documentation with examples.
It's simplest to clone the repo via GitHub since it contains all of the notebooks, data and test programs. Installation includes installing my Biopython fork, which is required only to rebuild the database (this is generally not needed). I highly recommend using Miniforge since it includes mamba. The installation instructions below assume a clean install with no package manager or compiler installed.
Install make on your system.
From a shell prompt while sitting in your repo dir:
$ git clone https://github.com/suchanek/proteusPy.git
$ cd proteusPy
$ make pkg
$ conda activate proteusPy
$ make install
(base) C:\Users\egs\repos> git clone https://github.com/suchanek/proteusPy.git
(base) C:\Users\egs\repos> cd proteusPy
(base) C:\Users\egs\repos\proteuspy> make pkg
(base) C:\Users\egs\repos\proteuspy> conda activate proteusPy
(proteusPy) C:\Users\egs\repos\proteuspy> make install
I currently have pytest and docstring testing for the modules in place. To run them, cd into the repository and run:
$ make tests
The modules will run their docstring tests and disulfide visualization windows will open. Simply close them. If all goes normally there will be no errors. If you're not running the development version of proteusPy you may need to install pytest. Simply perform: pip install pytest. Docstring testing is sensitive to formatting; occasionally the black formatter changes the docstrings. As a result there may be some docstring tests that fail.
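To see why docstring tests are sensitive to formatting, here is a small sketch using Python's standard doctest module. The bond_label helper and the two docstring snippets are hypothetical and are not taken from proteusPy. The second snippet differs from the first only in the quote style of the expected output, yet doctest compares output text literally, so it fails.

```python
import doctest


def bond_label(pdb_id, proximal, distal):
    """Hypothetical helper: build a text label from a PDB ID and two residue numbers."""
    return f"{pdb_id}_{proximal}_{distal}"


def run_docstring(docstring):
    """Run the examples found in a docstring and return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(
        docstring, {"bond_label": bond_label}, "example", None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    # Discard the failure report; only the counts matter in this sketch.
    results = runner.run(test, out=lambda text: None)
    return results.failed, results.attempted


# Expected output matches repr() exactly: single quotes.
ORIGINAL = """
>>> bond_label("2q7q", 75, 140)
'2q7q_75_140'
"""

# Same example after a cosmetic edit: double quotes in the expected output.
REFORMATTED = """
>>> bond_label("2q7q", 75, 140)
"2q7q_75_140"
"""

print(run_docstring(ORIGINAL))
print(run_docstring(REFORMATTED))
```

The first call reports no failures and the second reports one, even though the function under test is unchanged. A failing docstring test after reformatting therefore usually points at the docstring rather than at the code.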
Once the package is installed it's possible to load, visualize and analyze the Disulfide bonds in the RCSB Disulfide database. The general approach is:
A simple example is shown below:
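import proteusPy
from proteusPy import Load_PDB_SS, Disulfide
# Load the disulfide database (it is downloaded the first time this runs).
PDB_SS = Load_PDB_SS(verbose=True)
# Look up a single disulfide by its name.
best_ss = PDB_SS["2q7q_75D_140D"]
# Open an interactive visualization window for that disulfide.
best_ss.display(style="sb", light=True)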
The notebooks directory contains my Jupyter notebooks and is a good place to start:
The programs subdirectory contains the primary programs for downloading the RCSB disulfide-containing structure files, extracting the disulfides and creating the disulfide database:
The first time the database is loaded via Load_PDB_SS(), the system will attempt to download the full and subset databases from Google Drive.
After installation is complete launch jupyter:
$ jupyter notebook
and open notebooks/Analysis_2q7q.ipynb. This notebook looks at the disulfide bond with the lowest energy in the entire database. There are several other notebooks in this directory that illustrate using the program. Some of these reflect active development work and so may not be 'fully baked'.
PyVista is an excellent 3D visualization framework and I've used it for the Disulfide visualization engine. It uses the VTK library on the back end and provides high-level access to 3D rendering. The menu strip provided in the Disulfide visualization windows allows the user to turn borders, rulers and bounding boxes on and off, and to reset the orientation. Please try them out! There is also a button for local vs server rendering. Local rendering is usually much smoother. To manipulate:
I welcome anyone interested in collaborating on proteusPy! Feel free to contact me at suchanek@mac.com, fork the repository at https://github.com/suchanek/proteusPy/ and get coding. Issues can be reported at https://github.com/suchanek/proteusPy/issues.
The proteusPy package was developed by Eric G. Suchanek, PhD. If you find it useful in your research and wish to cite it please use the following BibTeX entry:
@article{Suchanek2024,
doi = {10.21105/joss.06169},
url = {https://doi.org/10.21105/joss.06169},
year = {2024},
publisher = {The Open Journal},
volume = {9},
number = {100},
pages = {6169},
author = {Eric G. Suchanek},
title = {proteusPy: A Python Package for Protein Structure and Disulfide Bond Modeling and Analysis},
journal = {Journal of Open Source Software}
}
@software{proteusPy2024,
author = {Eric G. Suchanek, PhD},
title = {proteusPy: A Package for Modeling and Analyzing Proteins of Known Structure},
year = {2024},
publisher = {GitHub},
version = {0.96},
journal = {GitHub repository},
url = {https://github.com/suchanek/proteusPy}
}