Kortemme-Lab / ddg

A DDG benchmark capture containing the benchmark dataset and benchmarked protocol captures.
MIT License

========================================
Energetic effects of mutation benchmarks
========================================

This benchmark capture contains two related benchmarks, both of which measure the accuracy with which a protocol predicts the energetic effects of mutation.


|DDG| (protein stability) benchmark
-----------------------------------

Monomeric |DDG| protocols predict the change in protein stability which occurs as a result of mutagenesis. This benchmark includes four curated datasets of experimentally measured values which can be used to evaluate the accuracy of a protocol.

This benchmark includes:

Benchmarking a complete dataset is computationally intensive, so we recommend that benchmark runs be performed on a cluster, grid, or cloud computing resource. We have provided scripts to run this benchmark on a Sun Grid Engine cluster (see ``hpc/sge/ddg_monomer_16/README.rst``).


Alanine scanning benchmark
--------------------------

A frequent application of modeling methods is the prediction of energetically important interactions (“hotspots”) in protein-protein interfaces. By systematically mutating protein interface residues to alanine (“alanine scanning”) and measuring the effect on binding, `Clackson and Wells, 1995 <#references>`_ showed that only a small subset of the residues with interface contacts (the “hotspot” residues) contributes significantly to the binding free energy of human growth hormone to its receptor. Subsequent studies suggested that such hotspots may be a general characteristic of many protein-protein interfaces. This benchmark tests the ability of computational alanine scanning protocols to recapitulate the results of experimental alanine scanning. A computational protocol that performs well on this test set can then be used for additional applications, for instance as a design tool to disrupt protein-protein interactions by mutation or by targeting small molecules to hotspots, or to analyze the effect of disease mutations.
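
As a small illustration of what recapitulating experimental alanine scanning can mean, the sketch below labels a residue a hotspot when its |DDG| of binding on mutation to alanine meets a cutoff, then reports how often experiment and prediction agree on that label. The 1.0 kcal/mol cutoff, the function names, the residue identifiers, and the values are all assumptions made for this example; they are not taken from the benchmark datasets or analysis scripts.

.. code-block:: python

    HOTSPOT_CUTOFF = 1.0  # kcal/mol; assumed cutoff, not defined by this benchmark


    def is_hotspot(ddg_binding, cutoff=HOTSPOT_CUTOFF):
        """Return True if the alanine mutation weakens binding by at least `cutoff`."""
        return ddg_binding >= cutoff


    def hotspot_agreement(experimental, predicted, cutoff=HOTSPOT_CUTOFF):
        """Fraction of shared residues that get the same hotspot label."""
        shared = sorted(set(experimental) & set(predicted))
        if not shared:
            raise ValueError("no residues in common")
        matches = sum(
            is_hotspot(experimental[res], cutoff) == is_hotspot(predicted[res], cutoff)
            for res in shared
        )
        return matches / len(shared)


    # Made-up values keyed by hypothetical "chain:residue" identifiers.
    experimental = {"A:10": 2.3, "A:25": 0.2, "B:104": 1.4, "B:169": 0.1}
    predicted = {"A:10": 1.8, "A:25": 0.4, "B:104": 0.6, "B:169": -0.1}
    agreement = hotspot_agreement(experimental, predicted)

Here three of the four residues receive the same label, so ``agreement`` is 0.75; the prediction misses the ``B:104`` hotspot.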

This benchmark includes:

The RosettaScripts alanine scanning protocol is not computationally intensive, so this benchmark can be performed on a typical laptop or workstation.


Licensing
---------

This repository contains third party libraries and materials which are distributed under their own terms (see LICENSE-3RD-PARTY). The novel content in this repository is licensed according to LICENSE.


Downloading the benchmark
-------------------------

The benchmark is hosted on GitHub. The most recent version can be checked out using the `git <http://git-scm.com/>`_ command-line tool:

::

  git clone https://github.com/Kortemme-Lab/ddg.git


Directories in this archive
---------------------------

This archive contains the following directories:

=========
Protocols
=========

This repository contains one protocol which can be used to run the |DDG| benchmark and another which can be used to run the alanine scanning benchmark. We welcome the inclusion of more protocols. Please contact support@kortemmelab.ucsf.edu if you wish to contribute to the repository.

Each protocol is accompanied by specific documentation in its protocol directory.


Protein stability protocol 1: ddg_monomer, row 16
-------------------------------------------------

Created by: Elizabeth Kellogg, Andrew Leaver-Fay, David Baker [1]_

Software suite: Rosetta

Protocol directory: protocols/ddg_monomer_16


Alanine scanning protocol 1: RosettaScripts protocol
----------------------------------------------------

Created by: Kyle Barlow

Software suite: Rosetta

Protocol directory: protocols/alanine-scanning

=====================================
Running the benchmark: helper scripts
=====================================

While both the |DDG| and alanine scanning protocols can be run directly on each case using published command line arguments to Rosetta, we have also included helper scripts for each protocol to assist in running them. Each protocol's helper scripts are customized to that protocol, but both sets are used in a similar way:

#. The benchmark setup script is run.

   This setup script may take options that determine which subset of the benchmark is run or which flags are passed to Rosetta. The setup script creates a "job output" directory and copies all necessary input files into it. This directory also contains a Python run script.

#. (Optional) If the benchmark is to be run on a high-performance cluster, the self-contained job output directory can be copied onto that cluster.

#. The Python run script (in the job output directory) is run with no arguments.

   This run script calls Rosetta with the appropriate arguments and saves the output into the same directory. On a machine with multiple CPUs, Python's multiprocessing module is used to reduce the run time. The script can also be submitted to an SGE cluster using the qsub command.

The above steps are repeated two times in the |DDG| protocol (see relevant documentation).
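
The general pattern of a run script that farms out independent command lines with Python's ``multiprocessing`` module can be sketched as follows. This is a minimal illustration, not the run script generated by the helper scripts: the function names are hypothetical, and the commands are harmless stand-ins that start the Python interpreter, where the real script would invoke Rosetta with the arguments prepared by the setup script.

.. code-block:: python

    import multiprocessing
    import subprocess
    import sys


    def run_case(command):
        """Run one command line and return (command, exit code)."""
        completed = subprocess.run(
            command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
        )
        return command, completed.returncode


    def run_all(commands, processes=None):
        """Run all commands in a worker pool (one worker per CPU by default)."""
        with multiprocessing.Pool(processes=processes) as pool:
            return pool.map(run_case, commands)


    if __name__ == "__main__":
        # Stand-in commands; a real run script would list Rosetta invocations.
        commands = [
            [sys.executable, "-c", "print('case %d')" % i] for i in range(4)
        ]
        for command, returncode in run_all(commands):
            print(returncode, " ".join(command))

Because each benchmark case is independent, a pool of workers needs no coordination beyond collecting exit codes, which is also why the same job directory can be handed to a cluster scheduler instead.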

========
Analysis
========

The same set of analysis scripts is used by all protocols. Conceptually, the analysis scripts should be a black box that is separated from the output of each protocol by an interface. The expected input format is described in analysis/README.rst.

The analysis scripts generate three metrics which can be used to evaluate the results of the |DDG| and alanine scanning simulations, and also produce a scatterplot of the experimental and predicted values. The benchmark analysis is described in more detail in ``analysis/README.rst``.
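
This section does not name the three metrics, so the sketch below only illustrates the kind of comparison involved. It assumes three commonly used measures: Pearson's correlation coefficient, the mean absolute error, and the fraction of cases assigned to the same class (stabilizing, neutral, or destabilizing). The function names, the 1.0 kcal/mol class cutoff, and the convention that positive |DDG| values are destabilizing are assumptions of this example; ``analysis/README.rst`` is the authoritative description.

.. code-block:: python

    import math


    def pearson_r(xs, ys):
        """Pearson correlation coefficient of two equal-length sequences."""
        n = len(xs)
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        var_x = sum((x - mean_x) ** 2 for x in xs)
        var_y = sum((y - mean_y) ** 2 for y in ys)
        return cov / math.sqrt(var_x * var_y)


    def mean_absolute_error(xs, ys):
        """Average absolute difference between paired values."""
        return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


    def classify(ddg, cutoff=1.0):
        """Assumed three-way classification with a symmetric cutoff."""
        if ddg >= cutoff:
            return "destabilizing"
        if ddg <= -cutoff:
            return "stabilizing"
        return "neutral"


    def fraction_correct(experimental, predicted, cutoff=1.0):
        """Fraction of cases where prediction and experiment share a class."""
        same = sum(
            classify(e, cutoff) == classify(p, cutoff)
            for e, p in zip(experimental, predicted)
        )
        return same / len(experimental)


    # Made-up values in kcal/mol.
    experimental = [1.5, 0.2, -2.0, 3.1]
    predicted = [2.0, 1.2, -1.1, 2.4]
    metrics = (
        pearson_r(experimental, predicted),
        mean_absolute_error(experimental, predicted),
        fraction_correct(experimental, predicted),
    )

Keeping the metrics as plain functions of two paired lists mirrors the black-box idea above: any protocol whose output can be reduced to experimental and predicted values can be scored the same way.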

==========
References
==========

The latest release of this repository: |releasedoi|

.. |releasedoi| image:: https://zenodo.org/badge/doi/10.5281/zenodo.18595.svg
   :target: http://dx.doi.org/10.5281/zenodo.18595


|DDG| (protein stability) benchmark
-----------------------------------

Kellogg EH, Leaver-Fay A, Baker D. Role of conformational sampling in computing mutation-induced changes in protein structure and stability. Proteins. 2011;79(3):830-8. doi: `10.1002/prot.22921 <https://dx.doi.org/10.1002/prot.22921>`_.


Alanine scanning benchmark
--------------------------

Clackson T, Wells JA. A hot spot of binding energy in a hormone-receptor interface. Science. 1995 Jan 20;267(5196):383-6. doi: `10.1126/science.7529940 <https://dx.doi.org/10.1126/science.7529940>`_.

Kortemme T, Baker D. A simple physical model for binding energy hot spots in protein–protein complexes. Proc Natl Acad Sci U S A. 2002 Oct 29;99(22):14116-21. Epub 2002 Oct 15. doi: `10.1073/pnas.202485799 <https://dx.doi.org/10.1073/pnas.202485799>`_.

Kortemme T, Kim DE, Baker D. Computational alanine scanning of protein-protein interfaces. Sci STKE. 2004 Feb 3;2004(219):pl2. doi: `10.1126/stke.2192004pl2 <https://dx.doi.org/10.1126/stke.2192004pl2>`_.

=====
Notes
=====

.. [1] The Rosetta application was written by the authors above. This protocol capture was compiled by Shane O'Connor. Any errors in the protocol capture are likely to be the fault of the compiler rather than that of the original authors. Please contact support@kortemmelab.ucsf.edu with any issues which may arise.

.. |Dgr| unicode:: U+00394 .. GREEK CAPITAL LETTER DELTA
.. |ring| unicode:: U+002DA .. RING ABOVE
.. |DDGH2O| replace:: |Dgr|\ |Dgr|\ G H\ :sub:`2`\ O
.. |DDG| replace:: |Dgr|\ |Dgr|\ G