encore-similarity / encore

An extension to the MDAnalysis library providing support for dealing with structural ensembles. There is currently support for calculating covariance matrices, ensemble similarities, entropy and conducting PCA analyses.
GNU General Public License v3.0

ENCORE availability

The ENCORE software is now integrated into the MDAnalysis Python library as an analysis module. Please head to the MDAnalysis website or the MDAnalysis GitHub repository for installation and usage instructions. This is the recommended and most up-to-date version to use for your ENCORE analysis.
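As a rough sketch of what this looks like in practice, the helper below loads two trajectories that share a topology and compares them with the harmonic ensemble similarity. It assumes an MDAnalysis release that ships the `MDAnalysis.analysis.encore` module. The function name `compare_ensembles` and the file names in the usage comment are hypothetical, and the MDAnalysis documentation remains the authoritative reference.

```python
def compare_ensembles(topology, trajectory_a, trajectory_b):
    """Return the harmonic ensemble similarity matrix for two trajectories.

    Hypothetical helper for illustration; both trajectories must match
    the given topology.
    """
    # Imported lazily so that defining this helper does not require MDAnalysis.
    import MDAnalysis as mda
    from MDAnalysis.analysis import encore

    # Each ensemble is an MDAnalysis Universe built from topology + trajectory.
    ens_a = mda.Universe(topology, trajectory_a)
    ens_b = mda.Universe(topology, trajectory_b)

    # encore.hes returns a matrix of pairwise values and a dict of details.
    values, details = encore.hes([ens_a, ens_b])
    return values


# Example call with placeholder file names (replace with your own files):
# similarity = compare_ensembles("protein.pdb", "run1.xtc", "run2.xtc")
# print(similarity[0, 1])
```

The off-diagonal entry of the returned matrix is the value for the pair of ensembles; the diagonal compares each ensemble with itself.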

ENCORE

ENCORE is a Python package designed to quantify the similarity between conformational ensembles of proteins (or in principle other macromolecules), using three different methods originally described in:

Lindorff-Larsen K, Ferkinghoff-Borg J (2009) Similarity Measures for Protein Ensembles. PLoS ONE 4(1): e4203. doi:10.1371/journal.pone.0004203

A description of ENCORE and a number of applications can be found in:

Matteo Tiberti, Elena Papaleo, Tone Bengtsen, Wouter Boomsma and Kresten Lindorff-Larsen. ENCORE: Software for quantitative ensemble comparison. Submitted.

The package includes facilities for handling ensembles and trajectories, performing clustering or dimensionality reduction of the ensemble space, estimating multivariate probability distributions from the input data, and more. ENCORE can be used to compare experimental and simulation-derived ensembles, as well as to estimate the convergence of trajectories from time-dependent simulations.

The package was designed as a Python 2.6 (or any later 2.x version) library. Some of the library files can also be run as scripts that accept command-line arguments. The help text included with each script (obtained by running "python encore/script.py -h") is usually self-explanatory. Examples are also available that show how ENCORE can be used to calculate the similarity measures on a number of ensembles.

The similarity measures implemented in ENCORE are based on three different methods, which all rely on the following idea: Given two or more conformational ensembles of the same topology (i.e. structure), we view the particular set of conformations from each ensemble as a sample from an underlying, but unknown, probability distribution. We use this sample to model the probability density function of said distribution. Then we compare the modeled distributions using standard measures of the similarity between two probability densities, such as the Jensen-Shannon divergence.
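The idea can be illustrated with a deliberately simplified sketch. The code below is not ENCORE's implementation: it assumes each conformation has already been reduced to a single number, uses a plain histogram as the density model, and then computes the Jensen-Shannon divergence between the two estimated distributions. The function names, bin edges, and sample values are all made up for illustration, and the code targets Python 3.

```python
import math


def histogram_density(samples, edges):
    """Estimate a discrete probability distribution by binning 1D samples.

    Bins are half-open [edges[i], edges[i+1]), except the last one,
    which also includes its right edge.
    """
    n_bins = len(edges) - 1
    counts = [0] * n_bins
    for x in samples:
        for i in range(n_bins):
            is_last = i == n_bins - 1
            if edges[i] <= x < edges[i + 1] or (is_last and x == edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence (natural log); terms with p_i == 0 are skipped."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Toy "ensembles": one scalar per conformation (made-up values).
ensemble_a = [0.1, 0.2, 0.3, 0.4]
ensemble_b = [0.6, 0.7, 0.8, 0.9]
edges = [0.0, 0.5, 1.0]

p = histogram_density(ensemble_a, edges)
q = histogram_density(ensemble_b, edges)

divergence_ab = js_divergence(p, q)  # non-overlapping samples
divergence_aa = js_divergence(p, p)  # identical distributions
```

With the natural logarithm, the Jensen-Shannon divergence is bounded between 0 (identical distributions) and ln 2 (distributions with no overlap), which makes values comparable across pairs of ensembles.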

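In the ENCORE package, we have implemented three methods to estimate the density: the harmonic ensemble similarity (HES), which models each ensemble as a multivariate Gaussian distribution; the clustering-based ensemble similarity (CES), which clusters the conformations and compares the cluster populations of each ensemble; and the dimensionality-reduction-based ensemble similarity (DRES), which projects the conformations into a low-dimensional space and estimates the density there.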

ENCORE accepts as input structural ensembles derived both from molecular simulations (e.g. molecular dynamics or Monte Carlo methods) and from experiments (e.g. NMR structures as PDB files). The software can handle the most popular trajectory formats (files such as DCD, XTC, TRR, XYZ, TRJ, MDCRD), although periodic boundary conditions must be removed before use. A topology file is also required.

Together with the software, we also provide three examples that showcase three typical use cases.

See the examples themselves for more information. If you use ENCORE for your scientific work, please cite:

Matteo Tiberti, Elena Papaleo, Tone Bengtsen, Wouter Boomsma and Kresten Lindorff-Larsen. ENCORE: Software for quantitative ensemble comparison. Submitted.