greenelab / deep-review

A collaboratively written review paper on deep learning, genomics, and precision medicine
https://greenelab.github.io/deep-review/

Structure-based prediction of protein–protein interactions on a genome-wide scale #581

Open zietzm opened 6 years ago

zietzm commented 6 years ago

https://doi.org/10.1038/nature11503

The genome-wide identification of pairs of interacting proteins is an important step in the elucidation of cell regulatory mechanisms [1, 2]. Much of our present knowledge derives from high-throughput techniques such as the yeast two-hybrid assay and affinity purification [3], as well as from manual curation of experiments on individual systems [4]. A variety of computational approaches based, for example, on sequence homology, gene co-expression and phylogenetic profiles, have also been developed for the genome-wide inference of protein–protein interactions (PPIs) [5, 6]. Yet comparative studies suggest that the development of accurate and complete repertoires of PPIs is still in its early stages [7, 8, 9]. Here we show that three-dimensional structural information can be used to predict PPIs with an accuracy and coverage that are superior to predictions based on non-structural evidence. Moreover, an algorithm, termed PrePPI, which combines structural information with other functional clues, is comparable in accuracy to high-throughput experiments, yielding over 30,000 high-confidence interactions for yeast and over 300,000 for human. Experimental tests of a number of predictions demonstrate the ability of the PrePPI algorithm to identify unexpected PPIs of considerable biological interest. The surprising effectiveness of three-dimensional structural information can be attributed to the use of homology models combined with the exploitation of both close and remote geometric relationships between proteins.

This paper isn't an application of deep learning; the authors use a Bayesian network. Still, the methods and results could be interesting in the context of a PPI section (#575), since they address the same prediction task and could serve as a basis of comparison for deep learning methods.
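
For readers unfamiliar with this style of evidence integration, here is a minimal sketch of how independent evidence sources can be combined by multiplying likelihood ratios, as in a naive Bayes model (a simple special case of a Bayesian network). It is not taken from the paper: the function names, the evidence labels, the likelihood ratio values, and the prior odds are all hypothetical and chosen only for illustration.

```python
from math import prod


def combine_likelihood_ratios(likelihood_ratios):
    """Multiply per-evidence likelihood ratios.

    Assumes the evidence sources are conditionally independent
    given whether the two proteins interact (the naive Bayes assumption).
    """
    return prod(likelihood_ratios)


def posterior_probability(prior_odds, combined_lr):
    """Convert prior odds and a combined likelihood ratio into a probability."""
    posterior_odds = prior_odds * combined_lr
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical likelihood ratios for one candidate protein pair.
evidence = {
    "structural": 50.0,
    "co_expression": 4.0,
    "phylogenetic_profile": 2.0,
}

combined = combine_likelihood_ratios(evidence.values())
# Hypothetical prior: 1 in 1000 odds that a random pair interacts.
prob = posterior_probability(prior_odds=1 / 1000, combined_lr=combined)
print(f"combined LR = {combined:.1f}, posterior probability = {prob:.3f}")
```

The point of the sketch is only that a strong structural score can dominate weaker non-structural clues while still being tempered by a low prior.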

agitter commented 6 years ago

To help keep the PPI section to a reasonable length, I suggest we focus heavily on neural network-based methods. However, it can make sense to refer to alternative approaches, either to assess whether deep learning has surpassed those methods or to contrast deep learning with traditional approaches. Is that what you had in mind for this paper?

As an example from the TF binding section:

In order to computationally predict transcription factor binding sites (TFBSs) on a DNA sequence, researchers initially used consensus sequences and position weight matrices to match against a test sequence [161].
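
As a rough illustration of the position weight matrix approach mentioned in that excerpt, the sketch below builds a log-odds matrix from a few aligned binding sites and slides it along a test sequence. The example sites, the pseudocount, the uniform background frequency, and all function names are assumptions made for this sketch, not details from the cited work.

```python
import math

BASES = "ACGT"


def build_log_odds_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length aligned binding sites."""
    pwm = []
    for pos in range(len(sites[0])):
        # Start from the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append(
            {base: math.log2((counts[base] / total) / background) for base in BASES}
        )
    return pwm


def scan_sequence(pwm, sequence):
    """Score every window of the sequence; return (offset, score) pairs."""
    width = len(pwm)
    return [
        (i, sum(pwm[j][sequence[i + j]] for j in range(width)))
        for i in range(len(sequence) - width + 1)
    ]


def best_hit(pwm, sequence):
    """Return the (offset, score) pair with the highest score."""
    return max(scan_sequence(pwm, sequence), key=lambda hit: hit[1])


# Hypothetical aligned sites, loosely resembling a TATA-like motif.
sites = ["TATAAA", "TATATA", "TATAAG", "TTTAAA"]
pwm = build_log_odds_pwm(sites)
offset, score = best_hit(pwm, "GGCGTATAAAGGC")
print(f"best match at offset {offset} with log-odds score {score:.2f}")
```

In practice a score threshold would decide which windows count as predicted binding sites; choosing that threshold is outside the scope of this sketch.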