Over the past decade we have witnessed the increasing sophistication of machine learning algorithms applied in daily use from internet searches, voice recognition, social network software to machine vision software in cameras, phones, robots and self-driving cars. Pharmaceutical research has also seen its fair share of machine learning developments. For example, applying such methods to mine the growing datasets that are created in drug discovery not only enables us to learn from the past but to predict a molecule’s properties and behavior in future. The latest machine learning algorithm garnering significant attention is deep learning, which is an artificial neural network with multiple hidden layers. Publications over the last 3 years suggest that this algorithm may have advantages over previous machine learning methods and offer a slight but discernable edge in predictive performance. The time has come for a balanced review of this technique but also to apply machine learning methods such as deep learning across a wider array of endpoints relevant to pharmaceutical research for which the datasets are growing such as physicochemical property prediction, formulation prediction, absorption, distribution, metabolism, excretion and toxicity (ADME/Tox), target prediction and skin permeation, etc. We also show that there are many potential applications of deep learning beyond cheminformatics. It will be important to perform prospective testing (which has been carried out rarely to date) in order to convince skeptics that there will be benefits from investing in this technique.
The main focus is cheminformatics, which is related to virtual screening (#45), but the review enumerates other biomedical applications as well
Lists many recent pharmaceutical-related machine learning applications that could benefit from deep learning; these seem to be highlighted because of their pharma relevance rather than any particular suitability to deep learning
I completely agree with their statement: "It will be important to perform prospective testing (which has been carried out rarely to date) in order to convince skeptics that there will be benefits from investing in this technique."
Touches on data quality issues and startups in this area at the end
Doesn't cover all of the relevant primary literature in the domain, e.g. it cites PubChem as a potential future data source even though PubChem is already being used (#55)
Presents an optimistic view overall
I'm closing this issue. We are already familiar with the primary literature covered here.
http://doi.org/10.1007/s11095-016-2029-7