greenelab / deep-review

A collaboratively written review paper on deep learning, genomics, and precision medicine
https://greenelab.github.io/deep-review/

Convolutional Networks on Graphs for Learning Molecular Fingerprints #52

Closed: agitter closed this issue 6 years ago

agitter commented 8 years ago

http://arxiv.org/abs/1509.09292

At a glance: Related to virtual screening #45. Molecular fingerprints are one standard way to featurize chemical compounds for virtual screening. This paper adapts the standard fingerprinting algorithm by implementing it as a neural network and describes the advantages of doing so. Notably, it outputs real-valued fingerprints instead of binary fingerprints.
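For readers new to the paper, the core idea is to replace the discrete hashing and bit-setting steps of circular fingerprints with smooth, learnable operations, so the resulting fingerprint is real-valued and the whole pipeline is differentiable. A rough numpy sketch of that idea follows; the function name, shapes, and single shared weight matrix per layer are illustrative simplifications on my part, not the authors' code.

```python
import numpy as np

def neural_fingerprint(atom_feats, adjacency, layer_weights, readout_weights, fp_len=128):
    """Minimal sketch of a differentiable, Duvenaud-style fingerprint.

    atom_feats:      (n_atoms, d) initial atom feature matrix
    adjacency:       (n_atoms, n_atoms) binary adjacency matrix of the molecule
    layer_weights:   list of (d, d) matrices, one per radius/layer
    readout_weights: list of (d, fp_len) matrices, one per layer
    """
    fingerprint = np.zeros(fp_len)
    h = atom_feats
    for W, R in zip(layer_weights, readout_weights):
        # Each atom pools its own features with its neighbours' (message passing),
        # then applies a learned linear map and a smooth nonlinearity.
        h = np.tanh((h + adjacency @ h) @ W)
        # Soft "hashing": a softmax over fingerprint indices replaces the discrete
        # bit-setting of circular fingerprints, so contributions are real-valued.
        logits = h @ R
        contrib = np.exp(logits - logits.max(axis=1, keepdims=True))
        contrib /= contrib.sum(axis=1, keepdims=True)
        fingerprint += contrib.sum(axis=0)
    return fingerprint
```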

agitter commented 7 years ago

This was presented at NIPS 2015 so I'll use that for the official reference instead of the arXiv version above. The reviews could provide useful context.

swamidass commented 7 years ago

So re: #52, which was written by my friend Aparu. They have an interesting approach/architecture that should certainly be studied. Our group was excited about how they were able to connect predictions back to structure too.

Respectfully, there are several problems with this study that severely limit any conclusions that might be drawn. It is true that this paper was published in 2015, but by 2017 or 2018 a paper like this should be unpublishable if reviewers are doing their jobs.

First, it appears that they are not doing the benchmarks correctly. At minimum they should have done maximum similarity using ECFP-like fingerprints and Tanimoto similarity. Instead, they used an approach that, as I have already noted, is substandard and that no one uses in practice (a neural network or logistic regressor on the fingerprint vector). Ideally they should also consider SVMs (with a Tanimoto kernel) and IRVs. Likewise, they should have used standard datasets and reported results from other groups in a comparable way.
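For concreteness, this is roughly what such a nearest-neighbour baseline looks like with RDKit: Morgan (ECFP-like) bit fingerprints, scored by maximum Tanimoto similarity to the known actives. The function name and parameter defaults are mine, not from the paper or this thread.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

def max_similarity_score(query_smiles, active_smiles, radius=2, n_bits=2048):
    """Score a query compound by its maximum Tanimoto similarity to known actives.

    radius=2 with 2048 bits gives ECFP4-like Morgan fingerprints; test compounds
    would then be ranked by this score in a virtual screening benchmark.
    """
    query_fp = AllChem.GetMorganFingerprintAsBitVect(
        Chem.MolFromSmiles(query_smiles), radius, nBits=n_bits)
    active_fps = [
        AllChem.GetMorganFingerprintAsBitVect(
            Chem.MolFromSmiles(s), radius, nBits=n_bits)
        for s in active_smiles
    ]
    return max(DataStructs.TanimotoSimilarity(query_fp, fp) for fp in active_fps)
```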

Second, they only validate on 3 datasets, and only one of them is bioactivity. Look at the IRV paper I published the same year: we validate on hundreds (I think over 1000) of datasets, and we see robust improvement of our approach over other state-of-the-art methods. Three datasets is just too small a sample size to draw any conclusions about accuracy, other than to point out that their improvement over a poorly executed baseline method is quite small.

Third, the most exciting part of the paper was the ability to trace the model back to specific substructures. However, this turns out to be an illusion. It was done in a fairly ad hoc way, and it is not clear whether the process can be automated or whether the specific weight cutoffs can be generalized or justified.
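To make the "weight cutoff" point concrete, here is a toy sketch of what such highlighting amounts to; the function and its inputs are hypothetical and not the authors' procedure. Each atom gets a contribution score toward the prediction, and atoms above a hand-picked threshold are flagged. It is that threshold which has no principled, generalizable setting.

```python
def atoms_above_cutoff(atom_contributions, cutoff):
    """Toy cutoff-based highlighting: keep the indices of atoms whose
    contribution to the predicted activity exceeds a hand-chosen threshold.

    atom_contributions: per-atom contribution scores (floats).
    cutoff: threshold chosen by hand for a particular model and dataset.
    """
    return [i for i, c in enumerate(atom_contributions) if c > cutoff]
```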

This is all important. I think it is possible that DL will produce real gains (over what has already been realized by IRVs, metabolism networks, etc.), but it is far from clear whether this is the case. Right now we have a proliferation of new methods without clear evidence that they are actually improving accuracy. In fact, the data currently seem to suggest the opposite.

swamidass commented 7 years ago

Looking at the reviewers' comments, it seems they lacked expertise in chemical informatics.