DeepProfile: Deep learning of cancer molecular profiles for precision medicine

stephenra commented 6 years ago

We present the DeepProfile framework, which learns a variational autoencoder (VAE) network from thousands of publicly available gene expression samples and uses this network to encode a low-dimensional representation (LDR) to predict complex disease phenotypes. To our knowledge, DeepProfile is the first attempt to use deep learning to extract a feature representation from a vast quantity of unlabeled (i.e., lacking phenotype information) expression samples that are not incorporated into the prediction problem. We use DeepProfile to predict acute myeloid leukemia patients' in vitro responses to 160 chemotherapy drugs. We show that, when compared to the original features (i.e., expression levels) and LDRs from two commonly used dimensionality reduction methods, DeepProfile: (1) better predicts complex phenotypes, (2) better captures known functional gene groups, and (3) better reconstructs the input data. We show that DeepProfile is generalizable to other diseases and phenotypes by using it to predict ovarian cancer patients' tumor invasion patterns and breast cancer patients' disease subtypes.

https://www.biorxiv.org/content/early/2018/05/26/278739
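For readers less familiar with the setup described in the abstract, here is a minimal NumPy sketch of the general idea: a VAE encoder maps expression vectors to a latent mean and log-variance, a latent sample is decoded back to expression space, and the encoder mean is then reused as the LDR for labeled samples. Everything here is an assumption for illustration (single linear layers, the toy dimensions, random data, the helper names); it is not the authors' architecture, and it only evaluates the loss once rather than training.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_vae(n_genes, n_latent, scale=0.01):
    # Hypothetical single-layer linear encoder/decoder weights.
    return {
        "W_mu": rng.normal(0.0, scale, (n_genes, n_latent)),
        "W_logvar": rng.normal(0.0, scale, (n_genes, n_latent)),
        "W_dec": rng.normal(0.0, scale, (n_latent, n_genes)),
    }

def encode(params, x):
    # Map expression vectors to the mean and log-variance of q(z | x).
    return x @ params["W_mu"], x @ params["W_logvar"]

def sample_latent(mu, logvar):
    # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I).
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def decode(params, z):
    # Map latent codes back to expression space.
    return z @ params["W_dec"]

def negative_elbo(params, x):
    # Squared-error reconstruction term plus KL(q(z | x) || N(0, I)).
    mu, logvar = encode(params, x)
    x_hat = decode(params, sample_latent(mu, logvar))
    recon = np.sum((x - x_hat) ** 2, axis=1)
    kl = -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar), axis=1)
    return float(np.mean(recon + kl)), float(np.mean(kl))

# Toy stand-ins for unlabeled and labeled expression matrices (samples x genes).
x_unlabeled = rng.standard_normal((200, 50))
x_labeled = rng.standard_normal((10, 50))

params = init_vae(n_genes=50, n_latent=8)
loss, kl = negative_elbo(params, x_unlabeled)

# The encoder mean serves as the low-dimensional representation (LDR)
# that a downstream phenotype predictor would take as input.
ldr, _ = encode(params, x_labeled)
```

In practice the loss would be minimized with an autodiff framework on the unlabeled samples before `encode` is applied to the labeled ones; the sketch stops short of that step.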

cgreene commented 6 years ago

I thought this paper was really cool, so I sent it out for a biOverlay. The previous version from March 8 was covered in this article: https://www.bioverlay.org/post/deepprofile-patient-aml-apr-2018/

The reviewers found it methodologically interesting, but I gather that there was some skepticism about its ability to lead to improved clinical care for AML.

I haven't made it back to the revision to see if the concerns (primarily from Reviewer 2) were addressed.

stephenra commented 6 years ago

@cgreene I believe the points from Reviewer 2 have largely gone unaddressed in the latest iteration. Preprocessing remains fairly convoluted and is not described in much detail. For example, batch effect correction with ComBat is still discussed only nominally (although ComBat is now at least mentioned, unlike in the older version). Imputation, if there is any, is not mentioned.

Here is the link to the previous version, dated March 8.

To Reviewer 1's point, though, their code is now available, along with the integrated gradients method (Sundararajan et al. 2017) that they used for axiomatic attribution of their gene weights.
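As a rough illustration of what integrated gradients computes, the sketch below attributes the output of a toy logistic model to its inputs by averaging gradients along a straight path from a baseline to the input and scaling by the input difference. The model, weights, all-zeros baseline, and step count are assumptions chosen for the example; this is not the authors' code or network.

```python
import numpy as np

def model(x, w, b):
    # Toy differentiable model: logistic regression output in (0, 1).
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

def model_grad(x, w, b):
    # Analytic gradient of the logistic output with respect to each input row.
    p = model(x, w, b)
    return (p * (1.0 - p))[:, None] * w

def integrated_gradients(x, baseline, w, b, steps=200):
    # Midpoint Riemann sum over the straight-line path baseline -> x.
    alphas = (np.arange(steps) + 0.5) / steps
    path = baseline + alphas[:, None] * (x - baseline)  # shape (steps, n_features)
    avg_grad = model_grad(path, w, b).mean(axis=0)
    return (x - baseline) * avg_grad

rng = np.random.default_rng(0)
w = rng.standard_normal(5)      # hypothetical weights for 5 "genes"
b = 0.1
x = rng.standard_normal(5)      # one hypothetical expression profile
baseline = np.zeros(5)          # assumed all-zeros reference input

attributions = integrated_gradients(x, baseline, w, b)

# Completeness axiom: attributions should sum to F(x) - F(baseline),
# up to numerical integration error.
gap = attributions.sum() - (model(x, w, b) - model(baseline, w, b))
```

The completeness check at the end is one of the axioms the method is built around, and it is a convenient sanity test when applying the technique to a real network.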

greenelab / deep-review

DeepProfile: Deep learning of cancer molecular profiles for precision medicine #874