greenelab / deep-review

A collaboratively written review paper on deep learning, genomics, and precision medicine
https://greenelab.github.io/deep-review/

Prospective identification of hematopoietic lineage choice by deep learning #252

Open agitter opened 7 years ago

agitter commented 7 years ago

https://doi.org/10.1038/nmeth.4182 (DOI pending http://www.nature.com/nmeth/journal/vaop/ncurrent/full/nmeth.4182.html)

Differentiation alters molecular properties of stem and progenitor cells, leading to changes in their shape and movement characteristics. We present a deep neural network that prospectively predicts lineage choice in differentiating primary hematopoietic progenitors using image patches from brightfield microscopy and cellular movement. Surprisingly, lineage choice can be detected up to three generations before conventional molecular markers are observable. Our approach allows identification of cells with differentially expressed lineage-specifying genes without molecular labeling.

hammer commented 7 years ago

Code's at https://github.com/QSCD/HematoFatePrediction

gwaybio commented 7 years ago

A well-described and well-benchmarked analysis that predicts the lineage fates of hematopoietic stem and progenitor cells (HSPCs) using a CNN-RNN architecture.

Biological Aspects

Computational Aspects

A 27 x 27 pixel image patch of a single cell is passed through a series of convolutions to extract patch features and output a lineage score. Cells were manually tracked through their lineage, and each tracked lineage (along with the CNN-learned patch features over time) was input into an LSTM.
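To make the data flow above concrete, here is a minimal pure-Python sketch of a patch-wise convolution feeding a recurrent unit. It is not the authors' implementation (their code is linked earlier in the thread). The single 3 x 3 convolution layer, the number of kernels, the hidden size, the random weights, and the scalar `displacement` standing in for cell movement are all assumptions chosen for illustration.

```python
import math
import random


def conv2d_valid(img, kernel):
    """Valid (no padding) 2-D cross-correlation of a square image with a square kernel."""
    k = len(kernel)
    n = len(img) - k + 1
    return [[sum(img[i + a][j + b] * kernel[a][b]
                 for a in range(k) for b in range(k))
             for j in range(n)] for i in range(n)]


def patch_features(patch, kernels):
    """One conv layer + ReLU + global average pooling: one feature per kernel."""
    feats = []
    for kernel in kernels:
        fmap = conv2d_valid(patch, kernel)
        vals = [max(0.0, v) for row in fmap for v in row]
        feats.append(sum(vals) / len(vals))
    return feats


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h, c, weights):
    """One LSTM step; weights maps gate name -> (rows over [x, h], bias)."""
    z = x + h  # concatenate input and previous hidden state

    def gate(name, act):
        rows, bias = weights[name]
        return [act(sum(w * v for w, v in zip(row, z)) + b)
                for row, b in zip(rows, bias)]

    i, f, o = gate("i", sigmoid), gate("f", sigmoid), gate("o", sigmoid)
    g = gate("g", math.tanh)
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new


def lineage_score(track, kernels, weights, w_out):
    """Feed per-patch features (plus displacement) through the LSTM over time."""
    hidden = len(w_out)
    h, c = [0.0] * hidden, [0.0] * hidden
    for patch, displacement in track:
        x = patch_features(patch, kernels) + [displacement]
        h, c = lstm_step(x, h, c, weights)
    return sigmoid(sum(w * v for w, v in zip(w_out, h)))


# Hypothetical sizes and random (untrained) weights, for illustration only.
rng = random.Random(0)
n_kernels, hidden = 4, 3
kernels = [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
           for _ in range(n_kernels)]
in_dim = n_kernels + 1 + hidden
weights = {name: ([[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
                   for _ in range(hidden)], [0.0] * hidden)
           for name in "ifog"}
w_out = [rng.uniform(-1, 1) for _ in range(hidden)]

# A fake track of five time points: (27 x 27 patch, displacement).
track = [([[rng.random() for _ in range(27)] for _ in range(27)], rng.random())
         for _ in range(5)]
score = lineage_score(track, kernels, weights, w_out)
```

With untrained weights the score is meaningless. The point is only the shape of the computation: each patch is reduced to a small feature vector, and the recurrent state carries information across time points of the same tracked cell.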

The authors nicely demonstrate that their method can predict cell lineage up to 3 generations prior to fate commitment. Beyond 3 generations, performance drops off, though not by much, and variance increases.

The CNN-RNN is compared to a random forest and an SVM, and its performance on the training data is not much better. However, the authors note that the CNN-RNN generalizes better, possibly because of increased regularization.

The authors also try to address the black-box interpretability problem by identifying which features have high importance in a random forest model, systematically eliminating these features from the CNN-RNN, and observing how the system changes.
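The general idea of eliminating a feature and watching the output change can be sketched as follows. This is a generic ablation check under stated assumptions, not the procedure from the paper: `ablation_effect`, `toy_predict`, and the toy data are hypothetical, and replacing a feature with its mean is just one of several ways to "eliminate" it.

```python
def ablation_effect(predict, rows, feature_idx):
    """Mean absolute change in prediction when one feature is replaced by its mean."""
    mean_val = sum(r[feature_idx] for r in rows) / len(rows)
    total = 0.0
    for r in rows:
        ablated = list(r)
        ablated[feature_idx] = mean_val  # "eliminate" the feature's variation
        total += abs(predict(r) - predict(ablated))
    return total / len(rows)


def toy_predict(row):
    """Hypothetical stand-in model: strong on feature 0, weak on 1, ignores 2."""
    return 2.0 * row[0] + 0.1 * row[1]


rows = [[0.0, 1.0, 5.0], [1.0, 3.0, 2.0], [2.0, 5.0, 9.0]]
effects = [ablation_effect(toy_predict, rows, i) for i in range(3)]
```

A feature the model relies on heavily produces a large effect when ablated, and a feature the model ignores produces none, which is the kind of signal that helps relate a black-box model to human-readable features.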

agitter commented 7 years ago

@gwaygenomics thanks for summarizing. This sounds like something worth including. Do you have any thoughts on which section would be appropriate? Does it fit in the section @AnneCarpenter wrote?

gwaybio commented 7 years ago

I agree that it fits in that section - added in #254!