greenelab / deep-review

A collaboratively written review paper on deep learning, genomics, and precision medicine
https://greenelab.github.io/deep-review/

Spatial Organization and Molecular Correlation of Tumor-Infiltrating Lymphocytes Using Deep Learning on Pathology Images #890

Open evancofer opened 6 years ago

evancofer commented 6 years ago

Beyond sample curation and basic pathologic characterization, the digitized H&E-stained images of TCGA samples remain underutilized. To highlight this resource, we present mappings of tumor-infiltrating lymphocytes (TILs) based on H&E images from 13 TCGA tumor types. These TIL maps are derived through computational staining using a convolutional neural network trained to classify patches of images. Affinity propagation revealed local spatial structure in TIL patterns and correlation with overall survival. TIL map structural patterns were grouped using standard histopathological parameters. These patterns are enriched in particular T cell subpopulations derived from molecular measures. TIL densities and spatial structure were differentially enriched among tumor types, immune subtypes, and tumor molecular subtypes, implying that spatial infiltrate state could reflect particular tumor cell aberration states. Obtaining spatial lymphocytic patterns linked to the rich genomic characterization of TCGA samples demonstrates one use for the TCGA image archives with insights into the tumor-immune microenvironment.

https://doi.org/10.1016/j.celrep.2018.03.086

gwaybio commented 5 years ago

Summary

The authors present a "computational staining" approach for H&E-stained images from The Cancer Genome Atlas (TCGA). There are a few samples in TCGA that have corresponding pathology images, and the authors use these samples to train a CNN to detect tumor-infiltrating lymphocytes (TILs) (n = 5,455 across 13 cancer types). Increased TIL levels are associated with increased survival, and the spatial context of the TILs is important. The authors use the computational staining approach to describe lymphocyte clustering patterns and to uncover relationships between tumor regions across cancer types.

Computational Aspects

The authors train a two-pronged CNN on two distinct tasks: 1) lymphocyte detection, 2) necrosis segmentation. The lymphocyte detection CNN is initialized with weights from a convolutional autoencoder (CAE) with 13 encoding layers and 3 layers of max pooling. The CAE encodes position, appearance, and morphology of nuclei. The authors show that CAE pretraining improves accuracy. The authors ditch the reconstruction aspect of the CAE and plug the feature representations in as the input to the rest of the CNN architecture (an additional 14 conv layers, 3 max pooling, 1 fully connected). The necrosis segmentation model uses the DeconvNet architecture to predict labeled necrosis pixels. For each CNN the authors also use several data augmentation strategies, such as rotations and color perturbation.
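To make the augmentation step concrete, here is a minimal sketch of rotation plus color perturbation on a single RGB patch. This is not the authors' pipeline: the function names, the 90-degree rotation steps, the per-channel shift, and the `max_shift` default are all assumptions chosen for illustration.

```python
import random

def rotate90(patch):
    """Rotate a patch (list of rows of (r, g, b) tuples) 90 degrees clockwise."""
    return [list(row) for row in zip(*patch[::-1])]

def perturb_color(patch, max_shift, rng):
    """Add one random offset per channel to every pixel, clamped to [0, 255]."""
    shifts = [rng.randint(-max_shift, max_shift) for _ in range(3)]
    return [
        [tuple(min(255, max(0, c + s)) for c, s in zip(pixel, shifts))
         for pixel in row]
        for row in patch
    ]

def augment(patch, max_shift=10, seed=None):
    """Return one randomly rotated and color-shifted copy of the patch.

    Hypothetical helper: rotation count and shift size are illustrative only.
    """
    rng = random.Random(seed)
    for _ in range(rng.randrange(4)):  # 0 to 3 quarter turns
        patch = rotate90(patch)
    return perturb_color(patch, max_shift, rng)

# Toy 2x2 patch; real patches would be much larger.
patch = [[(10, 20, 30), (40, 50, 60)],
         [(70, 80, 90), (250, 5, 128)]]
augmented = augment(patch, max_shift=10, seed=0)
```

Each call yields a new variant of the same patch, so the label attached to the patch can be reused for every augmented copy.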

The most interesting computational aspect of this paper, in my opinion, was the loss function. Basically, the loss function incorporated expert pathologists' review of the computational staining results. Three pathologists reviewed select outputs and adjusted the lymphocyte detection thresholds, and the model was retrained after the labels were updated.
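As a rough illustration of what adjusting a detection threshold does to a TIL map, the sketch below binarizes a grid of per-patch lymphocyte probabilities at several thresholds. It covers only the thresholding step, not the retraining; the probabilities, thresholds, and function names are made up for the example and do not come from the paper.

```python
def til_map(probabilities, threshold):
    """Binarize a grid of per-patch lymphocyte probabilities at a threshold."""
    return [[1 if p >= threshold else 0 for p in row] for row in probabilities]

def til_fraction(binary_map):
    """Fraction of patches called TIL-positive."""
    cells = [v for row in binary_map for v in row]
    return sum(cells) / len(cells) if cells else 0.0

# Hypothetical CNN outputs for a 2x3 grid of patches.
probs = [[0.10, 0.55, 0.90],
         [0.40, 0.62, 0.05]]

for threshold in (0.3, 0.5, 0.7):
    print(threshold, til_fraction(til_map(probs, threshold)))
```

A lower threshold marks more patches as TIL-positive, so a reviewer who sees too few lymphocytes called on a slide would move the threshold down, and the relabeled patches could then feed the next round of training.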

Also, the specifically trained CNN outperforms the VGG16 architecture (fine-tuned after ImageNet pretraining) by only 0.0312 AUROC (0.9544 vs. 0.9232).
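For readers less familiar with the metric, AUROC can be read as the probability that a randomly chosen positive patch scores higher than a randomly chosen negative one. The sketch below computes it that way on toy labels and scores (hypothetical values, not data from the paper) and then checks the reported gap between the two models.

```python
def auroc(labels, scores):
    """AUROC as the chance that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: 3 of the 4 positive/negative pairs are ordered correctly.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, scores))

# Gap between the two AUROC values quoted in the comment above.
print(round(0.9544 - 0.9232, 4))
```

The pairwise loop is quadratic in the number of patches, which is fine for a sanity check; a rank-based implementation would be the usual choice for real evaluation sets.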

Biological Aspects

The authors show a number of really cool biological insights. To list a few:

  1. Computational staining correlates with expert pathologists' review
  2. High heterogeneity across TCGA tumor types and subtypes with uveal melanoma having low TILs (negative control)
  3. Variable correlation (Spearman's rho 0.1 - 0.45) between CIBERSORT TIL cellular fraction and computational staining
  4. Interesting analysis on the clustering patterns of TILs - turns out the distribution of TILs in a tumor is non-random (e.g. sometimes dispersed, often cluster in groups or around margins)
  5. These observations and clustering patterns are associated with survival in breast and skin cancer. Authors speculate it could be a result of checkpoint therapy success in melanoma but not breast cancer.
  6. Application of their method to distinguishing 4 previously defined subtypes ("Brisk Diffuse", "Brisk Band-Like", "Non-Brisk Multifocal", "Non-Brisk Focal")
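To make the correlation in item 3 concrete, here is a minimal sketch of Spearman's rho: rank both variables (ties share an average rank) and take the Pearson correlation of the ranks. The per-sample values are invented for illustration; they are not CIBERSORT estimates or TIL map outputs from the paper.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical per-sample values: molecular TIL fraction vs. image-based
# fraction of TIL-positive patches.
molecular_fraction = [0.05, 0.12, 0.30, 0.08, 0.22]
image_til_fraction = [0.10, 0.07, 0.25, 0.15, 0.20]
print(spearman_rho(molecular_fraction, image_til_fraction))
```

Because it works on ranks, the statistic only asks whether the two measures order the samples similarly, which suits a comparison between an expression-derived estimate and an image-derived one that live on different scales.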

Software Availability

It is available, but was actually a bit difficult to find. I navigated here: https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=33948919 then to the FAQ, which directed me to the GitHub page: https://github.com/SBU-BMI/u24_lymphocyte