Official repository for "Self-Supervised driven Consistency Training for Annotation Efficient Histopathology Image Analysis", published in Medical Image Analysis (MedIA), October 2021. [Journal Link] [arXiv preprint]
A new version of our RSP-based SSL pretraining [Pretraining_v2] has been released, using the RandAugment technique, on the TIGER Challenge dataset (https://tiger.grand-challenge.org).
We propose a self-supervised driven consistency training paradigm for histopathology image analysis that learns to leverage both task-agnostic and task-specific unlabeled data based on two strategies:
A self-supervised pretext task that harnesses the underlying multi-resolution contextual cues in histology whole-slide images (WSIs) to learn a powerful supervisory signal for unsupervised representation learning.
A new teacher-student semi-supervised consistency paradigm that learns to effectively transfer the pretrained representations to downstream tasks based on prediction consistency with the task-specific unlabeled data.
We carry out extensive validation experiments on three histopathology benchmark datasets, spanning two classification tasks and one regression task:
We compare against the state-of-the-art self-supervised pretraining methods based on generative and contrastive learning techniques: Variational Autoencoder (VAE) and Momentum Contrast (MoCo), respectively.
Predicted tumor cellularity (TC) scores on BreastPathQ test set for 10% labeled data
Predicted tumor probability on Camelyon16 test set for 10% labeled data
Core implementation:
Additional packages can be installed via `pip install -r requirements.txt`.
The model training consists of three stages:

1. Self-supervised pretraining (resolution sequence prediction (RSP) task)
2. Task-specific supervised fine-tuning (SSL)
3. Task-specific teacher-student consistency training (SSL_CR)

Self-supervised pretraining (RSP): from the file "pretrain_BreastPathQ.py / pretrain_Camelyon16.py", you can pretrain the network (ResNet18) to predict the resolution sequence ordering in WSIs on the BreastPathQ and Camelyon16 datasets, respectively. This can easily be adapted to any other dataset of choice.
python pretrain_BreastPathQ.py // Pretraining on BreastPathQ
python pretrain_Camelyon16.py // Pretraining on Camelyon16
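The RSP pretext task turns the multi-resolution context of a WSI into a supervisory signal: the network sees a sequence of patches taken at different magnification levels in shuffled order and must predict the original resolution ordering. A minimal sketch of the label construction, under our reading of the task (`rsp_sample` is a hypothetical helper; patch extraction from WSIs and the ResNet18 itself are omitted):

```python
# Sketch of RSP pretext-task label construction: each training sample is a
# shuffled sequence of resolution levels plus a class label identifying the
# permutation, which the network learns to predict.
from itertools import permutations
import random

def rsp_sample(levels=(0, 1, 2), rng=random):
    """Shuffle a sequence of resolution levels and return
    (shuffled_levels, class_label), where the label indexes the
    permutation that the network must recover."""
    perms = sorted(permutations(levels))   # all possible orderings
    order = list(levels)
    rng.shuffle(order)                     # present levels in random order
    return order, perms.index(tuple(order))
```

With three resolution levels there are 3! = 6 permutation classes, so the pretext task reduces to a 6-way classification over the shuffled patch sequence.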
We also provide pretrained models for BreastPathQ and Camelyon16 in the "Pretrained_models" folder. These models can also be used to test feature transferability (domain adaptation) between datasets with different tissue types/organs.
A new version of RSP pretraining (version 2) has been implemented with the RandAugment technique [Pretraining_v2] on the TIGER Challenge dataset (https://tiger.grand-challenge.org).
From the file "eval_BreastPathQ_SSL.py / eval_Camelyon_SSL.py / eval_Kather_SSL.py", you can fine-tune the network (i.e., task-specific supervised fine-tuning) on the downstream task with limited labeled data (10%, 25%, 50%). Refer to the paper for more details.
python eval_BreastPathQ_SSL.py // Supervised fine-tuning on BreastPathQ
python eval_Camelyon_SSL.py // Supervised fine-tuning on Camelyon16
python eval_Kather_SSL.py // Supervised fine-tuning on Kather dataset (Colorectal)
Note: we did not perform self-supervised pretraining on the Kather (colorectal) dataset because its WSIs are unavailable. Instead, we performed domain adaptation by pretraining on Camelyon16 and fine-tuning on the Kather dataset. Refer to the paper for more details.
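For the classification datasets, a limited-label subset (10%, 25%, 50%) can be drawn in a class-balanced way. The helper below, `stratified_subsample`, is a hypothetical illustration of one such sampling scheme, not the repository's own split logic; where the repository provides fixed splits, those should take precedence.

```python
# Hypothetical helper for drawing a class-balanced limited-label split.
import random
from collections import defaultdict

def stratified_subsample(labels, fraction, seed=0):
    """Return sorted indices of a class-balanced subset covering
    roughly `fraction` of the labeled data (at least one per class)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    keep = []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        k = max(1, round(fraction * len(idxs)))  # per-class quota
        keep.extend(idxs[:k])
    return sorted(keep)
```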
From the file "eval_BreastPathQ_SSL_CR.py / eval_Camelyon_SSL_CR.py / eval_Kather_SSL_CR.py", you can fine-tune the student network, keeping the teacher network frozen, via task-specific consistency training on the downstream task with limited labeled data (10%, 25%, 50%). Refer to the paper for more details.
python eval_BreastPathQ_SSL_CR.py // Consistency training on BreastPathQ
python eval_Camelyon_SSL_CR.py // Consistency training on Camelyon16
python eval_Kather_SSL_CR.py // Consistency training on Kather dataset (Colorectal)
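The consistency objective can be illustrated in a few lines: the frozen teacher's prediction on a task-specific unlabeled image serves as the target for the student's prediction on an augmented view of the same image. A minimal NumPy sketch, assuming an MSE loss over class probabilities (the exact loss and augmentation scheme used in the paper may differ):

```python
# Minimal sketch of a teacher-student consistency loss: the student is
# penalised for disagreeing with the frozen teacher's class probabilities.
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def consistency_loss(student_logits, teacher_logits):
    """Mean squared error between student and teacher probabilities.
    The teacher is frozen, so no gradient would flow through p_t."""
    p_s = softmax(student_logits)
    p_t = softmax(teacher_logits)
    return float(np.mean((p_s - p_t) ** 2))
```

Only the student's parameters are updated by this loss; the labeled fraction contributes a standard supervised term alongside it.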
The test performance is validated at two stages:
Self-supervised pretraining followed by supervised fine-tuning (SSL)
Consistency training (SSL_CR)
The prediction on Camelyon16 test set can be performed using "test_Camelyon16.py" file.
Our code is released under the MIT license.
If you find our work useful in your research, or if you use parts of this code, please consider citing our paper:
@article{srinidhi2022self,
title={Self-supervised driven consistency training for annotation efficient histopathology image analysis},
author={Srinidhi, Chetan L and Kim, Seung Wook and Chen, Fu-Der and Martel, Anne L},
journal={Medical Image Analysis},
volume={75},
pages={102256},
year={2022},
publisher={Elsevier}
}
This work was funded by the Canadian Cancer Society and the Canadian Institutes of Health Research (CIHR). It was also enabled in part by support provided by Compute Canada (www.computecanada.ca).
The RSP (version 2) pretraining code with the RandAugment technique was inspired by "Tailoring automated data augmentation to H&E-stained histopathology", MIDL 2021 (https://github.com/DIAGNijmegen/pathology-he-auto-augment). Please cite this paper if you use this code.
Please direct any questions or comments to me; I am happy to help in any way I can. You can email me directly at srinidhipy@gmail.com.