PyTorch implementation of the Divide-and-Rule (DnR) paper.
With the long-term rapid increase in the incidence of colorectal cancer (CRC), there is an urgent clinical need to improve risk stratification. The conventional pathology report is usually limited to only a few histopathological features, while most of the tumor microenvironment features used to describe patterns of aggressive tumor behavior are ignored. In this work, we aim to learn histopathological patterns within cancerous tissue regions that can be used to improve prognostic stratification for colorectal cancer. To do so, we propose a self-supervised learning method that jointly learns a representation of tissue regions as well as a metric of the clustering to obtain their underlying patterns. These histopathological patterns are then used to represent the interaction between complex tissues and to predict clinical outcomes directly. We furthermore show that the proposed approach can benefit from linear predictors to avoid overfitting in patient outcome prediction. To this end, we introduce a new well-characterized clinicopathological dataset, comprising a retrospective collection of 374 patients with their survival time and treatment information. Histomorphological clusters obtained by our method are evaluated by training survival models. The experimental results demonstrate statistically significant patient stratification, and our approach outperforms state-of-the-art deep clustering methods.
The pre-trained models are available on the Google Drive link. We also provide a small dataset so you can try the model yourself. Place all the .pth files in the same folder, as follows:
DnR
|
├── README.md
├── dataset.py
├── dnr.py
├── run_dnr.py
├── samples.npy # Sample (H&E images) downloaded from link
└── model
├── dnr_model_state_ans.pth # model part I (from link)
├── dnr_model_state_npc.pth # model part II
└── dnr_model_state.pth # model part III
You can run the training using the command:
python run_dnr.py --db samples.npy --pretrained dnr_model_state --output .
Note that the size of the memory bank is fixed here (660474, the number of samples in the authors' training set). This value needs to be adapted to the size of your own data when training from scratch.
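When training from scratch, a memory bank of this kind holds one feature vector per training sample, so its first dimension must match your dataset size. A minimal sketch of such an initialization (shown with NumPy for brevity; the feature dimension `feat_dim=128` and the unit-norm random initialization are assumptions, not values taken from this repository):

```python
import numpy as np

def init_memory_bank(n_samples, feat_dim=128):
    """One random unit-norm feature vector per training sample.

    n_samples must equal the size of your dataset when training from
    scratch; the fixed value 660474 matches the authors' own data only.
    feat_dim=128 is an assumed embedding size, not taken from this repo.
    """
    rng = np.random.default_rng(0)
    bank = rng.standard_normal((n_samples, feat_dim))
    # L2-normalize each row so every stored feature lies on the unit sphere
    bank /= np.linalg.norm(bank, axis=1, keepdims=True)
    return bank

# Example: a bank sized for a small dataset of 1000 crops
bank = init_memory_bank(1000)
```

The pretrained checkpoints were saved with the 660474-entry bank, which is why that size is hard-coded when loading them.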
Here is an overview of the data provided in samples.npy:
samples.npy
|
├── [0] # Random location within slides (coordinates and slides not provided)
| ├── image # (224x224x3) RGB crop of WSI at the considered location
| ├── image_he # (224x224x2) H&E version of the "image" crop
| ├── image_pairs # (224x224x3) RGB crop overlapping with "image" crop
| ├── image_pairs_he # (224x224x2) H&E version of the "image_pairs" crop
| └── idx_overall # (int) Used internally when developing the algorithm - not used
|
├── [1] # Another location
| └── ...
|
└── [2] # Another location
| └── ...
...
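The layout above can be inspected directly with NumPy. A minimal sketch (the file stores Python dicts, so `allow_pickle=True` is required; the synthetic stand-in and the filename `samples_demo.npy` are illustrative, with zero-valued arrays in place of real crops):

```python
import numpy as np

# Build a tiny stand-in mirroring the samples.npy structure described
# above (contents are placeholders; real crops come from the download link)
fake = np.array([
    {
        "image": np.zeros((224, 224, 3), dtype=np.uint8),
        "image_he": np.zeros((224, 224, 2), dtype=np.float32),
        "image_pairs": np.zeros((224, 224, 3), dtype=np.uint8),
        "image_pairs_he": np.zeros((224, 224, 2), dtype=np.float32),
        "idx_overall": 0,
    }
], dtype=object)
np.save("samples_demo.npy", fake)

# Entries are Python dicts, hence allow_pickle=True when loading
samples = np.load("samples_demo.npy", allow_pickle=True)
crop = samples[0]["image"]         # (224, 224, 3) RGB crop
crop_he = samples[0]["image_he"]   # (224, 224, 2) H&E channels
```

The same loading call applies to the provided samples.npy once downloaded.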
If you use this work, please cite it as follows :).
@inproceedings{abbet2020divide,
title={Divide-and-Rule: Self-Supervised Learning for Survival Analysis in Colorectal Cancer},
author={Abbet, Christian and Zlobec, Inti and Bozorgtabar, Behzad and Thiran, Jean-Philippe},
booktitle={International Conference on Medical Image Computing and Computer-Assisted Intervention},
pages={480--489},
year={2020},
organization={Springer}
}