Xiyue-Wang / RetCCL

GNU General Public License v3.0

RetCCL: Clustering-guided contrastive learning for whole-slide image retrieval (Medical Image Analysis)

Journal Link

Please open new threads or address all questions to xiyue.wang.scu@gmail.com

We built a better and stronger pre-trained model for various histopathological image applications. This model outperforms ImageNet pre-trained features by a large margin. We release our best model and invite researchers to test it on their own computational pathology tasks.

Hardware

Preparations

1. Download all 32,000 TCGA WSIs.

2. Download all 2,457 PAIP WSIs. Together these yield about 15,000,000 images (~100 TB). It cost us $400,000 to advance the progress of digital pathology.

Pre-trained models for histopathological image tasks

The pre-trained model is available here

1. Classification through search

This is the most obvious and direct way to evaluate the discriminative power of the provided features.

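| TissueNet | Acc@1 | Acc@3 | Acc@5 | mMV@5 |
| --- | --- | --- | --- | --- |
| ImageNet | 50.35 | 77.65 | 87.68 | 46.15 |
| CCL (ours) | 67.09 | 87.81 | 93.4 | 70.1 |

| UniToPatho | Acc@1 | Acc@3 | Acc@5 | mMV@5 |
| --- | --- | --- | --- | --- |
| ImageNet | 58.17 | 82.89 | 89.45 | 59.01 |
| CCL (ours) | 66.55 | 84.32 | 90.31 | 68.35 |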
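As a rough illustration of how search-based metrics like those in the tables above can be computed from extracted features, here is a minimal sketch. It rests on assumptions not stated in this README: that retrieval ranks database items by cosine similarity, that Acc@k counts a query as correct when at least one of its top k results shares its label, and that the majority-vote metric counts a query as correct when the most frequent label among its top k results matches. The function names and toy data are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_labels(query, db_feats, db_labels, k):
    """Labels of the k database items most similar to the query."""
    ranked = sorted(range(len(db_feats)),
                    key=lambda i: cosine(query, db_feats[i]),
                    reverse=True)
    return [db_labels[i] for i in ranked[:k]]


def acc_at_k(q_feats, q_labels, db_feats, db_labels, k):
    """Fraction of queries with at least one correct label in the top k."""
    hits = sum(label in top_k_labels(q, db_feats, db_labels, k)
               for q, label in zip(q_feats, q_labels))
    return hits / len(q_feats)


def mv_at_k(q_feats, q_labels, db_feats, db_labels, k=5):
    """Fraction of queries whose top-k majority label is correct."""
    hits = 0
    for q, label in zip(q_feats, q_labels):
        votes = Counter(top_k_labels(q, db_feats, db_labels, k))
        hits += votes.most_common(1)[0][0] == label
    return hits / len(q_feats)


# Toy 2-D "features" (hypothetical); real features are much higher-dimensional.
db_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [0.2, 0.8]]
db_labels = ["A", "A", "B", "B", "B"]
q_feats = [[1.0, 0.05], [0.3, 0.7]]
q_labels = ["A", "A"]

print(acc_at_k(q_feats, q_labels, db_feats, db_labels, k=1))
print(mv_at_k(q_feats, q_labels, db_feats, db_labels, k=3))
```

The exact metric definitions used for the reported numbers are given in the paper; this sketch only shows the general shape of the evaluation.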

2. Multiple Instance Learning for Whole Slide Image Classification

Methods for this task are currently built on ImageNet pre-trained features, so replacing those features with ours also verifies the superiority of our feature extractor.

| TCGA-NSCLC | Accuracy | AUC |
| --- | --- | --- |
| ABMIL | 0.7719 | 0.8656 |
| MIL-RNN | 0.8619 | 0.9107 |
| DSMIL | 0.8058 | 0.8925 |
| TransMIL | 0.8835 | 0.9603 |
| CLAM | 0.8422 | 0.9377 |
| CLAM+CCL (ours) | 0.911 | 0.967 |
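To make concrete where a feature extractor plugs into a MIL pipeline, the following is a minimal sketch of attention-based pooling in the style of ABMIL (one of the baselines listed above). It is not CLAM's implementation and not code from this repository. The projection matrix `V`, the vector `w`, and the toy bag are hand-picked, hypothetical values; in practice they would be learned, and each row of the bag would be a patch feature vector produced by the extractor.

```python
import math


def attention_pool(bag, V, w):
    """Pool patch features into one slide feature with attention weights.

    Each patch h gets the score w . tanh(V h); scores are turned into
    weights with a softmax, and the slide feature is the weighted sum.
    """
    scores = []
    for h in bag:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * ui for wi, ui in zip(w, hidden)))

    # Numerically stable softmax over the patches in the bag.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    dim = len(bag[0])
    slide = [sum(a * h[j] for a, h in zip(weights, bag)) for j in range(dim)]
    return slide, weights


# Toy bag of three 3-D patch features (hypothetical values).
bag = [[0.1, 0.0, 0.2], [2.0, 1.5, 0.3], [0.0, 0.1, 0.1]]
V = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # hidden_dim x feature_dim
w = [1.0, 1.0]

slide_feature, weights = attention_pool(bag, V, w)
print(weights)
```

Only the contents of `bag` depend on the feature extractor, which is why swapping ImageNet features for other features leaves the aggregator itself unchanged.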

3. Classification based on features using SVM

This task follows KimiaNet.

| Colorectal cancer dataset | Accuracy |
| --- | --- |
| Combined features | 87.40 |
| Fine-tuned VGG-19 | 86.19 |
| Ensemble of CNNs | 92.83 |
| KimiaNet | 96.80 |
| CCL (ours) | 98.40 |

To compute the features:

python get_feature.py

We recommend extracting features at 1.0 mpp (microns per pixel) first, and then trying other magnifications.
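When a slide's native resolution differs from the target, the patch has to be read from a larger region and downsampled. The sketch below shows that arithmetic only. The helper name `patch_read_params` and the 256-pixel patch size are hypothetical and are not taken from `get_feature.py`; the native mpp would normally come from the slide's metadata.

```python
def patch_read_params(native_mpp, target_mpp=1.0, patch_size=256):
    """Return (downsample factor, level-0 region size in pixels).

    A patch of `patch_size` pixels at `target_mpp` covers
    patch_size * target_mpp microns, which corresponds to
    patch_size * target_mpp / native_mpp pixels at native resolution.
    """
    if native_mpp <= 0 or target_mpp <= 0:
        raise ValueError("mpp values must be positive")
    downsample = target_mpp / native_mpp
    region = round(patch_size * downsample)
    return downsample, region


# A slide scanned at 0.25 mpp: read a 1024 x 1024 region, resize to 256 x 256.
print(patch_read_params(0.25))  # (4.0, 1024)
# A slide scanned at 0.5 mpp: read a 512 x 512 region instead.
print(patch_read_params(0.5))   # (2.0, 512)
```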

To fine-tune the model:

python resnet_lincls.py

Whole-slide image retrieval

You can refer to the third-party reproduction paper and code.

Please refer to FISH. When clustering and searching, use our features, then remove the tree and search directly.

License

RetCCL is released under the GPLv3 License and is available for non-commercial academic purposes.

Citation

Please use the entry below to cite this paper if you find our work useful in your research.

@article{WANG2023102645,
title = {RetCCL: Clustering-guided contrastive learning for whole-slide image retrieval},
author = {Xiyue Wang and Yuexi Du and Sen Yang and Jun Zhang and Minghui Wang and Jing Zhang and Wei Yang and Junzhou Huang and Xiao Han},
journal = {Medical Image Analysis},
volume = {83},
pages = {102645},
year = {2023},
issn = {1361-8415}
}