Zero-We / PMIL

Prototypical multiple instance learning for predicting lymph node metastasis of breast cancer from whole-slide pathological images
GNU General Public License v3.0

Introduction

Computerized identification of lymph node metastasis (LNM) from whole-slide pathological images (WSIs) can greatly benefit therapy decisions and prognosis for breast cancer. Besides the general challenges of computational pathology, including extremely high resolution, very expensive fine-grained annotation and significant inter-tumoral heterogeneity, one particular difficulty with this task lies in identifying metastasized tumors with tiny foci (called micro-metastasis). In this study, we introduce a weakly supervised method, called Prototypical Multiple Instance Learning (PMIL), that learns to predict lymph node metastasis of breast cancer from WSIs using only slide-level class labels. First, PMIL discovers a collection of so-called prototypes from the training data by unsupervised clustering. Second, the prototypes are matched against the constituent patches of the WSI, and the resulting similarity scores are aggregated into a soft-assignment histogram describing the statistical distribution of the prototypes in the WSI, which is taken as the slide feature. Finally, WSI classification is performed on the slide features.

Model

The trained model weights and precomputed patch feature vectors are provided here ([Google Drive] | [Baidu Cloud] (fzts)). After downloading, place pmil_model.pth and pmil_model_simclr.pth in the model directory, and place mil-feat and simclr-feat in the feat directory.

Dataset

Training

The patch-level feature encoder is initialized by training a standard instance-space MIL model with max-pooling. Part of our code is adapted from Campanella et al. (2019); you can refer to it here. The input data should be stored as a dictionary and saved with torch.save() in .ckpt file format, including the following keys:

You can run the following command to train the standard MAX-MIL model and simultaneously extract the feature vector of each patch:

python max-mil.py --save_model --save_index --save_feat
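
For readers unfamiliar with instance-space MIL with max-pooling, the idea is that every patch gets its own positive-class probability and the slide-level prediction is the highest of them. The sketch below illustrates only that pooling rule in plain Python; the function name and the toy probabilities are hypothetical and are not taken from max-mil.py.

```python
def max_pool_slide_score(patch_probs):
    """Return (slide_score, index_of_top_patch) under max-pooling MIL.

    patch_probs: per-patch positive-class probabilities for one slide.
    (Hypothetical helper for illustration; not part of this repository.)
    """
    if not patch_probs:
        raise ValueError("a slide (bag) must contain at least one patch")
    # The slide is scored by its single most suspicious patch.
    top_index = max(range(len(patch_probs)), key=lambda i: patch_probs[i])
    return patch_probs[top_index], top_index


# Toy slide with four patches; only one of them looks suspicious.
slide_score, top_patch = max_pool_slide_score([0.02, 0.10, 0.93, 0.05])
print(slide_score, top_patch)
```

Because only the top-scoring patch determines the slide score, the slide-level label effectively supervises that one patch, which is how an encoder can be trained without patch-level annotation.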


The affinity propagation clustering algorithm is used to capture typical pathological patterns, which we call prototypes. To obtain the prototypes on the Camelyon16 dataset, run the following command:

python cluster.py


Train the PMIL framework, which encodes a WSI by its composition in terms of the frequencies of occurrence of the prototypes found inside it. Here, patch features are matched against the prototypes to obtain a soft-assignment histogram for each patch, and the histograms of all patches in a WSI are aggregated by the selective pooling module:

python pmil.py --save_model
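
To make the soft-assignment histogram concrete, here is a minimal sketch in plain Python. It rests on three assumptions that are not confirmed by this README: cosine similarity between patch features and prototypes, a softmax with a hypothetical `temperature` parameter to turn similarities into soft assignments, and plain averaging over patches standing in for the selective pooling module. pmil.py may differ on all three points.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def soft_assignment(patch_feature, prototypes, temperature=0.1):
    """Softmax over patch-to-prototype similarities (assumed form)."""
    sims = [cosine_similarity(patch_feature, p) / temperature for p in prototypes]
    peak = max(sims)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]


def slide_histogram(patch_features, prototypes, temperature=0.1):
    """Average the per-patch soft assignments into one slide-level histogram.

    Mean pooling is a stand-in here, NOT the selective pooling module.
    """
    if not patch_features:
        raise ValueError("a slide must contain at least one patch")
    assignments = [soft_assignment(f, prototypes, temperature) for f in patch_features]
    n = len(assignments)
    return [sum(a[k] for a in assignments) / n for k in range(len(prototypes))]


# Toy data: two prototypes and three 2-D patch features (all hypothetical).
prototypes = [[1.0, 0.0], [0.0, 1.0]]
patches = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
hist = slide_histogram(patches, prototypes)
print(hist)
```

The resulting vector has one entry per prototype and sums to 1, so it can be fed to any standard classifier as a fixed-length slide feature regardless of how many patches the slide contains.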


Inference

You can evaluate the performance of PMIL at 40x magnification on the Camelyon16 dataset with the following command:

python pmil.py --load_model --is_test


Visualization

We illustrate prototype discovery on the Camelyon16 dataset here. The top row of images shows the discovered prototypes; the color of each bounding box matches the color of the corresponding cluster in the bottom row. The bottom row shows intra-slide patch clustering results on two WSIs: the left is LNM-positive and the right is LNM-negative.


Interpretability is important for deep learning based algorithms in medical applications, for which MIL methods often use a so-called heatmap to visualize the contribution of each location in a WSI to the classification decision. We also provide the attention maps obtained by PMIL in the vis directory. We observe that the attention maps can completely highlight the tumor regions, consistent with the ground truth annotations.
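
As a rough illustration of how per-patch scores become a heatmap, the sketch below min-max normalizes a set of patch scores and writes them into a 2-D grid indexed by patch position. The function name, the (row, col) grid coordinates, and the toy scores are all hypothetical; how PMIL actually computes and renders its attention maps is not shown here.

```python
def scores_to_heatmap(coords, scores, grid_shape, background=0.0):
    """Place min-max normalized patch scores on a 2-D grid.

    coords: (row, col) grid position of each patch (assumed layout).
    scores: one raw score per patch, e.g. an attention weight.
    grid_shape: (n_rows, n_cols) of the patch grid covering the slide.
    """
    if len(coords) != len(scores):
        raise ValueError("coords and scores must have the same length")
    if not scores:
        raise ValueError("at least one patch score is required")
    lo, hi = min(scores), max(scores)
    span = hi - lo
    n_rows, n_cols = grid_shape
    # Cells without a patch (e.g. slide background) keep the background value.
    heatmap = [[background] * n_cols for _ in range(n_rows)]
    for (row, col), score in zip(coords, scores):
        # If all scores are equal, map every patch to 1.0 to avoid dividing by zero.
        heatmap[row][col] = (score - lo) / span if span > 0 else 1.0
    return heatmap


# Toy slide: a 2x2 patch grid in which three cells contain tissue.
coords = [(0, 0), (0, 1), (1, 1)]
scores = [0.2, 0.4, 1.0]
heatmap = scores_to_heatmap(coords, scores, grid_shape=(2, 2))
for row in heatmap:
    print(row)
```

The grid can then be upsampled and overlaid on a slide thumbnail with any plotting library for comparison against the ground truth annotations.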


License

This code is made available under the GPLv3 License and is available for non-commercial academic purposes.

Citation

If you find our work useful in your research, or if you use parts of this code, please consider citing our paper.

@article{yu2023prototypical,
  title={Prototypical multiple instance learning for predicting lymph node metastasis of breast cancer from whole-slide pathological images},
  author={Yu, Jin-Gang and Wu, Zihao and Ming, Yu and Deng, Shule and Li, Yuanqing and Ou, Caifeng and He, Chunjiang and Wang, Baiye and Zhang, Pusheng and Wang, Yu},
  journal={Medical Image Analysis},
  pages={102748},
  year={2023},
  publisher={Elsevier}
}