Boomwwe / SOTA_MSI_prediction

GNU General Public License v3.0

SOTA_MSI_prediction


We developed an efficient workflow for predicting key biomarkers in colorectal cancer (CRC) from H&E-stained images: microsatellite instability (MSI), hypermutation, chromosomal instability, CpG island methylator phenotype, and BRAF and TP53 mutations. The workflow requires relatively small datasets yet achieves state-of-the-art (SOTA) predictive performance.



The original code for the tile-generation step is from the Kather lab. We modified it to make generating tiles easier.

$ python extractTiles.py -s slide_path -o out_path -ps pic_save_path
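To illustrate what tile generation involves, the sketch below enumerates the top-left corners of non-overlapping tiles that fit fully inside a slide. It is not the logic of `extractTiles.py`: the function name `tile_origins`, the default tile size of 512 px, and the choice to drop partial edge tiles are all assumptions made for this example.

```python
def tile_origins(slide_width, slide_height, tile_size=512):
    """Return the (x, y) top-left corner of every full, non-overlapping tile.

    tile_size=512 is a hypothetical default, not a value taken from the repository.
    Tiles that would extend past the slide border are skipped.
    """
    return [
        (x, y)
        for y in range(0, slide_height - tile_size + 1, tile_size)
        for x in range(0, slide_width - tile_size + 1, tile_size)
    ]


# A 2000 x 1100 px region holds 3 columns and 2 rows of 512 px tiles.
origins = tile_origins(slide_width=2000, slide_height=1100, tile_size=512)
print(len(origins))  # 6
```

Each origin could then be passed to a slide reader to crop the corresponding region.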

Pre-training_model

A Swin-T (tiny Swin Transformer) model was pre-trained as a multiclass tissue classifier. The classifier was trained and tested on two publicly available, pathologist-annotated datasets from Kather et al.: NCT-CRC-HE-100K and CRC-VAL-HE-7K. These datasets consist of CRC image tiles of nine tissue types: adipose tissue (ADI), background (BACK), debris (DEB), lymphocytes (LYM), mucus (MUC), smooth muscle (MUS), normal colon mucosa (NORM), cancer-associated stroma (STR), and colorectal adenocarcinoma epithelium (TUM).

$ python Pretrain.py -tr train_dir -te test_dir -sp save_path
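As a rough sketch of how the nine tissue types can be turned into training labels, the snippet below indexes a dataset directory into `(path, label)` pairs. It assumes that `train_dir` and `test_dir` contain one subfolder per tissue class (for example `train_dir/TUM/...`). The helper `index_tiles`, the alphabetical label order, and the accepted file extensions are choices made for this illustration and may differ from what `Pretrain.py` does.

```python
from pathlib import Path

# The nine tissue classes named above; the index order is an assumption.
CLASSES = ("ADI", "BACK", "DEB", "LYM", "MUC", "MUS", "NORM", "STR", "TUM")


def index_tiles(root, extensions=(".tif", ".png", ".jpg")):
    """Return (path, label_index) pairs for tiles stored as root/<CLASS>/<file>."""
    samples = []
    for label, name in enumerate(CLASSES):
        class_dir = Path(root) / name
        if not class_dir.is_dir():
            continue  # tolerate datasets that lack some classes
        for path in sorted(class_dir.iterdir()):
            if path.suffix.lower() in extensions:
                samples.append((path, label))
    return samples
```

The resulting list can back any dataset wrapper that loads the image at `path` and returns it together with its integer label.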

Color_normalization

The image tiles were color-normalized using Macenko's method to reduce color bias and improve classifier performance. They were then resized to 224×224 px to serve as the network input. The original code for this step is from Li et al.

$ python color_normalize.py -i input_dir -o output_dir
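Macenko's method works in optical density (OD) space rather than on raw RGB values. The sketch below shows only that first step: converting a pixel to OD, inverting the conversion, and discarding near-transparent background pixels. It is not the code of `color_normalize.py`, and it omits the stain-vector estimation that follows. The function names, the white level `io=255`, and the threshold `beta=0.15` are assumptions for this example.

```python
import math


def rgb_to_od(pixel, io=255):
    """Convert one RGB pixel (0-255 per channel) to optical density per channel."""
    # Clamp to 1 so that fully dark channels do not produce log(0).
    return tuple(-math.log(max(c, 1) / io) for c in pixel)


def od_to_rgb(od, io=255):
    """Invert rgb_to_od, rounding back to integer intensities."""
    return tuple(min(255, round(io * math.exp(-d))) for d in od)


def is_tissue(pixel, beta=0.15, io=255):
    """Keep a pixel only if every channel absorbs at least `beta` in OD units."""
    return all(d >= beta for d in rgb_to_od(pixel, io))


print(is_tissue((255, 255, 255)))  # False: white background has zero OD
print(is_tissue((120, 60, 150)))   # True: a stained, purple-ish pixel
```

In a full implementation, the retained OD vectors are used to estimate the hematoxylin and eosin stain directions before the tile is mapped to a reference appearance.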

Tumor_selection

The pre-trained tissue classifier was used to detect and select tiles containing tumor tissue.

$ python select_tumor.py -i input_dir -o output_dir -mp model_path
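The selection step can be sketched as a filter over the classifier's per-tile outputs: keep a tile when its most probable class is TUM. The function `select_tumor_tiles`, the class order, and the optional `min_confidence` cutoff are hypothetical; the actual criterion used by `select_tumor.py` is not stated here.

```python
# Class order is an assumption; it must match the order used during training.
CLASSES = ("ADI", "BACK", "DEB", "LYM", "MUC", "MUS", "NORM", "STR", "TUM")


def select_tumor_tiles(tile_probs, min_confidence=0.0):
    """Return names of tiles whose top predicted class is TUM.

    tile_probs maps a tile name to a list of nine class probabilities.
    """
    tum = CLASSES.index("TUM")
    selected = []
    for name, probs in tile_probs.items():
        predicted = max(range(len(probs)), key=probs.__getitem__)
        if predicted == tum and probs[tum] >= min_confidence:
            selected.append(name)
    return selected


example = {
    "tile_001.png": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.25, 0.75],
    "tile_002.png": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.75, 0.25],
}
print(select_tumor_tiles(example))  # ['tile_001.png']
```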

Training_model

The pre-trained Swin-T model (tissue classifier) was fine-tuned for binary classification of key CRC biomarkers at the patient (slide) level.

$ python training.py -cv cv_dir -pp pic_dir -lp label_path -sp save_path
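One common way to obtain a patient-level prediction from a tile-level model is to average the tile probabilities for each patient and then apply a threshold. The sketch below shows that idea. Mean pooling, the 0.5 threshold, and the name `aggregate_by_patient` are assumptions for illustration, since the aggregation rule used by `training.py` is not described here.

```python
from collections import defaultdict


def aggregate_by_patient(tile_predictions, threshold=0.5):
    """Average tile probabilities per patient and apply a decision threshold.

    tile_predictions is an iterable of (patient_id, probability) pairs.
    Returns {patient_id: (mean_probability, is_positive)}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for patient_id, prob in tile_predictions:
        sums[patient_id] += prob
        counts[patient_id] += 1
    results = {}
    for patient_id, total in sums.items():
        mean = total / counts[patient_id]
        results[patient_id] = (mean, mean >= threshold)
    return results


tiles = [("P1", 0.75), ("P1", 0.25), ("P2", 0.125)]
print(aggregate_by_patient(tiles))
# {'P1': (0.5, True), 'P2': (0.125, False)}
```

Patient-level metrics such as AUROC would then be computed on the mean probabilities rather than on individual tiles.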

Visualization

The interpretability of the Swin-T models was explored through visualization with the Python package pytorch-grad-cam.

Citation

If you use this code for research, please cite our paper. Here is an example BibTeX entry:

@article{guo2023predicting,
  title={Predicting microsatellite instability and key biomarkers in colorectal cancer from H\&E-stained images: achieving state-of-the-art predictive performance with fewer data using Swin Transformer},
  author={Guo, Bangwei and Li, Xingyu and Yang, Miaomiao and Jonnagaddala, Jitendra and Zhang, Hong and Xu, Xu Steven},
  journal={The Journal of Pathology: Clinical Research},
  volume={9},
  number={3},
  pages={223--235},
  year={2023},
  publisher={Wiley Online Library}
}