dddavid4real / HistGen

[MICCAI 2024] Official Repo of "HistGen: Histopathology Report Generation via Local-Global Feature Encoding and Cross-modal Context Interaction"
Apache License 2.0

Dataset, model weight, source code for paper "HistGen: Histopathology Report Generation via Local-Global Feature Encoding and Cross-modal Context Interaction"

We are glad to announce that our paper has been accepted to MICCAI 2024!

This repo contains the dataset, model weights, and source code for the paper "HistGen: Histopathology Report Generation via Local-Global Feature Encoding and Cross-modal Context Interaction". Only PyTorch is supported for now. See our paper for a detailed description of HistGen.

HistGen: Histopathology Report Generation via Local-Global Feature Encoding and Cross-modal Context Interaction
Zhengrui Guo, Jiabo Ma, Yingxue Xu, Yihui Wang, Liansheng Wang, and Hao Chen
Paper: https://arxiv.org/abs/2403.05396

Highlight of our work


Overview of the proposed HistGen framework: (a) local-global hierarchical encoder module, (b) cross-modal context module, (c) decoder module, (d) transfer learning strategy for cancer diagnosis and prognosis.


Follow these instructions to create the conda environment and install the necessary packages:

git clone https://github.com/dddavid4real/HistGen.git
cd HistGen
conda env create -f requirements.yml

HistGen WSI-report dataset

Our curated dataset can be downloaded from here. For the original WSIs, please download them from the TCGA Data Portal using the case ids in the annotation file.
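If the portal search expects a case-level barcode rather than the full slide identifier, the first three dash-separated fields of a TCGA slide id form that barcode. The helper below is a minimal sketch of that extraction; the function name `case_barcode` is hypothetical and is not part of this repo.

```python
def case_barcode(slide_id: str) -> str:
    """Return the case-level TCGA barcode (TCGA-XX-XXXX) of a slide id.

    Hypothetical helper for illustration; not part of the HistGen code base.
    """
    parts = slide_id.split("-")
    if len(parts) < 3 or parts[0] != "TCGA":
        raise ValueError(f"not a TCGA identifier: {slide_id!r}")
    # Project, tissue source site, and participant make up the case barcode.
    return "-".join(parts[:3])


# Example id taken from the annotation file excerpt in this README.
slide_id = "TCGA-A7-A6VW-01Z-00-DX1.1BC4790C-DB45-4A3D-9C97-92C92C03FF60"
print(case_barcode(slide_id))  # TCGA-A7-A6VW
```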

The structure of this folder is as follows:

HistGen WSI-report dataset/
|-- WSIs
|    |-- slide_1.svs
|    |-- slide_2.svs
|    ╵-- ...
|-- dinov2_vitl
|    |-- slide_1.pt
|    |-- slide_2.pt
|    ╵-- ...
╵-- annotation.json

in which WSIs holds the original WSI data from TCGA, dinov2_vitl holds the features of the original WSIs extracted by our pre-trained DINOv2 ViT-L backbone, and annotation.json contains the diagnostic reports and the case ids of their corresponding WSIs. Concretely, annotation.json is structured as follows:

{
    "train": [
        {
            "id": "TCGA-A7-A6VW-01Z-00-DX1.1BC4790C-DB45-4A3D-9C97-92C92C03FF60",
            "report": "Final Surgical Pathology Report Procedure: Diagnosis A. Sentinel lymph node, left axilla ...",
            "image_path": ["..."],
            "split": "train"
        },
        ...
    ],
    "val": [
        {
            "id": "...",
            "report": "...",
            "image_path": ["..."],
            "split": "val"
        },
        ...
    ],
    "test": [
        {
            "id": "...",
            "report": "...",
            "image_path": ["..."],
            "split": "test"
        },
        ...
    ]
}

The data is already split into train/val/test subsets with a ratio of 8:1:1. In each entry, "id" denotes the case id of the report's corresponding WSI, "report" is the full refined text obtained from our proposed report cleaning pipeline, and "image_path" can be ignored.
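As a quick sanity check on a downloaded copy, the split sizes and the per-entry "split" field can be inspected with a few lines of Python. This is a minimal sketch that assumes the file is a JSON object mapping "train", "val", and "test" to lists of entries as described above; the function names are hypothetical and not part of this repo.

```python
import json


def load_annotation(path):
    """Load annotation.json into a dict of split name -> list of entries."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def summarize_splits(annotation):
    """Return {split: (count, fraction of all reports)} for each split."""
    total = sum(len(entries) for entries in annotation.values())
    return {
        name: (len(entries), len(entries) / total if total else 0.0)
        for name, entries in annotation.items()
    }


def mismatched_split_ids(annotation):
    """Return ids whose "split" field disagrees with the list they are stored in."""
    return [
        entry["id"]
        for name, entries in annotation.items()
        for entry in entries
        if entry.get("split") != name
    ]
```

For example, `summarize_splits(load_annotation("annotation.json"))` should give fractions close to 0.8, 0.1, and 0.1 if the file follows the 8:1:1 split.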

To reproduce our proposed HistGen model, please download the dinov2_vitl directory together with annotation.json.

Preprocessing and Feature Extraction with Pre-trained DINOv2 ViT-L

WSI Preprocessing

In this work, we adopted and further accelerated CLAM for preprocessing and feature extraction. A minimal viable version of CLAM is included in this repo. For installation, we recommend following the original instructions here. To run preprocessing, use the following commands:

cd HistGen
conda activate clam
sh patching_scripts/tcga-wsi-report.sh

Feature Extraction

To extract features of WSIs, please run the following commands:

cd HistGen
conda activate clam
sh extract_scripts/tcga-wsi-report.sh

in which we provide the ImageNet-pretrained ResNet, CTransPath, PLIP, and our pre-trained DINOv2 ViT-L feature extractors. Note that CTransPath requires a specific timm environment; see here for more info.

🌟If Git LFS fails, please download the model checkpoint of our pre-trained DINOv2 feature extractor from this link. After downloading, put it under HistGen/CLAM/models/ckpts/ .

HistGen WSI Report Generation Model


To try our model for training, validation, and testing, simply run the following commands:

cd HistGen
conda activate histgen
sh train_wsi_report.sh

Before running the script, set the paths and other hyperparameters in train_wsi_report.sh. Note that --image_dir should be the path to the dinov2_vitl directory, and --ann_path should be the path to the annotation.json file.
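Before launching a long training run, it can help to confirm that the two paths agree with each other. The sketch below lists annotation ids that have no matching feature file. It assumes each feature file is named `<id>.pt` inside the dinov2_vitl directory, which is an assumption based on the folder layout shown earlier, and the function name `find_missing_features` is hypothetical rather than part of this repo.

```python
import json
from pathlib import Path


def find_missing_features(ann_path, image_dir, suffix=".pt"):
    """Map each split to the ids that have no '<id><suffix>' file in image_dir.

    Assumes feature files are named after the "id" field (not confirmed by the repo).
    """
    with open(ann_path, encoding="utf-8") as f:
        annotation = json.load(f)
    # Strip only the trailing suffix, since TCGA ids themselves contain dots.
    available = {p.name[: -len(suffix)] for p in Path(image_dir).glob(f"*{suffix}")}
    return {
        split: [entry["id"] for entry in entries if entry["id"] not in available]
        for split, entries in annotation.items()
    }
```

Calling it with the same values passed to --ann_path and --image_dir should return empty lists for every split when the download is complete.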


To generate reports for WSIs in the test set, run the following commands:

cd HistGen
conda activate histgen
sh test_wsi_report.sh

Similarly, remember to set the paths and other hyperparameters in test_wsi_report.sh.

Transfer to Downstream Tasks

In this paper, we treat WSI report generation as a form of vision-language pre-training, and we further fine-tune the pre-trained model on cancer subtyping and survival analysis tasks, following the strategy shown in subfigure (d) of the framework overview. For implementing the downstream tasks, we recommend using the EasyMIL repository, a flexible and easy-to-use toolbox for multiple instance learning (MIL) tasks developed by our team.

We are currently organizing the pre-trained checkpoints and merging HistGen into EasyMIL. Please stay tuned for the update.


License and Usage

If you find our work useful in your research, please consider citing our paper:

  title={HistGen: Histopathology Report Generation via Local-Global Feature Encoding and Cross-modal Context Interaction},
  author={Guo, Zhengrui and Ma, Jiabo and Xu, Yingxue and Wang, Yihui and Wang, Liansheng and Chen, Hao},
  journal={arXiv preprint arXiv:2403.05396},

This repo is made available under the Apache-2.0 License. For more details, please refer to the LICENSE file.