miccaiif / DGMIL

Official PyTorch implementation of our MICCAI 2022 paper: DGMIL: Distribution Guided Multiple Instance Learning for Whole Slide Image Classification.

How to define the negative slides in the TCGA dataset? #3

Closed by qyn0729 2 years ago

qyn0729 commented 2 years ago

Hi! Thanks for your great work! I noticed that there are no completely negative slides in the TCGA dataset; slides are only labeled as LUAD or LUSC. May I ask how you trained the "Pseudo Label-Based Feature Space Refinement" part with TCGA data? Thanks so much!

miccaiif commented 2 years ago

Hello! Thanks for your interest!

The TCGA Lung Cancer dataset contains a total of 1054 WSIs from The Cancer Genome Atlas (TCGA) Data Portal, covering two subtypes of lung cancer: Lung Adenocarcinoma (LUAD) and Lung Squamous Cell Carcinoma (LUSC). Our goal is to accurately distinguish the two subtypes, so we treat this as a binary task in which LUAD WSIs are labeled as negative and LUSC WSIs are labeled as positive.
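
As a rough illustration of this labeling convention, here is a minimal sketch of how subtype names could be mapped to bag labels. The names `SUBTYPE_TO_LABEL`, `slide_label`, and `split_by_label`, as well as the slide IDs, are hypothetical and are not taken from the DGMIL or DSMIL code.

```python
# Hypothetical helper names; this is not code from the DGMIL repository.
# Convention described above: LUAD -> negative (0), LUSC -> positive (1).
SUBTYPE_TO_LABEL = {"LUAD": 0, "LUSC": 1}


def slide_label(subtype: str) -> int:
    """Return the binary bag label for a TCGA lung cancer subtype."""
    key = subtype.strip().upper()
    if key not in SUBTYPE_TO_LABEL:
        raise ValueError(f"Unknown subtype: {subtype!r}")
    return SUBTYPE_TO_LABEL[key]


def split_by_label(slides):
    """Split (slide_id, subtype) pairs into negative and positive slide IDs."""
    negatives, positives = [], []
    for slide_id, subtype in slides:
        if slide_label(subtype) == 1:
            positives.append(slide_id)
        else:
            negatives.append(slide_id)
    return negatives, positives


if __name__ == "__main__":
    # Made-up slide IDs, for illustration only.
    demo = [("slide_a", "LUAD"), ("slide_b", "LUSC"), ("slide_c", "LUAD")]
    neg, pos = split_by_label(demo)
    print("negative:", neg)
    print("positive:", pos)
```

Under this convention, the LUAD slides would take the place that negative slides have in a tumor-vs-normal dataset.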

The experimental settings follow DSMIL [1]; please refer to that paper for further details.

[1] Li B, Li Y, Eliceiri K W. Dual-stream multiple instance learning network for whole slide image classification with self-supervised contrastive learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 14318-14328.