Open wk5475 opened 1 year ago
Hi @wk5475, I have the same question. I found some description in the supplementary materials of this paper:
TCGA Lung Cancer. Lung Adenocarcinoma (LUAD) and Lung Squamous Cell Carcinoma (LUSC) are two sub-type of cancers in the TCGA lung cancer dataset, with 534 LUAD and 512 LUSC slides, respectively. There are only slide-level labels available for this dataset. Compared to CAMELYON-16, tumor regions in tumor slides are significantly larger in this dataset.
LUAD and LUSC can be downloaded directly from the TCGA website (e.g., https://portal.gdc.cancer.gov/projects/TCGA-LUSC). The problem is that the TCGA results reported in this paper seem quite different from those in another work, https://arxiv.org/pdf/2301.08125.pdf (though this may be due to a different dataset split).
Could you please provide more details on generating TCGA datasets?
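In case it helps when comparing numbers across papers, here is a minimal sketch of a seeded, class-stratified train/test split over slide IDs. This is only my assumption of what a reproducible split could look like, not the authors' actual procedure: the `stratified_split` helper, the 80/20 ratio, the seed, and the placeholder slide IDs are all hypothetical. Only the slide counts (534 LUAD, 512 LUSC) come from the supplementary text quoted above.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split slide IDs into train/test lists, keeping the class ratio.

    `labels` maps slide ID -> class name. Hypothetical helper, not taken
    from the paper's code.
    """
    # Group slide IDs by their slide-level label.
    by_class = defaultdict(list)
    for slide_id, label in labels.items():
        by_class[label].append(slide_id)

    rng = random.Random(seed)  # fixed seed so the split can be reproduced
    train, test = [], []
    for label in sorted(by_class):
        ids = sorted(by_class[label])  # sort first so input order does not matter
        rng.shuffle(ids)
        n_test = round(len(ids) * test_fraction)
        test.extend(ids[:n_test])
        train.extend(ids[n_test:])
    return sorted(train), sorted(test)


# Placeholder IDs (assumption); real IDs would come from the downloaded slides.
labels = {f"LUAD_{i:04d}": "LUAD" for i in range(534)}
labels.update({f"LUSC_{i:04d}": "LUSC" for i in range(512)})

train_ids, test_ids = stratified_split(labels, test_fraction=0.2, seed=0)
print(len(train_ids), len(test_ids))
```

Publishing the resulting ID lists (or the seed and ratio) alongside the results would make it easier to tell whether a gap between two papers comes from the split or from the method.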