DearCaat / MHIM-MIL

[ICCV 2023 Oral] Multiple Instance Learning Framework with Masked Hard Instance Mining for Whole Slide Image Classification

Filenames of TCGA Lung Cancer #6

Closed weiaicunzai closed 6 months ago

weiaicunzai commented 6 months ago

Thanks for your great work, and sorry to bother you. Since you have uploaded the filenames and labels for the CAM16 dataset, could you please also upload the TCGA Lung Cancer filenames and the corresponding labels you used? Thanks in advance.

DearCaat commented 6 months ago

Thank you for your attention to my work. I am happy to answer these questions and discuss the details. The TCGA-NSCLC dataset is used for sub-typing TCGA-LUAD versus TCGA-LUSC. These are two separate sub-projects in TCGA, so you only need to download the data of these two sub-projects; the images of each sub-project are kept separately. Therefore, I think a label file may not be needed.
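For readers who still want an explicit label table, the idea above (the label follows from the sub-project a slide was downloaded from) can be sketched as below. This is only an illustration: the folder layout `<root>/TCGA-LUAD` and `<root>/TCGA-LUSC`, the `.svs` extension, the 0/1 label encoding, and the function name `build_label_table` are all assumptions, not something taken from this repository.

```python
from pathlib import Path

# Assumed (hypothetical) layout: <root>/TCGA-LUAD/... and <root>/TCGA-LUSC/...
# The 0/1 encoding is arbitrary and chosen only for this sketch.
LABELS = {"TCGA-LUAD": 0, "TCGA-LUSC": 1}


def build_label_table(root):
    """Return (slide_id, label) pairs, with the label inferred from the sub-project folder."""
    rows = []
    for project, label in LABELS.items():
        # rglob also finds slides nested in per-file download subfolders.
        for slide in sorted(Path(root, project).rglob("*.svs")):
            rows.append((slide.stem, label))
    return rows
```

Calling `build_label_table("/path/to/tcga_nsclc")` would then give one row per slide file, which can be written to a CSV if a label file is preferred.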

weiaicunzai commented 6 months ago

Thanks, but if I'm correct, the case and slide counts are different from those in the paper. I'm guessing this is due to the continuous updating of the project? For example, the TCGA-LUAD dataset on the website you pointed to has 514 cases and 1067 slides, which differs from the numbers stated in your paper, and I quote: "There are diagnostic slides, LUAD with 541 slides from 478 cases,".

weiaicunzai commented 6 months ago

Sorry, should I use the Diagnostic Slide category instead? It seems I overlooked the Diagnostic Slide filter. My bad.
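In case it helps others who hit the same mismatch, here is a rough sketch of how the diagnostic slides could be separated from the rest by filename and then counted. It assumes the usual TCGA slide barcode convention, in which diagnostic slides carry a `-DX` code (for example `...-01Z-00-DX1`) and the first three barcode fields identify the case; that convention and the helper names are my assumptions, so please check them against your own download.

```python
def is_diagnostic(filename):
    """Assumption: diagnostic slides have a '-DX' slide code in the barcode part."""
    barcode = filename.split(".")[0]  # drop the UUID and extension after the first dot
    return "-DX" in barcode


def case_id(filename):
    """Assumption: the first three barcode fields (TCGA-<site>-<participant>) name the case."""
    return "-".join(filename.split("-")[:3])


def summarize(filenames):
    """Return (number of diagnostic slides, number of distinct cases among them)."""
    diagnostic = [f for f in filenames if is_diagnostic(f)]
    return len(diagnostic), len({case_id(f) for f in diagnostic})
```

Running `summarize` on the list of downloaded filenames gives slide and case counts that can be compared with the numbers quoted from the paper.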

DearCaat commented 6 months ago

I have uploaded the TCGA-NSCLC label file. See file.