HHHedo / IBMIL

CVPR 2023 Highlight

TCGA Dataset Training and Testing Distributions #16

Closed: bryanwong17 closed this issue 8 months ago

bryanwong17 commented 8 months ago

Hi, could you please share with me the distribution of slides used for training and testing in the TCGA dataset, along with their respective labels? From my understanding, the patches were obtained from the DSMIL authors. Is it correct that you followed their approach to create the distributions?

I noticed that it's mentioned there are a total of 836 training slides and 210 testing slides. However, upon examining the TEST_ID.csv file from this link, I observed that there are 214 testing slides. Could you provide clarification on this? And also which slides are used for training? Thank you!
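A count mismatch like the one above can be checked directly against the ID list. The following is only a minimal sketch: the single-column CSV layout, the helper names `read_slide_ids` and `derive_train_ids`, and the slide IDs are all hypothetical, and it assumes that every slide not listed as a test slide is a training slide, which this thread does not confirm.

```python
import csv
import io


def read_slide_ids(csv_file, column=0, has_header=False):
    """Return the unique, non-empty slide IDs found in one column of a CSV file object."""
    reader = csv.reader(csv_file)
    if has_header:
        next(reader, None)  # skip the header row if the file has one
    ids = set()
    for row in reader:
        if row and row[column].strip():
            ids.add(row[column].strip())
    return ids


def derive_train_ids(all_ids, test_ids):
    """Assumption: any slide that is not a test slide belongs to the training set."""
    return set(all_ids) - set(test_ids)


# Hypothetical stand-in data; replace with the real slide list and TEST_ID.csv.
all_ids = {"slide_a", "slide_b", "slide_c", "slide_d", "slide_e"}
test_csv = io.StringIO("slide_b\nslide_e\nslide_e\n")  # note the repeated ID

test_ids = read_slide_ids(test_csv)
train_ids = derive_train_ids(all_ids, test_ids)
print(len(test_ids), "test slides;", len(train_ids), "training slides")
```

Because the IDs are collected into a set, repeated rows are counted once, so comparing the number of unique IDs with the raw row count of the file would show whether duplicates account for any difference.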

HHHedo commented 8 months ago

I created the training and test slide sets by following the DSMIL code, and the results are here.

bryanwong17 commented 8 months ago

Thank you for your reply. But the results here contain all training and testing slides. Could you let me know which slides are used for training/testing?

HHHedo commented 8 months ago

Please refer to the code.

bryanwong17 commented 8 months ago

Thank you!

bryanwong17 commented 8 months ago

Just wanted to double-check: Does the label 0 represent LUAD, and 1 represent LUSC in this?