Closed lingxitong closed 8 months ago
Thanks for your attention. The TCGA split is done at the case level, following TransMIL and DTFD-MIL. The TCGA label.csv lists case IDs rather than slide IDs, so I can ensure that slides from the same case do not appear in different subsets. See: code for splitting the dataset, code for loading slide features.
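For readers who want to reproduce this kind of split, here is a minimal sketch of a case-level split. It is not the repository's actual code. The helper names (`split_cases`, `case_of`, `assign_slides`) are hypothetical, the 65:10:25 default is the ratio mentioned elsewhere in this thread, and the code assumes each slide name starts with the 12-character TCGA case barcode (for example `TCGA-AA-0001`).

```python
import random


def split_cases(case_ids, ratios=(0.65, 0.10, 0.25), seed=0):
    """Shuffle the unique case IDs and cut them into train/val/test."""
    cases = sorted(set(case_ids))          # sort first so the seed is reproducible
    random.Random(seed).shuffle(cases)
    n_train = round(len(cases) * ratios[0])
    n_val = round(len(cases) * ratios[1])
    return {
        "train": cases[:n_train],
        "val": cases[n_train:n_train + n_val],
        "test": cases[n_train + n_val:],   # remainder goes to test
    }


def case_of(slide_id):
    # Assumption: the first 12 characters of a slide name are the case barcode.
    return slide_id[:12]


def assign_slides(slide_ids, split):
    """Place every slide in the subset that its case was assigned to."""
    case_to_subset = {c: name for name, cases in split.items() for c in cases}
    subsets = {name: [] for name in split}
    for slide in slide_ids:
        subsets[case_to_subset[case_of(slide)]].append(slide)
    return subsets


# Toy data (hypothetical IDs): 20 cases, every fourth case has two slides.
cases = [f"TCGA-AA-{i:04d}" for i in range(20)]
slides = [f"{c}-01Z-00-DX1" for c in cases]
slides += [f"{c}-01Z-00-DX2" for c in cases[::4]]

split = split_cases(cases)
slide_split = assign_slides(slides, split)
```

Because the shuffle is applied to cases and slides are assigned afterwards, all slides of one case always land in the same subset.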
Thanks for your quick reply!
Hi @DearCaat, the split ratio is 65:10:25 at the case level. Does the same ratio hold at the file level, or is the file-level ratio not taken into consideration?
Hi @akidway, the file-level ratio is not taken into consideration. Some cases have only one file, while others have several.
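As a small illustration of why the file-level ratio can drift from the case-level one, the sketch below counts slides per subset for a toy split. The function `slide_ratios` and the case names and slide counts are made up for the example and do not come from the repository or from TCGA.

```python
def slide_ratios(split, slides_per_case):
    """Return the fraction of all slides that falls into each subset."""
    counts = {
        name: sum(slides_per_case[c] for c in cases)
        for name, cases in split.items()
    }
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


# Hypothetical split: 2 / 1 / 1 cases, i.e. 50:25:25 at the case level.
split = {"train": ["A", "B"], "val": ["C"], "test": ["D"]}
# Hypothetical slide counts: case C has three slides, the others one each.
slides_per_case = {"A": 1, "B": 1, "C": 3, "D": 1}

ratios = slide_ratios(split, slides_per_case)
```

Here the validation subset holds one quarter of the cases but half of the slides, because its single case contributes three files.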
Got it. Thank you for the quick reply!
Hello, nice work! I have a question: for the TCGA dataset, slides and cases are not one-to-one, so is the 65:10:25 split done by slide or by case? Do we need to ensure that slides from the same case do not appear in different subsets (train, val, test)?