DIAGNijmegen / pathology-he-auto-augment

H&E tailored Randaugment: automatic data augmentation policy selection for H&E-stained histopathology.
Apache License 2.0

Could you provide the dataset you used in your paper to reproduce the results? #1

Closed Lotus-95 closed 2 years ago

Lotus-95 commented 2 years ago

Thanks for the great work. The code does not include the dataset used in your experiments. Could you provide it?

KhrystynaFaryna commented 2 years ago

Hi, in this paper we used the Camelyon17 dataset. It is a public dataset that can be downloaded from https://camelyon17.grand-challenge.org/Data/ . In this study we used only the cases that have lesion-level annotations (10 training slides from every medical centre).

Lotus-95 commented 2 years ago

Could you provide the patches you cropped from Camelyon17, or the code used to crop them?

KhrystynaFaryna commented 2 years ago

We use ASAP to crop patches. Here is our data split. The paper describes in more detail how we arranged the data.

    #RUMC
    rumc_split_patients = {
        'training': ['patient_060_node_3', 'patient_066_node_2', 'patient_073_node_1', 'patient_072_node_0'],
        'validation': ['patient_064_node_0', 'patient_061_node_4', 'patient_075_node_4'],
        'test': ['patient_062_node_2', 'patient_067_node_4', 'patient_068_node_1']
    }

    #CWH
    cwh_split_patients = {
        'test': ['patient_004_node_4', 'patient_009_node_1', 'patient_010_node_4',
               'patient_012_node_0', 'patient_015_node_1', 'patient_015_node_2',
               'patient_016_node_1', 'patient_017_node_1', 'patient_017_node_2',
               'patient_017_node_4']
    }

    #RH
    rh_split_patients = {
        'test': ['patient_020_node_2', 'patient_020_node_4', 'patient_021_node_3',
               'patient_022_node_4', 'patient_024_node_1', 'patient_024_node_2',
               'patient_034_node_3', 'patient_036_node_3', 'patient_038_node_2',
               'patient_039_node_1']
    }

    #UMCU
    umcu_split_patients = {
        'test': ['patient_040_node_2', 'patient_041_node_0', 'patient_042_node_3',
               'patient_044_node_4', 'patient_045_node_1', 'patient_046_node_3',
               'patient_046_node_4', 'patient_048_node_1', 'patient_051_node_2',
               'patient_052_node_1']
    }

    #LPE
    lpe_split_patients = {
        'test': ['patient_080_node_1', 'patient_081_node_4', 'patient_086_node_0',
               'patient_086_node_4', 'patient_087_node_0', 'patient_088_node_1',
               'patient_089_node_3', 'patient_092_node_1', 'patient_096_node_0',
               'patient_099_node_4']
    }

I uploaded some utility scripts here; maybe they will be helpful. You will need to install ASAP to use these scripts.
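
For anyone re-creating the split, a quick sanity check before cropping patches may help. The sketch below is not part of the repository or its utility scripts. It gathers the five dictionaries above into one structure and checks that each centre contributes 10 slides and that no slide appears twice. The names `SPLITS`, `summarize`, `find_duplicates` and `slide_filenames` are hypothetical, and the `.tif` extension is an assumption about how the downloaded slides are named.

```python
from collections import Counter

# Slide IDs copied from the split posted above, grouped by medical centre.
SPLITS = {
    "RUMC": {
        "training": ["patient_060_node_3", "patient_066_node_2",
                     "patient_073_node_1", "patient_072_node_0"],
        "validation": ["patient_064_node_0", "patient_061_node_4",
                       "patient_075_node_4"],
        "test": ["patient_062_node_2", "patient_067_node_4",
                 "patient_068_node_1"],
    },
    "CWH": {"test": [
        "patient_004_node_4", "patient_009_node_1", "patient_010_node_4",
        "patient_012_node_0", "patient_015_node_1", "patient_015_node_2",
        "patient_016_node_1", "patient_017_node_1", "patient_017_node_2",
        "patient_017_node_4"]},
    "RH": {"test": [
        "patient_020_node_2", "patient_020_node_4", "patient_021_node_3",
        "patient_022_node_4", "patient_024_node_1", "patient_024_node_2",
        "patient_034_node_3", "patient_036_node_3", "patient_038_node_2",
        "patient_039_node_1"]},
    "UMCU": {"test": [
        "patient_040_node_2", "patient_041_node_0", "patient_042_node_3",
        "patient_044_node_4", "patient_045_node_1", "patient_046_node_3",
        "patient_046_node_4", "patient_048_node_1", "patient_051_node_2",
        "patient_052_node_1"]},
    "LPE": {"test": [
        "patient_080_node_1", "patient_081_node_4", "patient_086_node_0",
        "patient_086_node_4", "patient_087_node_0", "patient_088_node_1",
        "patient_089_node_3", "patient_092_node_1", "patient_096_node_0",
        "patient_099_node_4"]},
}


def summarize(splits):
    """Return {centre: {subset: number_of_slides}}."""
    return {centre: {subset: len(ids) for subset, ids in subsets.items()}
            for centre, subsets in splits.items()}


def find_duplicates(splits):
    """Return a sorted list of slide IDs that occur more than once."""
    counts = Counter(slide
                     for subsets in splits.values()
                     for ids in subsets.values()
                     for slide in ids)
    return sorted(slide for slide, n in counts.items() if n > 1)


def slide_filenames(splits, centre, subset, extension=".tif"):
    """Build file names for one centre/subset (extension is an assumption)."""
    return [slide_id + extension for slide_id in splits[centre].get(subset, [])]


if __name__ == "__main__":
    for centre, counts in summarize(SPLITS).items():
        print(centre, counts, "total:", sum(counts.values()))
    print("duplicates:", find_duplicates(SPLITS))
```

Only RUMC has training and validation subsets here; the other four centres are test-only, so `slide_filenames(SPLITS, "CWH", "training")` returns an empty list rather than raising an error.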