lin-lcx / H2-MIL

About Dataset #1

Closed guolalala closed 1 year ago

guolalala commented 2 years ago

Hi, thank you for your work. I tried but could not find a way to download the datasets (ESCA and KICA) you used. Could you share the data with me? It would be very helpful, thank you!

Bontempogianpaolo1 commented 1 year ago

Hi! I have the same issue! @lin-lcx, could you give us more instructions to replicate your work? Thank you!

lin-lcx commented 1 year ago

The dataset comes from TCGA (https://portal.gdc.cancer.gov/repository?facetTab=files&filters=%7B%22op%22%3A%22and%22%2C%22content%22%3A%5B%7B%22op%22%3A%22in%22%2C%22content%22%3A%7B%22field%22%3A%22cases.project.program.name%22%2C%22value%22%3A%5B%22TCGA%22%5D%7D%7D%2C%7B%22op%22%3A%22in%22%2C%22content%22%3A%7B%22field%22%3A%22cases.project.project_id%22%2C%22value%22%3A%5B%22TCGA-ESCA%22%5D%7D%7D%2C%7B%22op%22%3A%22in%22%2C%22content%22%3A%7B%22field%22%3A%22files.data_type%22%2C%22value%22%3A%5B%22Slide%20Image%22%5D%7D%7D%5D%7D).

You can download the data with the gdc-client tool provided on the official TCGA website. I have also uploaded it to GitHub. Download the manifest file for the data you need from the official TCGA website, then run the command (gdc-client download -m gdc_manifest.txt -d out_directory). More detailed tutorials can be found online.
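For anyone adapting the link above to another project, here is a minimal Python sketch of how the portal filter and the download command fit together. It is not the author's script: the function names (`build_slide_filter`, `repository_url`, `download_command`) are hypothetical, and the filter fields are simply the ones decoded from the URL posted in this thread. The manifest itself still has to be downloaded from the portal by hand.

```python
import json
from urllib.parse import quote


def build_slide_filter(project_id):
    """Filter decoded from the URL in this thread: TCGA program,
    one project, slide images only. (Hypothetical helper name.)"""
    def clause(field, value):
        return {"op": "in", "content": {"field": field, "value": [value]}}

    return {
        "op": "and",
        "content": [
            clause("cases.project.program.name", "TCGA"),
            clause("cases.project.project_id", project_id),
            clause("files.data_type", "Slide Image"),
        ],
    }


def repository_url(project_id):
    """Percent-encode the filter as compact JSON and append it to the portal URL."""
    filters = json.dumps(build_slide_filter(project_id), separators=(",", ":"))
    base = "https://portal.gdc.cancer.gov/repository?facetTab=files&filters="
    return base + quote(filters, safe="")


def download_command(manifest="gdc_manifest.txt", out_dir="out_directory"):
    """Argument list for the gdc-client call quoted above,
    suitable for passing to subprocess.run."""
    return ["gdc-client", "download", "-m", manifest, "-d", out_dir]
```

Calling `repository_url("TCGA-ESCA")` should give a link equivalent to the one above. Passing a different project ID only changes the `cases.project.project_id` clause, and whether that project exists on the portal has to be checked there.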

Bontempogianpaolo1 commented 1 year ago

I think publishing the manifest directly can help reproduce the results more easily. Please consider this advice in the future. Thank you!

Bontempogianpaolo1 commented 1 year ago

Moreover, if I search for KICA on the GDC portal, nothing happens! @lin-lcx

JY9898 commented 1 year ago

Could you share the processed data, such as the graphs produced after preprocessing? That would be very helpful!