mahmoodlab / SurvPath

Modeling Dense Multimodal Interactions Between Biological Pathways and Histology for Survival Prediction - CVPR 2024

the folder structure of dataset #1

Closed by dawalisi 1 year ago

dawalisi commented 1 year ago

Dear authors,

I found your work on transcriptomics, histology, and multimodal fusion for classification tasks to be quite interesting. I would like to know more about the folder structure you used in your experiments. Specifically, I'm interested in understanding how you organized the different data modalities and their corresponding files.

Could you kindly provide some information or details regarding the folder structure you employed in your study? It would greatly help me in better understanding and replicating your experiments.

ajv012 commented 1 year ago

Hello,

Thanks for your interest in our work and apologies for the slow response.

All of the transcriptomics metadata is in datasets_csv/metadata/. We use pathway compositions from two sources, Reactome and MSigDB. The pathways we selected (based on the transcriptomic data available to us from TCGA) are listed in datasets_csv/metadata/*_signatures.csv: hallmarks_signatures.csv comes from MSigDB, xena_signatures.csv comes from Reactome, and combine_signatures.csv combines the two sources. For each TCGA disease model, we also provide the raw RNA-seq data for the genes used in the pathway analysis.
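For anyone who wants to inspect these files, here is a minimal sketch of how the signature CSVs could be read with the Python standard library. It is not code from the repository. The helper name `load_signatures` is hypothetical, and the sketch assumes a layout this thread does not confirm: each column header is a pathway name and the cells below it are gene symbols, with empty cells padding the shorter pathways. Adjust the parsing if the files are organized differently.

```python
import csv
from pathlib import Path


def load_signatures(metadata_dir):
    """Read every *_signatures.csv in metadata_dir (hypothetical helper).

    ASSUMPTION: each column header is a pathway name and the cells below it
    are gene symbols, with empty cells padding shorter pathways.
    Returns {file stem: {pathway name: [gene symbols]}}.
    """
    signatures = {}
    for csv_path in sorted(Path(metadata_dir).glob("*_signatures.csv")):
        with open(csv_path, newline="") as handle:
            reader = csv.DictReader(handle)
            # Skip unnamed columns (for example an exported index column).
            pathways = {name: [] for name in (reader.fieldnames or []) if name}
            for row in reader:
                for name in pathways:
                    gene = (row.get(name) or "").strip()
                    if gene:
                        pathways[name].append(gene)
        signatures[csv_path.stem] = pathways
    return signatures


if __name__ == "__main__":
    # Path taken from the reply above, relative to the repository root.
    metadata_dir = Path("datasets_csv/metadata")
    if metadata_dir.is_dir():
        for source, pathways in load_signatures(metadata_dir).items():
            print(f"{source}: {len(pathways)} pathways")
    else:
        print(f"{metadata_dir} not found; run this from the repository root")
```

Run from the repository root, this would print one line per signature file with the number of pathway columns it found, which is a quick way to check whether the assumed layout matches the actual files.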

I hope this helps in answering your question. Feel free to ask any more questions regarding the data, model, or results.