mdozmorov / TCGAsurvival

Scripts to analyze TCGA data
GNU General Public License v3.0

Scripts to extract TCGA data for survival analysis.

awesome-TCGA: a curated list of TCGA resources. For more cancer-related notes, see my Cancer_notes.

Data description

Scripts are being transitioned to use the curatedTCGAData and TCGAutils packages. See also cBioPortalData, an R interface to TCGA data and the cBioPortal API.
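As a rough illustration of what working with curatedTCGAData can look like, the sketch below lists the resources available for one cancer type without downloading anything. The study code "OV", the assay name "RNASeq2GeneNorm", and the data version "2.0.1" are illustrative assumptions, not settings taken from this repository's scripts; check the package documentation for the values valid in your installation.

```r
# Sketch only: assumes the Bioconductor package curatedTCGAData is installed
# and that ExperimentHub is reachable over the network.
disease_code <- "OV"               # TCGA study code, chosen for illustration
assay_name   <- "RNASeq2GeneNorm"  # assay name, also an illustrative choice
data_version <- "2.0.1"            # data release; an assumption, see the package docs

if (requireNamespace("curatedTCGAData", quietly = TRUE)) {
  # dry.run = TRUE only lists the matching resources; nothing is downloaded
  available <- tryCatch(
    curatedTCGAData::curatedTCGAData(
      diseaseCode = disease_code,
      assays      = assay_name,
      version     = data_version,
      dry.run     = TRUE
    ),
    error = function(e) {
      message("Could not query ExperimentHub: ", conditionMessage(e))
      NULL
    }
  )
  print(available)
} else {
  message("curatedTCGAData is not installed; install it from Bioconductor first.")
}
```

Setting dry.run = FALSE in the same call should return a MultiAssayExperiment object, whose clinical annotations can then be used for survival analysis.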

Paper

Ramos, Marcel, Ludwig Geistlinger, Sehyun Oh, Lucas Schiffer, Rimsha Azhar, Hanish Kodali, Ino de Bruijn et al. "Multiomic Integration of Public Oncology Databases in Bioconductor", JCO Clinical Cancer Informatics 1 (2020), https://doi.org/10.1200/cci.19.00119

Data preparation

First, get the data locally using the misc/TCGA_preprocessing.R script.

Analysis examples

Analysis scripts

Legacy analyses

Misc scripts

misc folder

TCGA data

data.TCGA folder. Some data are absent from the repository because of their large size; download them through the links provided.

OvarianCancerSubtypes

Sample annotations by ovarian cancer subtypes. https://github.com/aedin/OvarianCancerSubtypes

ProteinAtlas

Uhlen, Mathias, Cheng Zhang, Sunjae Lee, Evelina Sjöstedt, Linn Fagerberg, Gholamreza Bidkhori, Rui Benfeitas, et al. “A Pathology Atlas of the Human Cancer Transcriptome.” Science (New York, N.Y.) 357, no. 6352 (August 18, 2017). doi:10.1126/science.aan2507. http://science.sciencemag.org/content/357/6352/eaan2507

Supplementary material http://science.sciencemag.org/content/suppl/2017/08/16/357.6352.eaan2507.DC1

brca_mbcproject_wagle_2017

https://www.mbcproject.org/

The Metastatic Breast Cancer Project is a patient-driven initiative. This study includes genomic data, patient-reported data (pre-pended as PRD), medical record data (MedR), and pathology report data (PATH). All of the titles and descriptive text for the clinical data elements have been finalized in partnership with numerous patients in the project. As these data were generated in a research, not a clinical, laboratory, they are for research purposes only and cannot be used to inform clinical decision-making. All annotations have been de-identified. More information is available at www.mbcproject.org.

Data download: http://www.cbioportal.org/study?id=brca_mbcproject_wagle_2017#summary. The data include 78 patients, 103 samples, sample-specific clinical annotations, putative copy-number calls from GISTIC, and MutSig regions.
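Besides the web download, the study can in principle be fetched from R. The following is a minimal sketch that assumes the Bioconductor package cBioPortalData is installed; the run_download flag is a hypothetical guard added here so that nothing is downloaded by accident, and not every cBioPortal study is guaranteed to build cleanly with this function.

```r
# Sketch only: assumes the Bioconductor package cBioPortalData is installed.
# The study identifier is taken from the download URL above.
study_id     <- "brca_mbcproject_wagle_2017"
run_download <- FALSE  # hypothetical guard; set to TRUE to actually download

mbc <- NULL
if (run_download && requireNamespace("cBioPortalData", quietly = TRUE)) {
  # cBioDataPack() downloads the packaged study files and attempts to
  # assemble them into a MultiAssayExperiment object
  mbc <- tryCatch(
    cBioPortalData::cBioDataPack(study_id, ask = FALSE),
    error = function(e) {
      message("Download or import failed: ", conditionMessage(e))
      NULL
    }
  )
}

if (!is.null(mbc)) {
  print(mbc)  # summary of the assays and samples that were imported
}
```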

TCGA_Ovarian