Question about CNA data

waldronlab / curatedTCGAData

Curated Data From The Cancer Genome Atlas (TCGA) as MultiAssayExperiment Objects


Hi,

I would like to better understand what the CNA values are exactly and how they are transformed via simplifyTCGA() for a specific TCGA study. Is there documentation about these somewhere?

For example, check the following two matrices:

library(curatedTCGAData)
library(MultiAssayExperiment)  # for assay()
cancer_data = curatedTCGAData(diseaseCode = 'PAAD', assays = '*', version = '2.0.1', dry.run = FALSE)
cancer_data_simplified = TCGAutils::simplifyTCGA(cancer_data)

# CNASNP assay before and after simplification, transposed with t()
cna_snp_mat1 = t(assay(cancer_data[,,"PAAD_CNASNP-20160128"]))
cna_snp_mat2 = t(assay(cancer_data_simplified[,,"PAAD_CNASNP-20160128_simplified"]))

cna_snp_mat1 (after the t(): patient samples (rows) x genomic regions (columns)) - what are these values?
cna_snp_mat2 (after the t(): patient samples (rows) x genes/others (columns)) - how are these transformed from the above (I think the code is this one)? I am particularly interested in interpreting these values, i.e. do lower/negative values correspond to deletion and higher/positive values to amplification somehow?
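
To make the second question concrete, here is a toy sketch in base R of what I imagine the region-to-gene transformation might look like: a width-weighted mean of segment values over each gene. This is only my guess and not confirmed behaviour of simplifyTCGA(). The segment coordinates, the values, the gene coordinates and all variable names are made up for illustration.

```r
# Toy illustration only: hypothetical segments for one sample on one chromosome.
segments <- data.frame(
  start        = c(100, 500, 900),
  end          = c(499, 899, 1500),
  segment_mean = c(-0.8, 0.1, 1.2)  # made-up values, not real PAAD data
)

# Hypothetical gene coordinates (inclusive).
gene_start <- 400
gene_end   <- 1000

# Number of bases each segment shares with the gene (0 if there is no overlap).
overlap <- pmax(0, pmin(segments$end, gene_end) - pmax(segments$start, gene_start) + 1)

# Width-weighted mean of the segment values across the gene (my assumed summary).
gene_value <- sum(overlap * segments$segment_mean) / sum(overlap)
gene_value
```

If the gene-level values are computed roughly like this, a negative value would mean the gene is mostly covered by segments with negative values and a positive value the opposite, but I would like confirmation of what the underlying numbers represent.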

Question about CNA data #52