bblodfon closed this issue 2 years ago
Hi John (@bblodfon), these are Segment_Mean values, and they are reduced with a weightedmean function. I've updated the documentation with details: https://github.com/waldronlab/TCGAutils/commit/dd538820f8e3a83d6023b69b6e61fd1b3960e6a5
I couldn't quickly find the documentation for the Broad Firehose pipeline, but I saw that https://docs.gdc.cancer.gov/Data/Bioinformatics_Pipelines/CNV_Pipeline/ has the following:
The GDC further transforms these copy number values into segment mean values, which are equal to log2(copy-number/ 2). Diploid regions will have a segment mean of zero, amplified regions will have positive values, and deletions will have negative values.
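To make the two steps concrete, here is a rough sketch in Python (not the TCGAutils code itself). The first function applies the log2(copy-number / 2) formula quoted above. The second illustrates one plausible reading of a weighted mean reduction: segment means that overlap a gene are averaged, weighted by the width of the overlap. The function names, the inclusive-coordinate convention, and the choice of overlap width as the weight are all assumptions for illustration, so please check the linked commit for what the package actually does.

```python
import math


def segment_mean(copy_number):
    """Segment mean as described in the GDC text: log2(copy-number / 2)."""
    return math.log2(copy_number / 2)


def overlap_weighted_mean(segments, gene_start, gene_end):
    """Hypothetical helper: reduce segment-level values to one value per gene.

    segments: list of (start, end, segment_mean) tuples, inclusive coordinates.
    Each segment mean is weighted by how many bases of the segment fall
    inside the gene. Returns None when no segment overlaps the gene.
    """
    weighted_sum = 0.0
    total_width = 0
    for start, end, value in segments:
        # Width of the intersection between the segment and the gene.
        overlap = min(end, gene_end) - max(start, gene_start) + 1
        if overlap > 0:
            weighted_sum += value * overlap
            total_width += overlap
    if total_width == 0:
        return None
    return weighted_sum / total_width


if __name__ == "__main__":
    # Diploid, amplified, and single-copy-loss examples.
    for cn in (2, 4, 1):
        print(cn, segment_mean(cn))
    # A gene spanning two segments equally.
    print(overlap_weighted_mean([(1, 100, 0.0), (101, 200, 1.0)], 51, 150))
```

Under these assumptions, a gene-level value near zero suggests a diploid region, a positive value suggests amplification, and a negative value suggests deletion, matching the interpretation in the GDC text.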
Hi,
I would like to better understand what the CNA values are exactly and how they are transformed via simplifyTCGA() for a specific TCGA study. Is there documentation about these somewhere?
For example, check the following two matrices:
cna_snp_mat1 (genomic regions (rows) x patient samples (columns)): what are these values?
cna_snp_mat2 (genes/others (rows) x patient samples (columns)): how are these transformed from the above (I think the code is this one)?
I am particularly interested in interpreting these values, i.e., do lower/negative values correspond to deletion and higher/positive values to amplification somehow?