BioinformaticsFMRP / TCGAbiolinks

TCGAbiolinks
http://bioconductor.org/packages/devel/bioc/vignettes/TCGAbiolinks/inst/doc/index.html

Gene level copy number doesn't match GDC website #559

Open ytakemon opened 1 year ago

ytakemon commented 1 year ago

Hello,

I am trying to download gene-level copy number data from GDC for TCGA-COAD cases using the code below. I noticed that a gene shown as "DeepDel" on the web portal has a copy number of 1 in the data downloaded with TCGAbiolinks. Am I interpreting this data incorrectly, or did I download the wrong data type?

For example, here is the copy number data for the gene WRN in the TCGA-COAD sample TCGA-G4-6320:

https://www.cbioportal.org/patient/summary?studyId=coadread_tcga_pan_can_atlas_2018&caseId=TCGA-G4-6320

The web portal shows the copy number alteration for WRN as "DeepDel" (which should correspond to a copy number of 0).
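To compare the portal's categorical call with a numeric value, it can help to put the labels on an explicit scale first. The following is only a sketch, assuming the discrete encoding commonly used by cBioPortal (-2 for a deep deletion up to 2 for an amplification). Only "DeepDel" appears in this post; the other label spellings and the helper name `label_to_call` are hypothetical. Whether the GDC file uses the same scale is part of the open question here.

```r
# Assumed cBioPortal-style encoding of discrete copy-number calls.
# Only "DeepDel" is taken from the post; the other label spellings are illustrative.
cna_levels <- c(DeepDel = -2, ShallowDel = -1, Diploid = 0, Gain = 1, Amp = 2)

# Translate a vector of portal labels into numeric calls.
# Labels that are not in the lookup table come back as NA.
label_to_call <- function(labels) {
  unname(cna_levels[as.character(labels)])
}

label_to_call(c("DeepDel", "Gain"))
```

On this assumed scale a deep deletion is a relative call against the sample's baseline, not an absolute copy number of 0, so the two numbers may not be directly comparable.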

However, with the following code, I get copy number of 1:

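library(pacman)
p_load(tidyverse, TCGAbiolinks)

# Earlier attempt, left over from a loop over projects. `projects` and `i`
# are not defined in this snippet, and the query is overwritten below,
# so it is commented out here.
# query <- GDCquery(
#   project = projects$project_id[i],
#   data.category = "Copy Number Variation",
#   access = "open",
#   legacy = FALSE,
#   data.type = "Gene Level Copy Number")

# Query actually used for the download
query <- GDCquery(
  project = "TCGA-COAD",
  data.category = "Copy Number Variation",
  data.type = "Gene Level Copy Number Scores",
  access = "open",
  legacy = FALSE)

GDCdownload(query)
cn <- GDCprepare(query)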

SummarizedExperiment::assay(cn) %>% as_tibble(rownames = "Ens.id") %>% 
  pivot_longer(-Ens.id, names_to = "TCGA_id_full", values_to = "cn") %>%
  filter(str_detect(Ens.id, "ENSG00000165392"), str_detect(TCGA_id_full, "TCGA-G4-6320")) 

# A tibble: 1 × 3
  Ens.id             TCGA_id_full
  <chr>              <chr>
1 ENSG00000165392.11 TCGA-G4-6320-01A-11D-1717-01,TCGA-G4-6320-10A-01D-1718-01
     cn
  <dbl>
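
The column name in this output holds two aliquot barcodes separated by a comma, which suggests the value describes a tumor/normal pair, not a single sample. Below is a small base R sketch for splitting such a name and reading the sample-type code from each barcode (characters 14 to 15, where 01 is a primary solid tumor and 10 is a blood-derived normal in the TCGA code tables). The helper name `split_paired_barcode` is hypothetical and not part of TCGAbiolinks.

```r
# Split a paired column name of the form "barcode1,barcode2" and extract
# the two-digit sample-type code (characters 14-15 of a TCGA barcode).
split_paired_barcode <- function(x) {
  barcodes <- strsplit(x, ",", fixed = TRUE)[[1]]
  data.frame(
    barcode = barcodes,
    sample_type_code = substr(barcodes, 14, 15),
    stringsAsFactors = FALSE
  )
}

split_paired_barcode("TCGA-G4-6320-01A-11D-1717-01,TCGA-G4-6320-10A-01D-1718-01")
```

This only inspects the identifiers. It does not say how the score itself was derived from the pair.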

R version 4.2.2 (2022-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
TCGAbiolinks_2.25.3