BioinformaticsFMRP / TCGAbiolinks

http://bioconductor.org/packages/devel/bioc/vignettes/TCGAbiolinks/inst/doc/index.html

Duplicated samples for THCA RPPA #533

Open rhughwhite opened 2 years ago

rhughwhite commented 2 years ago
library(TCGAbiolinks)

query <- GDCquery(project = 'TCGA-THCA', data.category = 'Proteome Profiling', data.type = 'Protein Expression Quantification', access = 'open', legacy = FALSE)

GDCdownload(query = query, directory = getwd())

data <- GDCprepare(query = query, summarizedExperiment = T, directory = getwd())
|    |cases            |experimental_strategy       |
|:---|:----------------|:---------------------------|
|140 |TCGA-CE-A3MD-01A |Reverse Phase Protein Array |
|265 |TCGA-CE-A3MD-01A |Reverse Phase Protein Array |
|16  |TCGA-FY-A3ON-01A |Reverse Phase Protein Array |
|59  |TCGA-FY-A3ON-01A |Reverse Phase Protein Array |
Error in GDCprepare(query = query, summarizedExperiment = T, directory = download.directory) : 
  There are samples duplicated. We will not be able to prepare it

I'm not able to identify any differences between the duplicated samples, e.g. for case TCGA-CE-A3MD-01A:

- ID: a2b68543-7bea-48fb-b27f-a838cd5a6e8a, file: TCGA-CE-A3MD-01A-22-A301-20_RPPA_data.tsv, GDC portal page: https://portal.gdc.cancer.gov/files/a2b68543-7bea-48fb-b27f-a838cd5a6e8a
- ID: a3a16f15-760b-4af8-90d6-d4d35fdd0cd4, file: TCGA-CE-A3MD-01A-21-A21L-20_RPPA_data.tsv, GDC portal page: https://portal.gdc.cancer.gov/files/a3a16f15-760b-4af8-90d6-d4d35fdd0cd4

Thanks!
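
For anyone hitting the same error, here is a minimal sketch of how the duplicated barcodes could be flagged and filtered with base R. The `results` data frame is a hypothetical stand-in that contains only the rows shown in the table above; keeping the first file per barcode is an arbitrary choice for illustration, not a recommendation from the package authors.

```r
# Hypothetical stand-in for the query's file table, limited to the
# row indices and barcodes printed in the error output above.
results <- data.frame(
  row   = c(140, 265, 16, 59),
  cases = c("TCGA-CE-A3MD-01A", "TCGA-CE-A3MD-01A",
            "TCGA-FY-A3ON-01A", "TCGA-FY-A3ON-01A"),
  stringsAsFactors = FALSE
)

# Flag every row whose barcode appears more than once.
dup <- results$cases %in% results$cases[duplicated(results$cases)]
print(results[dup, ])

# Keep only the first row per barcode (arbitrary choice, an assumption).
deduplicated <- results[!duplicated(results$cases), ]
print(deduplicated)
```

With a real query, the same `!duplicated(...)` filter could presumably be applied to the file table before calling `GDCdownload` and `GDCprepare`, assuming the query object exposes that table as `query$results[[1]]`. Which file to keep is a separate question: the two file names listed for TCGA-CE-A3MD-01A differ in the segments after the sample barcode (`22-A301` vs `21-A21L`), so the files may not be interchangeable.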