I want to extract some mutation data and have been following this documentation: https://github.com/BioinformaticsFMRP/TCGAbiolinks/issues/new. However, even with the TCGA-CHOL example from the documentation, GDCquery() reports multiple files per case, so GDCprepare(query) does not work. When I looked closer, the case names seemed to be missing.
Here is the code and the output I received from GDCquery():
query <- GDCquery(
    project = "TCGA-CHOL",
    data.category = "Simple Nucleotide Variation",
    access = "open",
    legacy = FALSE,
    data.type = "Masked Somatic Mutation",
    workflow.type = "Aliquot Ensemble Somatic Variant Merging and Masking"
)
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg38
--------------------------------------------
oo Accessing GDC. This might take a while...
--------------------------------------------
ooo Project: TCGA-CHOL
--------------------
oo Filtering results
--------------------
ooo By access
ooo By data.type
ooo By workflow.type
----------------
oo Checking data
----------------
ooo Check if there are duplicated cases
Warning: There are more than one file for the same case. Please verify query results. You can use the command View(getResults(query)) in rstudio
ooo Check if there results for the query
-------------------
o Preparing output
-------------------
And just to compare cases from the query results:
table(getResults(query)$cases)
51
Is there any way of resolving this? Thanks in advance!
It is working for me: https://rpubs.com/tiagochst/TCGAbiolinks_issue_520
The "duplicated cases" message is just a warning. It appears because each MAF file was produced from both the matched tumor and normal samples.
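To illustrate why a tumor/normal pair can look like a duplicated case, here is a minimal base-R sketch. It assumes each entry of the `cases` column holds the tumor and normal barcodes joined by a comma; that layout is an assumption, not something confirmed in this thread. The barcodes are made up for illustration, and `summarise_pair` is a hypothetical helper, not a TCGAbiolinks function. It relies only on the standard TCGA barcode layout: the first 12 characters identify the patient, and characters 14-15 are the sample type code (01-09 tumor, 10-19 normal).

```r
# Made-up barcodes for illustration only (not real TCGA-CHOL samples).
# Assumption: each MAF file's `cases` entry is "tumor_barcode,normal_barcode".
cases <- c(
  "TCGA-XX-0001-01A-11D-0000-09,TCGA-XX-0001-10A-01D-0000-09",
  "TCGA-XX-0002-01A-11D-0000-09,TCGA-XX-0002-11A-11D-0000-09"
)

# Split each entry into its individual barcodes.
barcodes <- strsplit(cases, ",", fixed = TRUE)

# Hypothetical helper: one row per barcode with patient ID and sample type code.
summarise_pair <- function(pair) {
  data.frame(
    barcode = pair,
    patient = substr(pair, 1, 12),       # e.g. "TCGA-XX-0001"
    sample_type = substr(pair, 14, 15),  # e.g. "01" (tumor) or "10" (normal)
    stringsAsFactors = FALSE
  )
}

pairs <- do.call(rbind, lapply(barcodes, summarise_pair))

# Sample type codes below 10 denote tumor samples; 10-19 denote normals.
pairs$is_tumor <- as.integer(pairs$sample_type) < 10

# Each patient appears twice: once as tumor, once as normal.
print(table(pairs$patient, pairs$is_tumor))
```

With a real query, the same split could be tried on `getResults(query)$cases`, provided that column actually uses the comma-separated layout assumed here.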