BioinformaticsFMRP / TCGAbiolinks

TCGAbiolinks
http://bioconductor.org/packages/devel/bioc/vignettes/TCGAbiolinks/inst/doc/index.html

Downloading Mutation data (hg19) for a cancer type #372

Open beginner984 opened 4 years ago

beginner984 commented 4 years ago

Hi

I am trying to get mutation data (hg19) for ESCA in MAF format. The first query below runs, but when I add file.type I get an error. Can you help me, please?

> query.maf.hg19 <- GDCquery(project = "TCGA-ESCA", 
+                            data.category = "Simple nucleotide variation", 
+                            data.type = "Simple somatic mutation",
+                            access = "open", 
+                            legacy = TRUE)
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg19
--------------------------------------------
oo Accessing GDC. This might take a while...
--------------------------------------------
ooo Project: TCGA-ESCA
--------------------
oo Filtering results
--------------------
ooo By access
ooo By data.type
----------------
oo Checking data
----------------
ooo Check if there are duplicated cases
ooo Check if there results for the query
-------------------
o Preparing output
-------------------
> View(query.maf.hg19[[1]][[1]])
> query.maf.hg19 <- GDCquery(project = "TCGA-ESCA", 
+                            data.category = "Simple nucleotide variation", 
+                            data.type = "Simple somatic mutation",
+                            access = "open", 
+                            file.type = "bcgsc.ca_ESCA.IlluminaHiSeq_DNASeq.1.somatic.maf  ",
+                            legacy = TRUE)
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg19
--------------------------------------------
oo Accessing GDC. This might take a while...
--------------------------------------------
ooo Project: TCGA-ESCA
--------------------
oo Filtering results
--------------------
ooo By access
ooo By data.type
ooo By file.type

|Files                                                                  |
|:----------------------------------------------------------------------|
|bcgsc.ca_ESCA.IlluminaHiSeq_DNASeq.1.somatic.maf                       |
|genome.wustl.edu_ESCA.IlluminaHiSeq_DNASeq_automated.1.1.0.somatic.maf |
|gsc_ESCA_pairs.aggregated.capture.tcga.uuid.automated.somatic.maf      |
|hgsc.bcm.edu_ESCA.IlluminaGA_DNASeq.1.somatic.maf                      |
|ucsc.edu_ESCA.IlluminaGA_DNASeq_automated.Level_2.1.0.0.somatic.maf    |
|NA                                                                     |
|NA                                                                     |
|NA                                                                     |
|NA                                                                     |
|NA                                                                     |
Error in GDCquery(project = "TCGA-ESCA", data.category = "Simple nucleotide variation",  : 
  We were not able to filter using this file type. Examples of available files are above. Please check the vignette for possible entries

tiagochst commented 4 years ago

I'll check your code soon. In the meantime, you may want to look at https://gdc.cancer.gov/about-data/publications/mc3-2017 for hg19 mutations.

tiagochst commented 4 years ago

Hi,

> query.maf.hg19 <- GDCquery(project = "TCGA-ESCA", 
+                            data.category = "Simple nucleotide variation", 
+                            data.type = "Simple somatic mutation",
+                            access = "open", 
+                            file.type = "bcgsc.ca_ESCA.IlluminaHiSeq_DNASeq.1.somatic.maf  ",
+                            legacy = TRUE)

Your file.type value has trailing whitespace: there are two spaces after ".maf" inside the quotes. If you remove them, it should work.

query.maf.hg19 <- GDCquery(project = "TCGA-ESCA", 
                           data.category = "Simple nucleotide variation", 
                           data.type = "Simple somatic mutation",
                           access = "open", 
                           file.type = "bcgsc.ca_ESCA.IlluminaHiSeq_DNASeq.1.somatic.maf",
                           legacy = TRUE)
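
To make the trailing spaces visible, here is a minimal sketch using base R only. The variable names file.type.raw and file.type.clean are made up for illustration; nchar(), trimws() and identical() are standard base R functions. It does not call GDCquery, it only shows why the string did not match the file name listed in the error output.

```r
# The value from the failing call: note the two spaces after ".maf".
file.type.raw <- "bcgsc.ca_ESCA.IlluminaHiSeq_DNASeq.1.somatic.maf  "

# nchar() counts the trailing spaces as well, so the raw string is
# longer than the file name shown in the error table.
nchar(file.type.raw)

# trimws() strips leading and trailing whitespace.
file.type.clean <- trimws(file.type.raw)
nchar(file.type.clean)

# The cleaned string is now exactly the file name listed as available.
identical(file.type.clean, "bcgsc.ca_ESCA.IlluminaHiSeq_DNASeq.1.somatic.maf")
```

Passing file.type = file.type.clean (or simply typing the name without the spaces, as above) should then let the filter match.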

beginner984 commented 4 years ago

Sorry, what do you mean by file.type having empty characters at the end?

Thank you

tiagochst commented 4 years ago

[Image: Screenshot from 2019-12-09 12-00-25]