BioinformaticsFMRP / TCGAbiolinks

TCGAbiolinks
http://bioconductor.org/packages/devel/bioc/vignettes/TCGAbiolinks/inst/doc/index.html

query TCGA-OV HTSeq - Counts does not work #148

Closed: veroniquevoisin closed this issue 6 years ago

veroniquevoisin commented 6 years ago

Hi, I'm trying to download the TCGA-OV HTSeq - Counts data using the TCGAbiolinks R package, but the query does not work:

query <- GDCquery(project = "TCGA-OV", data.category = "Transcriptome Profiling", data.type = "Gene Expression Quantification", workflow.type = "HTSeq - Counts", file.type = "results", legacy = TRUE)

The error I get is: "Error in checkDataCategoriesInput(project, data.category, legacy) : Please set a valid data.category argument from the column data_category above. We could not validade the data.category for project TCGA-OV"

However, I'm able to download it directly from the GDC data portal website, and I'm quite sure that "Transcriptome Profiling" is the correct data category.

I would greatly appreciate any insights. Thanks a lot. Veronique

tiagochst commented 6 years ago

"Transcriptome Profiling" data category is only available in the harmonized database (legacy = FALSE). The legacy archive (https://portal.gdc.cancer.gov/legacy-archive) has other data categories.
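Applied to the query from the first post, that would look roughly like the sketch below. It assumes a TCGAbiolinks version from the time of this thread, in which GDCquery still accepts the legacy argument and the "HTSeq - Counts" workflow; later package or GDC releases may differ. file.type is left out on the assumption that it only applies to legacy queries. The names query_args and run_query are just for illustration.

```r
# Arguments for a harmonized (hg38) query. Compared with the failing call,
# legacy is FALSE and file.type is dropped (assumed to be legacy-only).
query_args <- list(
  project       = "TCGA-OV",
  data.category = "Transcriptome Profiling",
  data.type     = "Gene Expression Quantification",
  workflow.type = "HTSeq - Counts",
  legacy        = FALSE
)

# Set to TRUE to actually contact the GDC. This needs the TCGAbiolinks
# package installed and network access, so it is off by default here.
run_query <- FALSE

if (run_query) {
  query <- do.call(TCGAbiolinks::GDCquery, query_args)
}
```

With run_query set to TRUE this is equivalent to calling GDCquery(...) directly with the same named arguments.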

veroniquevoisin commented 6 years ago

thanks!!

beginner984 commented 5 years ago

Sorry, I tried both legacy = FALSE and legacy = TRUE, but I get the same error:

> query.exp.hg38 <- GDCquery(project = "TCGA-COAD",
                            data.category = "Trascriptome Profiling",
                            data.type = "Gene expression quantification",
                            workflow.type="HTSeq-Counts", legacy = FALSE)
--------------------------------------
o GDCquery: Searching in GDC database
--------------------------------------
Genome of reference: hg38

| case_count| file_count|data_category               |
|----------:|----------:|:---------------------------|
|        459|       2493|Transcriptome Profiling     |
|        460|       1953|Copy Number Variation       |
|        433|       3952|Simple Nucleotide Variation |
|        458|        556|DNA Methylation             |
|        461|        531|Clinical                    |
|        460|       1959|Sequencing Reads            |
|        461|       2835|Biospecimen                 |
Error in checkDataCategoriesInput(project, data.category, legacy) : 
  Please set a valid data.category argument from the column data_category above. We could not validade the data.category for project TCGA-COAD
tiagochst commented 5 years ago

@beginner984 your input is missing an "n": Trascriptome -> Transcriptome
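A typo like this is easy to miss because the error only says the category could not be validated. As a rough sketch in plain base R (this is not part of TCGAbiolinks; suggest_category and valid_categories are hypothetical names), the value can be compared against the data_category column printed in the output above to find the closest valid spelling:

```r
# Valid categories, copied from the data_category column that GDCquery
# printed for TCGA-COAD in the output above.
valid_categories <- c(
  "Transcriptome Profiling",
  "Copy Number Variation",
  "Simple Nucleotide Variation",
  "DNA Methylation",
  "Clinical",
  "Sequencing Reads",
  "Biospecimen"
)

# Return the input unchanged if it is valid; otherwise return the valid
# category with the smallest edit distance to it.
suggest_category <- function(input, valid = valid_categories) {
  if (input %in% valid) {
    return(input)
  }
  distances <- utils::adist(input, valid, ignore.case = TRUE)
  valid[which.min(distances)]
}

suggest_category("Trascriptome Profiling")
# [1] "Transcriptome Profiling"
```

The same kind of check could be applied to other string arguments. For example, the first post in this thread writes the workflow as "HTSeq - Counts" with spaces around the hyphen, while the call above uses "HTSeq-Counts".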