BioinformaticsFMRP / TCGAbiolinks

http://bioconductor.org/packages/devel/bioc/vignettes/TCGAbiolinks/inst/doc/index.html

Problem with GDCquery when setting workflow.type="HTSeq - FPKM" #495

Closed by N0toriou5 2 years ago

N0toriou5 commented 2 years ago

I used to download TCGA data with GDCquery by setting workflow.type = "HTSeq - FPKM". It now raises an error with the following message:

Error in GDCquery(project = name, data.category = "Transcriptome Profiling", : Please set a valid workflow.type argument from the list below: => STAR - Counts

My complete command:

query <- GDCquery(
    project = name,
    data.category = "Transcriptome Profiling",
    data.type = "Gene Expression Quantification",
    workflow.type = "HTSeq - FPKM"
)

where name is, for example, "TCGA-BRCA".

My TCGAbiolinks version: 2.23.4

Thanks for your help.

tiagochst commented 2 years ago

@N0toriou5 It seems GDC made a big update: https://docs.gdc.cancer.gov/Data/Release_Notes/Data_Release_Notes/#data-release-320

There is no "HTSeq - FPKM" anymore. I will need to revise the manual and package in the coming days.
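For older scripts that still pass a retired workflow name, a small guard can turn the hard error into a warning. This is only a sketch: `resolve_workflow_type` and `legacy_workflows` are hypothetical names, not part of TCGAbiolinks, and the list of retired names is an assumption based on the HTSeq workflows GDC used to offer.

```r
# Hypothetical helper, NOT part of TCGAbiolinks.
# Assumed list of workflow names retired by the GDC update.
legacy_workflows <- c("HTSeq - Counts", "HTSeq - FPKM", "HTSeq - FPKM-UQ")

resolve_workflow_type <- function(workflow.type) {
    if (workflow.type %in% legacy_workflows) {
        # Warn instead of failing, then fall back to the only value
        # the error message in this thread lists as valid.
        warning(
            sprintf('"%s" is no longer offered by GDC; using "STAR - Counts".',
                    workflow.type),
            call. = FALSE
        )
        return("STAR - Counts")
    }
    workflow.type
}

# Example: an old script's value is mapped to the current one.
resolved <- suppressWarnings(resolve_workflow_type("HTSeq - FPKM"))
print(resolved)
```

The result can then be passed as the workflow.type argument of GDCquery.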


I am updating the code, but the one below should work with the latest version:

query <- GDCquery(
    project = "TCGA-BRCA",
    data.category = "Transcriptome Profiling",
    data.type = "Gene Expression Quantification",
    workflow.type = "STAR - Counts"
)

GDCdownload(query, method = "api")
rnaseq <- GDCprepare(query)

You can update the package with: BiocManager::install("BioinformaticsFMRP/TCGAbiolinks")
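Since the original question was about FPKM values, it may help to note that the "STAR - Counts" result appears to carry several quantifications as separate assays of the prepared object. The sketch below assumes the FPKM assay is named "fpkm_unstrand"; that name is an assumption, so check assayNames() on your own object. `extract_assay` is a hypothetical helper, and the toy object only stands in for the output of GDCprepare so the example runs without downloading anything.

```r
library(SummarizedExperiment)

# Hypothetical helper: return one named assay, or fail with the list of
# assays that are actually present.
extract_assay <- function(se, assay_name = "fpkm_unstrand") {
    available <- assayNames(se)
    if (!assay_name %in% available) {
        stop(sprintf("Assay '%s' not found. Available: %s",
                     assay_name, paste(available, collapse = ", ")))
    }
    assay(se, assay_name)
}

# Toy stand-in for the object returned by GDCprepare(query).
counts <- matrix(
    c(10, 0, 5, 20),
    nrow = 2,
    dimnames = list(c("geneA", "geneB"), c("s1", "s2"))
)
toy <- SummarizedExperiment(
    assays = list(unstranded = counts, fpkm_unstrand = counts / 10)
)

fpkm <- extract_assay(toy)
print(fpkm)
```

With real data, the same call would be extract_assay(rnaseq), using the rnaseq object from the snippet above.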

N0toriou5 commented 2 years ago

Thanks!

lizakulaeva commented 2 years ago

@tiagochst sorry for interrupting this thread, but GDCprepare for TARGET projects doesn't seem to work at all (in the GUI or the package itself) after this STAR - Counts update: https://support.bioconductor.org/p/9143546/ How can I fix this problem? Thank you in advance.

tiagochst commented 2 years ago

I tested with the latest version and it was working. Could you please update TCGAbiolinks with the GitHub version and try again?

haobeny commented 2 years ago

@tiagochst, thanks for the answer about "HTSeq -count". I ran into another problem while processing the downloaded data with this code: dataPrep1 <- GDCprepare(query = queryDown, save = TRUE, save.filename = "CHOL_case.rda") It fails with: ERROR: Join columns must be present in data. x Problem with #gene. As I am new to TCGAbiolinks, I don't know how to inspect the data format or debug this. Could you explain in more detail how to solve the problem? Much appreciated!
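One possible cause, not confirmed in this thread, is that an older package version reads the leading comment line of the new STAR files as the header. That would produce a column name starting with "#gene" and break the join. Updating TCGAbiolinks as suggested above is the first thing to try. To inspect a downloaded file by hand, the sketch below may help. It assumes the layout I believe the STAR - Counts files use (a "# gene-model" comment line, a header row, then "N_" summary rows before the genes) and writes a tiny file in that layout so it runs on its own.

```r
# Write a tiny file in the ASSUMED STAR - Counts layout (illustrative values).
path <- tempfile(fileext = ".tsv")
writeLines(c(
    "# gene-model: GENCODE v36",
    "gene_id\tgene_name\tgene_type\tunstranded\tfpkm_unstranded",
    "N_unmapped\t\t\t100\t",
    "ENSG00000000003.15\tTSPAN6\tprotein_coding\t42\t1.5"
), path)

# comment.char = "#" skips the leading comment line, so the real header
# row supplies the column names.
star <- read.delim(path, comment.char = "#", stringsAsFactors = FALSE)

# Drop the alignment summary rows (N_unmapped, ...), keeping only genes.
genes <- star[!startsWith(star$gene_id, "N_"), ]

print(names(star))
print(genes)
```

If the first column name of your own file comes out as something other than gene_id, the file was probably parsed without skipping the comment line.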