netphantom opened this issue 4 years ago (status: Open)
library(TCGAbiolinks)

# Query TCGA-BRCA RNA-seq expression data, restricted to blood-derived normal samples
query.exp <- GDCquery(
  project = "TCGA-BRCA",
  legacy = FALSE,
  data.category = "Transcriptome Profiling",
  data.type = "Gene Expression Quantification",
  experimental.strategy = "RNA-Seq",
  workflow.type = "HTSeq - FPKM-UQ",
  sample.type = c("Blood Derived Normal")
)
But on the TCGA website, I can find 992 blood-derived normal cases, 1077 primary tumor cases, and 163 solid tissue normal cases.
Update: The reason is that BRCA simply does not have blood-derived normal RNA-seq data. I opened some individual cases on the TCGA website to check: the RNA-seq data come only from primary tumor and solid tissue normal samples. The filter on the TCGA website is somewhat confusing.
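For anyone who wants to run the same check on a list of barcodes instead of clicking through cases, here is a minimal sketch in base R. It relies on the TCGA barcode convention that characters 14-15 encode the sample type (01 primary tumor, 10 blood derived normal, 11 solid tissue normal). The helper name `decode_sample_type` and the barcodes are made up for illustration, and the lookup table covers only a few codes.

```r
# A few TCGA sample-type codes (barcode characters 14-15) and their labels.
# This table is deliberately incomplete; unknown codes are reported as "Other".
sample_type_labels <- c(
  "01" = "Primary Tumor",
  "03" = "Primary Blood Derived Cancer - Peripheral Blood",
  "10" = "Blood Derived Normal",
  "11" = "Solid Tissue Normal"
)

# Hypothetical helper: return the sample-type label for each barcode.
decode_sample_type <- function(barcodes) {
  codes <- substr(barcodes, 14, 15)
  labels <- unname(sample_type_labels[codes])
  labels[is.na(labels)] <- "Other"
  labels
}

# Made-up barcodes standing in for the barcodes of RNA-seq files.
barcodes <- c(
  "TCGA-XX-0001-01A-11R-0000-07",
  "TCGA-XX-0002-01A-11R-0000-07",
  "TCGA-XX-0003-11B-21R-0000-07"
)

# Count how many barcodes fall into each sample type.
sample_type_counts <- table(decode_sample_type(barcodes))
print(sample_type_counts)
```

If "Blood Derived Normal" does not show up in a table like this for the RNA-seq barcodes of a project, filtering on that sample type will return nothing.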
@netphantom Could you change the issue title to something like GDCquery did not return "Blood Derived Normal" RNA-seq data that are available at TCGA, so that the author instantly knows what we are asking?
Hi @Puriney, you're right. On some datasets GDCquery returns the expected sample types, while on others it does not. BRCA is one of the most complete, so I don't have problems with it. I was wondering why it sometimes doesn't work, particularly with blood datasets.
I have the same problem with the LAML datasets. Have you found a way to solve this? The same problem also seems to affect the TARGET project.
For some projects, such as LAML, you can download part of the data with GDCquery and the other part from Recount2, then "join" them on the patient barcodes. At least that's what I'm doing.
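To make the "join" step concrete, here is a rough sketch in base R. It assumes both sources carry TCGA barcodes, whose first 12 characters identify the patient. The data frames, column names, and barcodes are invented for illustration and are not the actual output of GDCquery or Recount2.

```r
# Hypothetical helper: the patient ID is the first 12 characters of a TCGA barcode.
patient_id <- function(barcodes) substr(barcodes, 1, 12)

# Made-up sample tables standing in for the two sources.
gdc_part <- data.frame(
  barcode = c("TCGA-XX-0001-03A-01T-0000-13", "TCGA-XX-0002-03A-01T-0000-13"),
  stringsAsFactors = FALSE
)
recount_part <- data.frame(
  barcode = c("TCGA-XX-0002-11A-01R-0000-13", "TCGA-XX-0003-11A-01R-0000-13"),
  stringsAsFactors = FALSE
)

# Derive the shared key on both sides.
gdc_part$patient <- patient_id(gdc_part$barcode)
recount_part$patient <- patient_id(recount_part$barcode)

# Inner join: keep only patients present in both sources.
joined <- merge(gdc_part, recount_part, by = "patient",
                suffixes = c(".gdc", ".recount2"))
print(joined)
```

Only patients present in both tables survive the inner join; pass `all = TRUE` to `merge` to keep unmatched patients from either side.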
Hi there, I'd like to know why, for some projects (such as TCGA-DLBC), the query returns only "primary tumor" sample types, even though "blood derived normal" samples are also listed on the TCGA website. I've noticed that this also happens with many other datasets.
I don't know if this is a bug, but I'm posting the query here:
I'm using R 4.0 and TCGAbiolinks 2.16.
Thanks!