Open kalyanidhusia opened 3 years ago
The function was mainly created for TCGA data. There are certain functions that will only work with TCGA data. You could still run the function as below:
library(TCGAbiolinks)
project <- c('TCGA-PAAD', 'HCMI-CMDC')
clin <- GDCquery_clinic(project, "clinical", save.csv = TRUE)
clin <- GDCquery_clinic(project[1], "clinical", save.csv = TRUE)
proj <- "CMI-MBC"
query <- GDCquery(
  project = proj,
  data.category = "Transcriptome Profiling",
  data.type = "Gene Expression Quantification",
  workflow.type = "HTSeq - Counts"
)
GDCdownload(query)
data <- GDCprepare(query)
dataPrep <- TCGAanalyze_Preprocessing(
  object = data,
  cor.cut = 0.6,
  datatype = "HTSeq - Counts"
)
dataNorm <- TCGAanalyze_Normalization(
  tabDF = data,
  geneInfo = geneInfoHT,
  method = "gcContent"
)
dataFilt <- TCGAanalyze_Filtering(
  tabDF = dataNorm,
  method = "quantile",
  qnt.cut = 0.25
)
dataDEGs <- TCGAanalyze_DEA(
  mat1 = dataFilt[, which(data$sample_type == "Primary Tumor")],
  mat2 = dataFilt[, which(data$sample_type == "Metastatic")],
  Cond1type = "Primary Tumor",
  Cond2type = "Metastatic",
  fdr.cut = 0.01,
  logFC.cut = 1,
  method = "glmLRT",
  metadata = FALSE
)
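The column selection in the TCGAanalyze_DEA call assumes that the columns of dataFilt are in the same order as the samples in data. Below is a minimal base R sketch of a more defensive split, run on a toy matrix. The names `counts` and `sample_type` and all values are made up for illustration and do not come from the CMI-MBC data.

```r
# Toy count matrix: 3 genes x 4 samples (hypothetical values).
counts <- matrix(
  1:12, nrow = 3,
  dimnames = list(paste0("gene", 1:3), paste0("S", 1:4))
)

# Hypothetical sample annotation, named by sample ID.
sample_type <- c(S1 = "Primary Tumor", S2 = "Metastatic",
                 S3 = "Primary Tumor", S4 = "Metastatic")

# Align the annotation to the matrix columns by name, not by position.
sample_type <- sample_type[colnames(counts)]

# drop = FALSE keeps a matrix even if a group has a single sample.
primary    <- counts[, sample_type == "Primary Tumor", drop = FALSE]
metastatic <- counts[, sample_type == "Metastatic", drop = FALSE]

# Both groups must be non-empty before any differential expression step.
stopifnot(ncol(primary) > 0, ncol(metastatic) > 0)
```

Matching by name rather than by position guards against a filtering or normalization step having dropped or reordered samples.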
Hi, I am working with the new CMI-MBC data and using TCGAbiolinks to identify DEGs. At one step I get the following error:
Error in strsplit(c(colnames(data)), "-") : non-character argument.
My guess was that this happens because the sample names look like MBCProject_0065_T1_RNA_1 rather than MBCProject-0065-T1-RNA-1.
Following that logic, I tried correcting the sample names by replacing _ with -, but that did not fix the error.
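For reference, the renaming described above can be sketched in base R as follows. The matrix `m` is a toy stand-in, not the real object; only the first sample name is taken from this post, and the second one is hypothetical.

```r
# Toy matrix with underscore-separated sample names as column names.
m <- matrix(
  0, nrow = 2, ncol = 2,
  dimnames = list(
    c("geneA", "geneB"),
    c("MBCProject_0065_T1_RNA_1",   # name quoted in the post
      "MBCProject_0066_T1_RNA_1")   # hypothetical second sample
  )
)

# Replace every underscore with a hyphen; fixed = TRUE treats "_" literally.
colnames(m) <- gsub("_", "-", colnames(m), fixed = TRUE)
print(colnames(m))
```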
Any idea what I should do next?
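One thing that may be worth checking: in base R, `strsplit()` raises "non-character argument" whenever its input is not a character vector, for example `NULL` or a factor, regardless of which separator the names contain. The sketch below shows that behaviour, plus a check that could be run on whatever object is passed to the failing step. The helper name `has_character_colnames` is made up for this example and is not part of TCGAbiolinks.

```r
# strsplit() on NULL fails with the same message as in the post.
msg <- tryCatch(
  strsplit(NULL, "-"),
  error = function(e) conditionMessage(e)
)
print(msg)

# Hypothetical helper: TRUE only if the object has character column names.
has_character_colnames <- function(x) {
  cn <- colnames(x)
  !is.null(cn) && is.character(cn)
}

# A matrix without dimnames has NULL column names.
no_names   <- matrix(1:4, nrow = 2)
with_names <- matrix(1:4, nrow = 2, dimnames = list(NULL, c("A-1", "B_2")))

print(has_character_colnames(no_names))
print(has_character_colnames(with_names))
```

If the check returns FALSE for the object passed to the failing function, replacing `_` with `-` would not be expected to help, since the names are missing or of the wrong type rather than malformed.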