rli012 / GDCRNATools

GDCRNATools: an R/Bioconductor package for integrative analysis of lncRNA, miRNA and mRNA data in GDC
Apache License 2.0

Error in the gdcRNAMerge() #22

Open epfarias opened 2 years ago

epfarias commented 2 years ago

I've downloaded all the data correctly, but when I get to the merge step I receive this error:

####### Merge RNAseq data #######
rnaCounts <- gdcRNAMerge(metadata  = metaMatrix.RNA,
                         path      = rnadir, # the folder in which the data stored
                         organized = FALSE,  # if the data are in separate folders
                         data.type = 'RNAseq')

############### Merging RNAseq data ################

This step may take a few minutes

Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : line 2 did not have 9 elements

Could anyone help me with this problem?

LGROOVE commented 2 years ago

Hi, I have the same persistent problem!

epfarias commented 2 years ago

Hi @LGROOVE, I've solved this issue by downloading the data from Xenabrowser and adapting the information from the dataset to the metamatrix_MIR.

LGROOVE commented 2 years ago

Hi there, did you modify the gene expression tsv files?

An-05 commented 2 years ago

I have the same problem. How was the problem solved?

pegasusCN commented 2 years ago

TCGA has changed its data format, so the original gdcRNAMerge() function no longer works. Try this:

myrnaMerge <- function(metadata, path, data.type, organized = FALSE) {
  # Build the full path to each sample file
  if (organized == TRUE) {
    filenames <- file.path(path, metadata$file_name,
                           fsep = .Platform$file.sep)
  } else {
    filenames <- file.path(path, metadata$file_id, metadata$file_name,
                           fsep = .Platform$file.sep)
  }

  if (data.type == "RNAseq") {
    message("############### Merging RNAseq data ################\n",
            "### This step may take a few minutes ###\n")
    # Skip the first 6 lines of each file and take the 4th column (V4) as counts
    rnaMatrix <- do.call("cbind", lapply(filenames, function(fl)
      read.table(gzfile(fl), skip = 6)$V4))
    rownames(rnaMatrix) <- read.table(gzfile(filenames[1]), skip = 6)$V1
    # Keep only the part of each gene ID before the first "."
    rownames(rnaMatrix) <- unlist(lapply(
      strsplit(rownames(rnaMatrix), ".", fixed = TRUE),
      function(gene) gene[1]))
    colnames(rnaMatrix) <- metadata$sample

    rnaMatrix <- rnaMatrix[biotype$ensemblID, ]

    nSamples <- ncol(rnaMatrix)
    nGenes <- nrow(rnaMatrix)
    message(paste("Number of samples: ", nSamples, "\n", sep = ""),
            paste("Number of genes: ", nGenes, "\n", sep = ""))
    return(rnaMatrix)
  } else if (data.type == "pre-miRNAs") {
    message("############### Merging pre-miRNAs data ################\n",
            "### This step may take a few minutes ###\n")
    # One column of read counts per sample, rows named by miRNA ID
    rnaMatrix <- do.call("cbind", lapply(filenames, function(fl)
      read.delim(fl)$read_count))
    rownames(rnaMatrix) <- read.delim(filenames[1])$miRNA_ID
    colnames(rnaMatrix) <- metadata$sample

    nSamples <- ncol(rnaMatrix)
    nGenes <- nrow(rnaMatrix)
    message(paste("Number of samples: ", nSamples, "\n", sep = ""),
            paste("Number of genes: ", nGenes, "\n", sep = ""))
    return(rnaMatrix)
  } else if (data.type == "miRNAs") {
    message("############### Merging miRNAs data ###############\n")
    mirMatrix <- lapply(filenames, function(fl) cleanMirFun(fl))
    mirs <- rownames(mirbase)
    # Align every sample to the same miRNA order before binding columns
    mirMatrix <- do.call("cbind", lapply(mirMatrix, function(expr) expr[mirs]))
    rownames(mirMatrix) <- mirbase$v21[match(mirs, rownames(mirbase))]
    colnames(mirMatrix) <- metadata$sample
    # miRNAs missing from a sample get a count of 0
    mirMatrix[is.na(mirMatrix)] <- 0

    nSamples <- ncol(mirMatrix)
    nGenes <- nrow(mirMatrix)
    message(paste("Number of samples: ", nSamples, "\n", sep = ""),
            paste("Number of miRNAs: ", nGenes, "\n", sep = ""))
    return(mirMatrix)
  } else {
    return("error !!!")
  }
}

Then use myrnaMerge() in place of gdcRNAMerge() to process your data.
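To see what the `skip = 6` and `$V4` in the RNAseq branch of myrnaMerge() are doing, here is a small sketch on a mock file. The file layout (one comment line, one header line, four summary rows, then one row per gene with the counts in the fourth column) is an assumption inferred from those two arguments, and every value in it is made up. Check it against one of your own downloaded files with `readLines(your_file, n = 8)` before relying on it.

```r
# Write a mock file imitating the ASSUMED layout of the newer count files.
# All names and numbers below are invented for illustration only.
mock <- tempfile(fileext = ".tsv")
writeLines(c(
  "# gene-model: MOCK",
  paste("gene_id", "gene_name", "gene_type", "unstranded", "stranded_first",
        "stranded_second", "tpm_unstranded", "fpkm_unstranded",
        "fpkm_uq_unstranded", sep = "\t"),
  "N_unmapped\t\t\t10\t10\t10\t\t\t",
  "N_multimapping\t\t\t20\t20\t20\t\t\t",
  "N_noFeature\t\t\t30\t30\t30\t\t\t",
  "N_ambiguous\t\t\t40\t40\t40\t\t\t",
  "ENSG00000000003.15\tTSPAN6\tprotein_coding\t100\t50\t50\t1.5\t0.5\t0.6",
  "ENSG00000000005.6\tTNMD\tprotein_coding\t7\t3\t4\t0.1\t0.05\t0.06"
), mock)

# Same reading strategy as in myrnaMerge(): drop the first 6 lines,
# then use column 1 for gene IDs and column 4 for the counts.
genes  <- read.table(mock, skip = 6)
counts <- genes$V4

# Strip the version suffix from the IDs, as myrnaMerge() does
ids <- vapply(strsplit(as.character(genes$V1), ".", fixed = TRUE),
              function(gene) gene[1], character(1))

print(data.frame(ensemblID = ids, count = counts))
```

With the function defined, the call from the original post becomes `rnaCounts <- myrnaMerge(metadata = metaMatrix.RNA, path = rnadir, organized = FALSE, data.type = 'RNAseq')`.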