olgabot closed this 2 months ago
Whelp, I solved my own problem by actually reading the tximport documentation for RSEM. In case it helps someone else, here's a code snippet to get the gene lengths from the RSEM output of nf-core/rnaseq:
library(tximport)

# Collect the per-sample RSEM gene-level results written by nf-core/rnaseq
files <- Sys.glob('star_rsem/*.genes.results')
# Name each file by its sample ID: drop the directory and the '.genes.results' suffix
names(files) <- sub('\\.genes\\.results$', '', basename(files))
# Import at gene level; txi.rsem$length is a gene-by-sample matrix of lengths
txi.rsem <- tximport(files, type = "rsem", txIn = FALSE, txOut = FALSE)
head(txi.rsem$length)
write.table(txi.rsem$length, 'rsem_gene_lengths.tsv', sep = '\t')
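One thing to watch with the last line: with its default `row.names = TRUE`, `write.table` writes a header that has one field fewer than the data rows, because the row-name column gets no header entry. If the downstream tool expects an explicitly named ID column, a minimal sketch like the following may help. The toy matrix stands in for `txi.rsem$length`, and the column name `gene_id` is an assumption about what the consumer expects, not something confirmed in this thread.

```r
# Toy stand-in for txi.rsem$length (genes x samples); values are made up for illustration
lengths <- matrix(c(1500.2, 1498.7, 820.0, 815.5), nrow = 2, byrow = TRUE,
                  dimnames = list(c("geneA", "geneB"), c("sample1", "sample2")))

# Move the row names into an explicit ID column so the header and the rows
# have the same number of fields ("gene_id" is an assumed column name)
out <- data.frame(gene_id = rownames(lengths), lengths, check.names = FALSE)

# Write a plain TSV without quotes and without the implicit row-name column
write.table(out, "rsem_gene_lengths.tsv", sep = "\t", quote = FALSE, row.names = FALSE)
```

With the real data, replacing `lengths` with `txi.rsem$length` should give one row per gene and one column per sample, plus the ID column.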
Hey Olga, glad you solved the issue! Closing this then 🙂
Description of the bug
Hello, hope you are doing well! I'm wondering how to take advantage of the transcript length feature (https://github.com/nf-core/differentialabundance/pull/203) when using gene counts created by RSEM. I prefer RSEM as an aligner because I find it to be more specific: we had some RNA-seq data with plasmids present only in certain conditions, and with Salmon those plasmids got >0 counts in the conditions WITHOUT the plasmids :( whereas we didn't have the same issue with RSEM. However, I can't find a `*.gene_lengths.tsv` file created by RSEM. How do you advise creating a `--transcript_length_matrix` file from the RSEM data? Or should we use the TPMs in this case? Thank you!