Closed FrAoJm closed 3 years ago
You aren’t specifying the assay name, so it’s providing you the estimated counts (NumReads).
Counts have priority in tximport / tximeta because they are used in statistical modeling, with abundance and length used as an offset.
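To illustrate the point about assay names, here is a minimal sketch on made-up data rather than the real object. tximeta stores the Salmon columns as separate assays named `counts`, `abundance` and `length`. The matrices and the name `se_toy` below are hypothetical stand-ins for the imported object.

```r
library(SummarizedExperiment)

# Toy data (hypothetical): 3 transcripts x 2 samples
counts <- matrix(c(100, 400, 500,
                   300, 300, 400), nrow = 3,
                 dimnames = list(paste0("tx", 1:3), c("s1", "s2")))
eff_len <- matrix(c(1000, 2000, 500,
                    1000, 2000, 500), nrow = 3,
                  dimnames = dimnames(counts))

# TPM: length-normalised rate, rescaled so each sample sums to one million
rate <- counts / eff_len
tpm  <- sweep(rate, 2, colSums(rate), "/") * 1e6

# Same assay layout as a tximeta object: counts first, then abundance, length
se_toy <- SummarizedExperiment(
  assays = list(counts = counts, abundance = tpm, length = eff_len)
)

assayNames(se_toy)
colSums(assay(se_toy))               # no assay name given: first assay (counts)
colSums(assay(se_toy, "abundance"))  # TPM: each column sums to one million
```

With the real object, the same `assay(se, "abundance")` call should return the TPM values, whose column sums match the quant files.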
Thank you, Mike, for the explanation. I am quite new to bioinformatics, so this is very helpful. I still have to understand (digest...) the meaning of the offset, but I will read more about it.
After this, I normalise with the following steps (is this right?):
# Summarise to Gene-level
gse <- summarizeToGene(se)
And then normalise:
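library(edgeR)
y <- makeDGEList(gse)                    # DGEList built from the gene-level object
keep <- filterByExpr(y)                  # flag genes with enough counts to keep
y <- y[keep, , keep.lib.sizes = FALSE]   # recompute library sizes after filtering
y <- calcNormFactors(y)                  # TMM normalisation factors
norm.counts.TMM <- as.data.frame(cpm(y)) # not sure whether log=TRUE or log=FALSE is better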
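On the `log=TRUE` versus `log=FALSE` question, a small sketch on simulated counts may help. It assumes a plain `DGEList` without the offset that `makeDGEList` attaches, and `toy_counts` and `y_toy` are hypothetical names. `cpm()` returns linear-scale values by default. With `log = TRUE` it returns log2 CPM with a small prior count added, which is usually the more convenient scale for clustering, PCA or heatmaps.

```r
library(edgeR)

set.seed(1)
# Toy counts (hypothetical): 200 genes x 4 samples, no grouping information
toy_counts <- matrix(rnbinom(800, mu = 50, size = 2), nrow = 200,
                     dimnames = list(paste0("g", 1:200), paste0("s", 1:4)))

y_toy <- DGEList(counts = toy_counts)
y_toy <- calcNormFactors(y_toy)   # TMM scaling factors

cpm_lin <- cpm(y_toy)                               # linear-scale CPM
cpm_log <- cpm(y_toy, log = TRUE, prior.count = 2)  # log2 CPM with a prior count

dim(cpm_lin)
range(cpm_log)
```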
I used to be more familiar with DESeq2, but I have no groups in my dataset and I couldn't find how to normalise without adding groups (if there is a way, I'm happy to follow that lead... :) ).
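For the DESeq2 route without groups, one option is an intercept-only design (`~ 1`). The sketch below uses simulated counts, and `toy_counts`, `toy_coldata` and `dds_toy` are hypothetical names. With the real gene-level object, the constructor would presumably be `DESeqDataSet(gse, design = ~ 1)` instead.

```r
library(DESeq2)

set.seed(1)
# Toy counts (hypothetical): 200 genes x 4 samples
toy_counts <- matrix(rnbinom(800, mu = 50, size = 2), nrow = 200,
                     dimnames = list(paste0("g", 1:200), paste0("s", 1:4)))
toy_coldata <- data.frame(sample = colnames(toy_counts),
                          row.names = colnames(toy_counts))

# design = ~ 1 is intercept only, so no group column is needed
dds_toy <- DESeqDataSetFromMatrix(countData = toy_counts,
                                  colData = toy_coldata,
                                  design = ~ 1)

dds_toy <- estimateSizeFactors(dds_toy)            # median-of-ratios size factors
norm_counts <- counts(dds_toy, normalized = TRUE)  # counts divided by size factors

sizeFactors(dds_toy)
```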
Thank you so much for your help and the quick response! I have another question about the use of TPM across samples, but that is probably for another issue XD
Kind regards,
Yes, correct. For support-related questions I find it easier to use the Bioc support site: support.bioconductor.org
Most of the GH issues here are feature requests or bug reports.
Thank you, Mike. I will use the Bioc Support site next time, but I really appreciate the quick answer!
Kind regards,
Hi! I am using tximeta to import the abundances from salmon quant files (usually using the GENCODE human transcriptome) and I noticed that the counts in the SummarizedExperiment object look very odd (to my understanding). I checked the TPM column in the quant files and, as expected for TPM, it sums to 10^6. But:
What are these numbers? Am I doing something wrong? Shouldn't they also sum to a million, like the TPM?
Thanks,