Aufiero / circRNAprofiler


Can this tool support TPM as output? #10

Closed ShixiangWang closed 2 years ago

ShixiangWang commented 2 years ago

Hi

TPM is useful for comparing circRNA expression between samples. Does this tool support such a feature? I cannot find it in the vignette.

Best

Aufiero commented 2 years ago

Hi Shixiang,

CircRNAs are mainly detected from RNA-Seq data by computational methods that look only at reads mapping to the back-splice junction, so you do not need TPM as in gene expression analysis, where you have to normalize for the length of the gene. For circRNAs you only need to normalize your reads by sequencing depth. With circRNAprofiler you can use getDeseqRes() or getEdgerRes() to normalize your counts by sequencing depth and perform differential expression analysis. With normalized reads, you can then compare the expression of a circRNA between samples and check whether one circRNA is more highly expressed than another.
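
As a rough illustration of what "normalize by sequencing depth" means, here is a minimal Python sketch that scales back-splice junction (BSJ) counts to counts per million mapped reads. It is a simple CPM-style scaling under stated assumptions, not the normalization that getDeseqRes() or getEdgerRes() run; the function name, the dictionary layout, and the toy numbers are all hypothetical.

```python
def counts_per_million(bsj_counts, library_sizes):
    """Scale raw BSJ read counts by sequencing depth.

    bsj_counts:    {circRNA_id: {sample: raw BSJ read count}}
    library_sizes: {sample: total mapped reads in that library}
    Returns the same nested layout with counts per million mapped reads.
    """
    normalized = {}
    for circ_id, per_sample in bsj_counts.items():
        normalized[circ_id] = {
            # Multiply first so that integer inputs stay exact as long as possible.
            sample: count * 1_000_000 / library_sizes[sample]
            for sample, count in per_sample.items()
        }
    return normalized


# Toy data (invented): sample s2 was sequenced twice as deeply as s1.
bsj_counts = {
    "circA": {"s1": 20, "s2": 40},
    "circB": {"s1": 5, "s2": 5},
}
library_sizes = {"s1": 10_000_000, "s2": 20_000_000}

cpm = counts_per_million(bsj_counts, library_sizes)
for circ_id, per_sample in cpm.items():
    print(circ_id, per_sample)
```

In this toy case circA has twice the raw count in s2 but the same depth-normalized value in both samples, while circB has equal raw counts but a lower normalized value in the deeper library.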

ShixiangWang commented 2 years ago

@Aufiero Thanks for your kind reply. I don't have any experience in circRNA analysis yet. The functions you recommended, getDeseqRes() and getEdgerRes(), are designed for differential expression (DEG) analysis. What if I only want the normalized counts, so that I can analyze and compare a specific circRNA (gene) across sample groups (or even different datasets) in the usual way for gene expression, with a boxplot and t.test/wilcox.test? How should I normalize the data? I checked several papers and found that they mainly name these concepts in their methods but do not mention how to calculate them.

ShixiangWang commented 2 years ago

For example, can the TPM (and relative TPM) mentioned in this paper, https://www.sciencedirect.com/science/article/pii/S2352396417304887, be easily implemented in circRNAprofiler, or what is your best recommendation for my purpose?


Best,

Shixiang

Aufiero commented 2 years ago

The functions getDeseqRes() and getEdgerRes() internally use the R Bioconductor packages DESeq2 and edgeR to normalize your counts before performing differential expression analysis. If you run one of the two functions, you get normalized counts plus the statistics from the differential expression analysis.
In RNA-Seq datasets, expression levels are represented as discrete read counts, so the t.test is not appropriate for differential expression analysis, and different methods need to be used. DESeq2 and edgeR use a negative binomial model to model changes in expression, which is more appropriate for RNA-Seq data.
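
To make the normalization step less of a black box, here is a minimal Python sketch of the median-of-ratios idea that DESeq2 is known for when estimating per-sample size factors. It illustrates the concept only and is not the code that DESeq2 or getDeseqRes() run; the function name, input layout, and toy counts are hypothetical.

```python
import math
from statistics import median


def median_of_ratios_size_factors(counts):
    """Estimate one size factor per sample.

    counts: {feature_id: [raw count in sample 0, sample 1, ...]}
    For every feature with no zero counts, compare each sample's count with
    the feature's geometric mean across samples, then take the per-sample
    median of those ratios (computed on the log scale).
    """
    n_samples = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts.values():
        if any(c == 0 for c in row):
            continue  # the geometric mean would be zero, so skip this feature
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    # statistics.median raises StatisticsError if every feature was skipped.
    return [math.exp(median(r)) for r in log_ratios]


# Toy BSJ counts (invented): sample 1 has about twice the depth of sample 0.
counts = {
    "circA": [10, 20],
    "circB": [50, 100],
    "circC": [7, 14],
    "circD": [0, 3],  # contains a zero, so it does not inform the size factors
}
size_factors = median_of_ratios_size_factors(counts)
normalized = {
    fid: [c / sf for c, sf in zip(row, size_factors)]
    for fid, row in counts.items()
}
```

Dividing each raw count by its sample's size factor gives normalized counts that can be plotted per circRNA; the statistical test itself is a separate step.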

Aufiero commented 2 years ago

For the TPM, you can calculate it for your circRNA and it makes sense if you know all the exons belonging to the circRNA molecule because then you can normalize using the length of the circRNA and not just the back-splice junction. Retrieving the exon composition of a circRNA is not easy; one way is to estimate it.
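
For reference, the TPM arithmetic itself is short once a length is available for each circRNA. The Python sketch below handles one sample and assumes the lengths are estimated full circRNA lengths; the function name and the toy values are hypothetical. Note that a TPM computed over circRNAs alone sums to one million across circRNAs only, which is not the same quantity as a TPM computed over all transcripts in the library.

```python
def tpm(counts, lengths):
    """Transcripts per million for one sample.

    counts:  {circRNA_id: BSJ read count in this sample}
    lengths: {circRNA_id: estimated circRNA length in nucleotides}
    """
    # Step 1: length-normalize each count (reads per nucleotide).
    rates = {cid: counts[cid] / lengths[cid] for cid in counts}
    total = sum(rates.values())
    if total == 0:
        return {cid: 0.0 for cid in counts}
    # Step 2: rescale so that the values sum to one million.
    return {cid: rate / total * 1_000_000 for cid, rate in rates.items()}


# Toy sample (invented): equal counts, but circB is twice as long as circA.
sample_counts = {"circA": 100, "circB": 100}
estimated_lengths = {"circA": 500, "circB": 1000}

sample_tpm = tpm(sample_counts, estimated_lengths)
print(sample_tpm)
```

With equal counts, the shorter circRNA ends up with twice the TPM of the longer one, which is exactly the length effect that depends on getting the exon composition right.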

In circRNAprofiler, TPM calculation is not implemented, since back-splice junction reads are modeled using the state-of-the-art tools DESeq2 or edgeR. If the goal is to check whether a circRNA is differentially expressed, there is no specific need to use TPM.

ShixiangWang commented 2 years ago

@Aufiero Is it sound to directly use the merged gene exon length as the effective length for calculating TPM? I am developing a web tool in which a user can check the expression of a circRNA between specified groups (possibly across multiple datasets) with boxplots, violin plots, etc. If you were me, what would you do? To me, it is impractical to run a DEG analysis with edgeR/DESeq2 for just one circRNA, and it is also not sound to just compare the raw counts.

Thanks in advance :).

Aufiero commented 2 years ago

You could estimate the length of a circRNA as the sum of the lengths of the exons between the back-splice exons, including the back-splice exons themselves. Keep in mind that this estimate might be wrong, especially when the back-splice exons are very far from each other, because splicing events can take place and change the number of exons that end up in the final circRNA molecule.
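
The estimate described above can be sketched in a few lines of Python. This assumes 1-based inclusive exon coordinates on a single chromosome and strand, and that the back-splice junction coordinates coincide with exon boundaries; the function name and the example coordinates are hypothetical. Overlapping exons from different transcripts are merged first so that no base is counted twice.

```python
def estimate_circrna_length(exons, bsj_start, bsj_end):
    """Sum the exon lengths that lie within the back-splice junction span.

    exons:     iterable of (start, end) tuples, 1-based inclusive
    bsj_start: leftmost genomic coordinate of the back-spliced exons
    bsj_end:   rightmost genomic coordinate of the back-spliced exons
    """
    # Keep exons fully contained in the span, back-splice exons included.
    inside = sorted((s, e) for s, e in exons if s >= bsj_start and e <= bsj_end)
    merged = []
    for start, end in inside:
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend it instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return sum(end - start + 1 for start, end in merged)


# Toy gene (invented) with four exons of 100 nt each.
gene_exons = [(100, 199), (300, 399), (500, 599), (700, 799)]
length = estimate_circrna_length(gene_exons, bsj_start=300, bsj_end=599)
print(length)
```

As noted above, this is an upper-bound style estimate: if an internal exon is skipped in the real circRNA, the true length is shorter than the value returned here.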

About your project, I cannot really help; I would need to know more to be able to say something.

ShixiangWang commented 2 years ago

@Aufiero Thank you very much.