mhammell-laboratory / TEtranscripts

A package for including transposable elements in differential enrichment analysis of sequencing datasets.
http://hammelllab.labsites.cshl.edu/software/#TEtranscripts
GNU General Public License v3.0

single cell RNA-seq #148

Status: Closed (zhongguozhiwang closed this issue 8 months ago)

zhongguozhiwang commented 10 months ago

Hi, Thanks to the author for developing such a great tool. I want to analyze transposons in single-cell transcriptome data. Is this tool suitable for single-cell transcriptome data? Thank you!

olivertam commented 10 months ago

Hi,

Thank you for your interest in the software. Unfortunately, TEtranscripts cannot handle single-cell datasets, as it lacks the ability to work with cell barcodes and UMIs. Software that can handle single-cell data is currently in development.

Thanks.

zhongguozhiwang commented 10 months ago

Thank you for your prompt reply. It would be great to be able to work with single-cell transcriptome data! Thanks.

bkutlu commented 9 months ago

Would using the STARsolo option --soloMultiMappers EM work?

Below is an excerpt from the STARsolo preprint, which actually cites your 2015 Bioinformatics paper.

--soloMultiMappers EM uses Maximum Likelihood Estimation (MLE) to distribute multi-gene UMIs among their genes, taking into account other UMIs (both unique- and multi-gene) from the same cell (i.e. with the same CB). Expectation-Maximization (EM) algorithm is used to find the gene expression values that maximize the likelihood function. Recovering multi-gene reads via MLE-EM model was previously used to quantify transposable elements in bulk RNA-seq [18] and in scRNA-seq [7, 8].
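
To make the idea in the excerpt concrete, here is a minimal Python sketch of EM-based redistribution of multi-gene UMIs within a single cell. It is an illustration only, not STARsolo's or TEtranscripts' implementation: the function name `em_distribute`, the input layout (one tuple of compatible gene names per UMI), and the uniform starting point are all assumptions made for this example.

```python
def em_distribute(umis, n_iter=100, tol=1e-9):
    """Distribute UMIs from ONE cell among the genes they are compatible with.

    umis: list of tuples; each tuple holds the distinct gene names that one
          UMI maps to (length 1 = unique-gene UMI, length > 1 = multi-gene UMI).
    Returns a dict of fractional UMI counts per gene.
    Hypothetical helper for illustration; not part of any published tool.
    """
    genes = sorted({g for u in umis for g in u})
    # Assumed initialisation: every gene starts with equal relative abundance.
    theta = {g: 1.0 / len(genes) for g in genes}
    counts = dict.fromkeys(genes, 0.0)

    for _ in range(n_iter):
        # E-step: split each UMI among its genes in proportion to theta.
        counts = dict.fromkeys(genes, 0.0)
        for u in umis:
            total = sum(theta[g] for g in u)
            for g in u:
                counts[g] += theta[g] / total

        # M-step: relative abundance is the share of all UMIs assigned to a gene.
        new_theta = {g: counts[g] / len(umis) for g in genes}
        delta = max(abs(new_theta[g] - theta[g]) for g in genes)
        theta = new_theta
        if delta < tol:
            break

    return counts


if __name__ == "__main__":
    # Three UMIs unique to gene A, one unique to gene B, two shared by both.
    cell = [("A",)] * 3 + [("B",)] + [("A", "B")] * 2
    print(em_distribute(cell))
```

In this toy cell the two shared UMIs are pulled toward gene A because A has more unique-gene support, so the estimates settle at roughly 4.5 UMIs for A and 1.5 for B.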

olivertam commented 9 months ago

Hi,

The methodology does theoretically work, but it requires a complicated rebuild of the index with the TE GTF. In our limited experience (we are still testing the software), we don't see the EM improving the quantification. With single-cell data, you tend to encounter situations where you have to assign either 0 or 1 UMI to a particular TE locus, and the EM algorithm is not optimized to make those kinds of decisions.
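
One possible reading of the 0-or-1 problem (an interpretation for illustration, not the maintainers' analysis) is that a single UMI shared between two loci, with no other evidence in the cell, leaves the likelihood flat, so there is nothing for EM to optimize. The Python sketch below checks this for a toy case; the locus names and the helper `log_likelihood` are hypothetical.

```python
import math


def log_likelihood(theta, umis):
    """Log-likelihood of the UMIs given relative abundances theta.

    Each UMI contributes log(sum of theta over the loci it is compatible with).
    Hypothetical helper for illustration only.
    """
    return sum(math.log(sum(theta[locus] for locus in u)) for u in umis)


def flat_likelihood_demo(splits=(0.1, 0.5, 0.9)):
    """Evaluate the likelihood of one ambiguous UMI under different splits."""
    # A single UMI in the cell, compatible with two (made-up) TE loci.
    umis = [("TE_locus_1", "TE_locus_2")]
    results = {}
    for p in splits:
        theta = {"TE_locus_1": p, "TE_locus_2": 1.0 - p}
        results[p] = log_likelihood(theta, umis)
    return results


if __name__ == "__main__":
    for p, ll in flat_likelihood_demo().items():
        print(f"share for TE_locus_1 = {p}: log-likelihood = {ll:.3g}")
```

Every split gives essentially the same log-likelihood (zero, up to floating-point rounding), so a fractional answer such as 0.5 UMI per locus reflects only the starting point rather than the data, whereas the true answer is 1 for one locus and 0 for the other.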

Thanks.
