10XGenomics / cellranger

10x Genomics Single Cell Analysis
https://www.10xgenomics.com/support/software/cell-ranger

Filtering out unspliced UMIs #153

Closed Chhiring-Lama closed 8 months ago

Chhiring-Lama commented 2 years ago

Hi, I want to run cellranger count on our gene expression data, but I only want to capture spliced transcripts. I understand that not using include-introns simply excludes intronic reads. Is there any way to modify or add arguments to filter out unspliced transcripts?

evolvedmicrobe commented 2 years ago

Hmm, that's pretty advanced. One problem is that it's generally not possible to know from short-read data whether a transcript is fully spliced, because you only observe a portion of the molecule: a read that lies entirely within an exon could come from either a spliced transcript or an unspliced one.
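
To make the ambiguity concrete, here is a minimal sketch that sorts a read into three bins from its aligned blocks. The function name `classify_read`, the interval representation, and the coordinates are all hypothetical illustrations, not anything Cell Ranger exposes, and the sketch assumes a single transcript whose exons are known and a read that already maps inside that gene.

```python
def classify_read(blocks, exons):
    """Classify one read against one transcript (illustrative only).

    blocks: sorted list of (start, end) reference segments the read aligns to,
            half-open coordinates; more than one block means the alignment
            skipped part of the reference.
    exons:  sorted list of (start, end) exon intervals, half-open coordinates.
    """
    # Introns are the gaps between consecutive exons.
    introns = {(exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)}
    # Gaps between consecutive aligned blocks of the read.
    gaps = {(blocks[i][1], blocks[i + 1][0]) for i in range(len(blocks) - 1)}

    # A read gap that matches an annotated intron shows that intron was removed.
    if gaps & introns:
        return "spliced"

    # A block that does not fit inside any single exon covers intronic sequence
    # (assumption: the read lies within the gene, so "not exonic" means intronic).
    for b_start, b_end in blocks:
        if not any(e_start <= b_start and b_end <= e_end for e_start, e_end in exons):
            return "unspliced"

    # Entirely inside one exon: consistent with both spliced and unspliced molecules.
    return "ambiguous"


exons = [(100, 200), (300, 400)]  # hypothetical two-exon transcript
print(classify_read([(150, 200), (300, 350)], exons))
print(classify_read([(180, 230)], exons))
print(classify_read([(120, 170)], exons))
```

The third read is the problematic case: nothing in its alignment says whether the molecule it came from still carried the intron.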

If for some reason you wanted to include only molecules that were definitively spliced, you could generate a reference genome that only included the "splice junctions", so that reads that didn't align to these were removed. Perhaps the folks over at the STAR aligner GitHub have some other ideas (that's the aligner Cell Ranger uses: https://github.com/alexdobin/STAR).
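
A different route to the same goal, sketched here as an assumption rather than a supported Cell Ranger option, is to post-filter the aligned reads instead of building a junction-only reference. In the SAM format, the CIGAR operation `N` marks a skipped reference region, which for RNA-to-genome alignments corresponds to an intron, so a read whose CIGAR contains `N` spans at least one splice junction. The helper name `has_splice_junction` is hypothetical, and the sketch works on CIGAR strings only; reading them out of a BAM file and collapsing reads to UMIs are left out.

```python
import re

# One CIGAR element: a length followed by a single operation code (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def has_splice_junction(cigar):
    """Return True if the CIGAR string contains an N (skipped region) operation."""
    return any(op == "N" for _length, op in CIGAR_OP.findall(cigar))


# Hypothetical CIGAR strings for illustration.
for cigar in ["50M1000N41M", "91M", "10S81M"]:
    print(cigar, has_splice_junction(cigar))
```

Note that this keeps only reads that are definitively spliced; exon-internal reads from spliced molecules are discarded along with the unspliced ones, which is the same trade-off as the junction-only reference.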