Open sameer-aryal opened 8 months ago
Hi @sameer-aryal,
Yes, this is possible. To do this, you'd want to replace the `t2g` or `t2g_3col` file, which is a mapping from transcript to gene (or transcript to gene + splicing status), with a corresponding file that maps transcripts to themselves. You can, of course, decide how you want to handle the splicing status in this case (e.g. consider each merged intronic span as a separate transcript, group them all together into a single intronic supertranscript for the gene, etc.).
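As a rough illustration of that replacement, here is a minimal Python sketch that turns the rows of a tab-separated `t2g` or `t2g_3col` file into a transcript-to-itself mapping. The function names, the example IDs (`tx1`, `geneA-I`) and the `keep_status` switch are hypothetical and not part of simpleaf. Carrying the third column through unchanged is only one of the ways to handle splicing status mentioned above.

```python
import csv


def identity_t2g(lines, keep_status=False):
    """Map every ID in column 1 to itself.

    lines: iterable of tab-separated rows, as found in a 2-column t2g
           file or a 3-column t2g_3col file.
    keep_status: if True and a third column exists, copy it through
           unchanged (an assumption about how one might keep the
           splicing status; other choices are possible).
    """
    out, seen = [], set()
    for row in csv.reader(lines, delimiter="\t"):
        if not row or not row[0]:
            continue  # skip blank lines
        txp = row[0]
        if txp in seen:
            continue  # keep one row per transcript
        seen.add(txp)
        new_row = [txp, txp]
        if keep_status and len(row) >= 3:
            new_row.append(row[2])
        out.append(new_row)
    return out


def write_tsv(rows, path):
    """Write the rows as a tab-separated file at a path of your choosing."""
    with open(path, "w", newline="") as fh:
        csv.writer(fh, delimiter="\t", lineterminator="\n").writerows(rows)


# Hypothetical input rows in the 3-column layout.
example = ["tx1\tgeneA\tS", "tx2\tgeneA\tS", "geneA-I\tgeneA\tU"]
two_col = identity_t2g(example)
three_col = identity_t2g(example, keep_status=True)
```

The resulting file would then be passed to quantification in place of the original mapping, so that counts are aggregated per transcript rather than per gene.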
However, the big caveat here is that while this is easy to do technically, current 3' tag-based protocols are unlikely to give you isoform-level information reliably. This is because they sequence in a strongly biased way from the 3' end of the transcripts, so, at most, you may be able to distinguish families of transcripts that share different terminal exons. Likewise, the per-cell depth of coverage is very low, so there is not much information to help with resolving ambiguous reads (I'm guessing you'd want to use a UMI resolution method in this case that turns on the EM to help avoid losing too many reads to multimapping).
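To make the role of the EM concrete, here is a toy Python sketch of how an EM step can share ambiguous UMIs among the transcripts they are compatible with, instead of discarding them. This is not alevin-fry's implementation: it ignores everything but equivalence-class counts, and the function name and transcript IDs are hypothetical. It also assumes at least one class has a positive count. In alevin-fry itself, `cr-like-em` is documented as one EM-enabled resolution mode.

```python
def em_counts(eq_classes, n_iter=100):
    """Toy EM over equivalence classes.

    eq_classes: dict mapping a tuple of transcript IDs to the number of
                UMIs compatible with exactly that set of transcripts.
    Returns a dict of estimated per-transcript UMI counts.
    """
    txps = sorted({t for labels in eq_classes for t in labels})
    # Start from a uniform abundance estimate.
    theta = {t: 1.0 / len(txps) for t in txps}
    counts = {t: 0.0 for t in txps}
    for _ in range(n_iter):
        counts = {t: 0.0 for t in txps}
        # E-step: split each class's UMIs in proportion to current abundances.
        for labels, n in eq_classes.items():
            denom = sum(theta[t] for t in labels)
            for t in labels:
                counts[t] += n * theta[t] / denom
        # M-step: renormalise the expected counts into abundances.
        total = sum(counts.values())
        theta = {t: c / total for t, c in counts.items()}
    return counts


# 8 UMIs unique to tx1, 2 unique to tx2, 10 compatible with both.
toy = {("tx1",): 8, ("tx2",): 2, ("tx1", "tx2"): 10}
estimates = em_counts(toy)
```

In this toy case the 10 ambiguous UMIs end up split roughly 8:2, following the uniquely assigned evidence. With very sparse per-cell data there is often little unique evidence to guide that split, which is the caveat described above.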
Anyway, we're happy to help you out if you want to give this a try. I'm pinging @DongzeHE so he can chime in here as well if he wants.
Best, Rob
Dear @rob-p,
Thanks very much for the guidance, as well as for creating and maintaining this excellent tool.
…so, at most, you may be able to distinguish families of transcripts that share different terminal exons.
This is exactly the case I wish to use this approach for.
you'd want to replace the `t2g` or `t2g_3col` file, which is a mapping from transcript to gene (or transcript to gene + splicing status), with a corresponding file that maps transcripts to themselves.
I will give this a try; thanks very much again.
I wanted to ask if it was possible to generate a barcode-by-isoform count matrix (instead of gene-level counts) using simpleaf; thanks very much.