GoekeLab / bambu

Reference-guided transcript discovery and quantification for long read RNA-Seq data
GNU General Public License v3.0

Quantification of Transposable elements and/or chimeric TE-gene splice variants #427

Closed: Kiliankleemann closed this issue 5 months ago

Kiliankleemann commented 5 months ago

Dear Bambu team,

Thank you for making such an amazing tool available! I was wondering whether you have come across, or thought about, how bambu could be used to quantify the transcription of transposable elements.

I would greatly appreciate hearing your thoughts and suggestions.

Kilian

andredsim commented 5 months ago

Hi Kilian,

Yes, Bambu should be able to identify and quantify transposable elements. In our paper we used Bambu to quantify isoforms from repetitive HERVH-LTR7 retrotransposons in human embryonic stem cells. Some points to consider: if you are looking for non-spliced TEs, you will need to set opt.discovery = list(min.txScore.singleExon = 0), or some value < 1, to ensure bambu reports them. Also, Bambu uses whichever alignment is provided and ignores secondary alignments, so if there are two highly similar TEs in your sample, the aligner may make mistakes there.

Hope this helps. If you have further questions, feel free to ask, but I will close this for now.

Kind Regards, Andre Sim

Kiliankleemann commented 5 months ago

Hi Andre, thank you so much for the info! I will check it out. Regarding the secondary alignments: are you saying that multi-mappers will not be accounted for? Kind regards, Kilian

andredsim commented 5 months ago

Hi Kilian,

Correct, bambu will use the primary alignment, which should generally be the highest-scoring alignment among the multi-mapping possibilities. Supplementary alignments are still used and are treated as separate reads (but I imagine this is of less relevance for your use case).

Kind Regards, Andre Sim
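
For readers unsure how primary, secondary and supplementary alignments differ, here is a minimal Python sketch of the distinction described above. It is not bambu's implementation. It only uses the FLAG bits defined in the SAM specification (0x4 unmapped, 0x100 secondary, 0x800 supplementary) to show which records would be kept if secondary alignments are ignored and supplementary alignments are counted as separate entries. The function names, the read names and the reference names (TE_copy_A, TE_copy_B, gene_X) are hypothetical and exist only for illustration.

```python
# SAM FLAG bits, as defined in the SAM specification.
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def classify_flag(flag: int) -> str:
    """Return the alignment type encoded in a SAM FLAG value."""
    if flag & UNMAPPED:
        return "unmapped"
    if flag & SECONDARY:
        return "secondary"
    if flag & SUPPLEMENTARY:
        return "supplementary"
    return "primary"


def kept_alignments(sam_lines):
    """Yield (read_name, reference, kind) for primary and supplementary records.

    Secondary and unmapped records are skipped, mirroring the behaviour
    described in the thread (an illustration, not bambu's actual code).
    """
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        name, flag, ref = fields[0], int(fields[1]), fields[2]
        kind = classify_flag(flag)
        if kind in ("primary", "supplementary"):
            yield name, ref, kind


# Hypothetical toy records: read1 multi-maps to two similar TE copies,
# read2 has a supplementary (chimeric) alignment to a gene.
EXAMPLE_SAM = [
    "@HD\tVN:1.6",
    "read1\t0\tTE_copy_A\t100\t60\t50M\t*\t0\t0\t*\t*",
    "read1\t256\tTE_copy_B\t200\t0\t50M\t*\t0\t0\t*\t*",
    "read2\t16\tTE_copy_B\t300\t60\t30M20S\t*\t0\t0\t*\t*",
    "read2\t2064\tgene_X\t900\t60\t30H20M\t*\t0\t0\t*\t*",
]

if __name__ == "__main__":
    for record in kept_alignments(EXAMPLE_SAM):
        print(record)
```

In this toy input, the secondary hit of read1 on TE_copy_B is dropped, so read1 contributes only to TE_copy_A, while both the primary and the supplementary record of read2 are retained as separate entries. If the aligner picks the wrong TE copy as primary, that choice carries through, which is the caveat raised above for highly similar TEs.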