t-neumann / slamdunk

Streamlining SLAM-seq analysis with ultra-high sensitivity
GNU Affero General Public License v3.0

Using DUNK for detection of RNA editing #109

Closed. andreagillespie closed this issue 4 months ago

andreagillespie commented 2 years ago

Hi! I am interested in using DUNK for a non-SLAMseq dataset. I have RNAseq data in which I am looking for RNA editing (including hyper-editing which yields heavily edited reads) by ADAR1/2 which manifests as A>G and T>C conversions. As SLAM-DUNK is intended specifically for SLAMseq conversions (T>C only) I am wondering if it would be possible to use DUNK with a conversion aware scoring scheme that does not penalise for either A>G or T>C. I have only been able to find DUNK as part of SLAM-DUNK, so I am wondering if you could kindly point me in the right direction for using DUNK for quantification of non-SLAMseq conversions. Or if there is a way to do this with SLAM-DUNK that would be great. Thanks for your help!

Andrea

t-neumann commented 2 years ago

Hi, so are you saying you would have both A>G and T>C conversions within a single read, and therefore the algorithm should penalise neither of those two types?

andreagillespie commented 2 years ago

Thank you so much for the timely response! I would only expect one or the other in a single read. I have been told (by the biologist I am analysing the data for) that I need to look for both A>G and T>C conversions. Actually, though, ADAR only edits A>G and therefore manifests T>C as the reverse complement. But, really, I would only have to look for the reverse complement if the data is unstranded, right? If the library is stranded then the reverse reads would be aligned to the antisense strand anyway and, therefore, the T>C conversions in the reverse complement would be sussed out as well. Does that then apply to conversion aware scoring in the reverse complement? If this is the case and I could just align with A>G conversion only, this is still the opposite of what SLAM-DUNK is doing, correct? Forgive my confusion, hyper-editing is new to me and it seems there is a paucity of software or methods for this type of analysis.

t-neumann commented 2 years ago

Hm, so if it is stranded data, and read out in a stranded fashion the edits should show up as A>G, you could actually reverse complement all your reads (thereby turning A>G into T>C) and then use Slamdunk out of the box without needing to rethink your alignment.
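For anyone following along, here is a minimal Python sketch of the reverse-complement step suggested above. It is not part of Slamdunk; the helper names (`reverse_complement`, `reverse_complement_fastq`, `mismatches`) are made up for illustration, and it assumes plain 4-line FASTQ records. The small `mismatches` helper is only there to show that an A>G mismatch becomes a T>C mismatch once both reference and read are reverse complemented.

```python
# Hypothetical helpers for illustration only; not part of Slamdunk or NGM.

# Translation table mapping each base to its complement (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def reverse_complement_fastq(lines):
    """Yield FASTQ lines with every read reverse complemented.

    Assumes strict 4-line records (header, sequence, '+', qualities).
    The quality string is reversed so each score stays with its base.
    """
    record = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            header, seq, plus, qual = record
            yield header
            yield reverse_complement(seq)
            yield plus
            yield qual[::-1]
            record = []


def mismatches(ref: str, read: str):
    """List (position, 'X>Y') for every mismatch between ref and read."""
    return [(i, f"{r}>{q}") for i, (r, q) in enumerate(zip(ref, read)) if r != q]


if __name__ == "__main__":
    ref, read = "AAT", "AGT"  # toy example with one A>G edit
    print(mismatches(ref, read))
    print(mismatches(reverse_complement(ref), reverse_complement(read)))
```

To convert a file, one could feed an open FASTQ handle to `reverse_complement_fastq` and write the yielded lines back out. For paired-end data, how the two mates should be handled after reverse complementing is not covered here and would need checking against the mapper's expectations.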

andreagillespie commented 2 years ago

Ah, yeah, okay. That should work for me. I will give that a go. Thanks so much!

andreagillespie commented 2 years ago

I really appreciate your help with this. I do have an additional question on it. This is actually paired-end data I am working with, which I understand means aligning outside of SLAM-DUNK as this is not possible within the SLAM-DUNK framework. So how could I align using NGM with paired reads while still maintaining NGM's SLAMseq alignment settings? (as this is my point in using this pipeline for analysis). I hope that makes sense. Thanks again!

t-neumann commented 2 years ago

You can look up the ngm command with the exact parameters in the mapper.py dunk script and just replace the regular read input with, I think, -1 for the forward reads and -2 for the reverse reads. Check the NGM documentation for paired-end input to double check this.