tleonardi / nanocompore

RNA modifications detection from Nanopore dRNA-Seq data
https://nanocompore.rna.rocks
GNU General Public License v3.0

ref kmer and position of modification #230

Open keenhl opened 8 months ago

keenhl commented 8 months ago

If a significant change in methylation is detected, at which base pair is this change? Is it the position indicated by the pos column or perhaps the middle of the 5-bp kmer?

lmulroney commented 7 months ago

By default, the position reported in the pos column refers to the first nucleotide of the kmer. You can adjust which position relative to the kmer is reported by changing the context parameter. For RNA002 data, 0 refers to the first position of the kmer (the default), 2 to the middle position, and 4 to the final position; 1 and 3 refer to the positions in between.

g-s-2018 commented 4 months ago

Hi, could you please clarify and share the command for changing the position? Is it --sequence_context {0,1,2,3,4}?

What does 3 mean?

Is the red circle what you mean by position? I want to detect the difference in charge intensity between nucleoside analogues and natural nucleosides. Specifically, I'd like to examine the intensity of the nucleotide at the central position. Is this possible using nanocompore? Thanks in advance.

[Image: Screenshot 2024-07-08 at 21 58 01]

lmulroney commented 4 months ago

Hi @g-s-2018,

Because of the nature of the nanopore, 5 (RNA002) or 9 (RNA004) nucleotides are in the sensitive region of the nanopore at any given moment. The ionic current for a particular position is therefore the result of multiple nucleotides occupying the sensitive region at once; we refer to this group of nucleotides as a kmer. We can assign the ionic current from a kmer to a particular position in several ways.

Generally (but not always), a nucleotide exerts the largest impact on the ionic current when it is in the central position of the kmer, which is closest to the most sensitive region of the nanopore. We model this by assigning the ionic current of the whole kmer to a single position, chosen by an index into the kmer. When 0 is used (default behavior), we use the position of the first nucleotide in the kmer to assign all the ionic current for the entire kmer. This usually means that the actual modified nucleotide is 2 nucleotides downstream of the reported position.

So in your example GAGGT, if you used the default parameters for nanocompore, the most probable modified nucleotide is the central G, but depending on the nature of the modification it can be off by a position or two.
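To make the index-to-position mapping concrete, here is a minimal sketch in plain Python. It is not nanocompore code: the function name assign_position and the start coordinate 100 are hypothetical, chosen only for illustration. It shows which coordinate and which nucleotide the signal of the GAGGT kmer would be assigned to for each index from 0 to 4.

```python
from typing import Tuple


def assign_position(kmer_start: int, kmer: str, offset: int = 0) -> Tuple[int, str]:
    """Return the (coordinate, nucleotide) that a kmer's signal is assigned to.

    kmer_start is the coordinate of the first nucleotide of the kmer.
    offset is the index within the kmer (0 = first nucleotide).
    Hypothetical helper for illustration; not part of nanocompore.
    """
    if not 0 <= offset < len(kmer):
        raise ValueError(f"offset must be between 0 and {len(kmer) - 1}")
    return kmer_start + offset, kmer[offset]


kmer = "GAGGT"
kmer_start = 100  # assumed coordinate of the first G, for illustration only

for offset in range(len(kmer)):
    pos, nt = assign_position(kmer_start, kmer, offset)
    label = " (central)" if offset == len(kmer) // 2 else ""
    print(f"offset {offset}: position {pos}, nucleotide {nt}{label}")
```

With index 0 the signal is reported at the first G (position 100 in this made-up example), while index 2 reports it at the central G (position 102), which is the 2-nucleotide shift described above.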

When using the default parameters, I recommend looking at the ionic current for the kmers from 2 positions upstream through 3 positions downstream of the reported position. These are the kmers in which the modified nucleotide is likely to be in the pore, and you should see an effect in at least one of them, usually more.
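As a rough sketch of that window, the helper below lists the positions to inspect around a reported position. The function name, the 0-based coordinates, and the optional last_pos bound are assumptions made for illustration; they are not taken from nanocompore.

```python
from typing import List, Optional


def positions_to_inspect(
    reported_pos: int,
    upstream: int = 2,
    downstream: int = 3,
    last_pos: Optional[int] = None,
) -> List[int]:
    """List positions from `upstream` before to `downstream` after reported_pos.

    Assumes 0-based coordinates. The window is clipped at 0 and, if given,
    at last_pos (the last valid position on the transcript).
    Hypothetical helper for illustration; not part of nanocompore.
    """
    start = max(0, reported_pos - upstream)
    end = reported_pos + downstream
    if last_pos is not None:
        end = min(end, last_pos)
    return list(range(start, end + 1))


# Example: a significant hit reported at position 100 (made-up value).
print(positions_to_inspect(100))
```

The resulting list can then be used to filter whatever per-position table or signal plot you are working with.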

If the nucleoside analogs are randomly incorporated throughout the strand, nanocompore is not the ideal tool for this problem, because it relies on several molecules all showing a difference in their ionic current profiles at the same position compared to the control sample. If the analogs are randomly dispersed, the signal will be rather difficult for nanocompore to detect. But if the analogs are incorporated at a known position that is the same across molecules, then nanocompore should be able to detect the ionic current differences.

If you're mainly interested in the central position of the kmer, then I recommend changing the --context parameter to 2 instead of 0. This will assign the ionic current for a kmer to the position of the central nucleotide (RNA002) instead of the first position.

We're working on a nanocompore update to RNA004, but it is going slower than anticipated.

I hope that this helps, and if anything is unclear don't hesitate to ask.

Cheers, Logan