tleonardi / nanocompore

RNA modifications detection from Nanopore dRNA-Seq data
https://nanocompore.rna.rocks
GNU General Public License v3.0

A question about the result #217

Open Tomcxf opened 1 year ago

Tomcxf commented 1 year ago

Hi! Thanks for developing nanocompore for detecting RNA modifications in direct RNA-Seq data. I have some questions about the output file. My output looks normal:

pos chr genomicPos  ref_id  strand  ref_kmer    GMM_logit_pvalue    KS_dwell_pvalue KS_intensity_pvalue GMM_cov_type    GMM_n_clust cluster_counts  Logit_LOR
873 NA  NA  TraesCS1A03G0224700.1   NA  GGATG   0.9636307191264709  0.9999969564924992  0.7237053561013628  full    2   WT_1:29/1__WT_2:40/0__mutant_1:41/0__mutant_2:31/0  -0.7351113796589768
894 NA  NA  TraesCS1A03G0224700.1   NA  TGCCA   nan 0.9999969564924992  0.6230859296888032  full    1   NC  NC
895 NA  NA  TraesCS1A03G0224700.1   NA  GCCAA   0.9724399160938189  0.9999969564924992  0.9531035022795072  full    2   WT_1:30/0__WT_2:42/0__mutant_1:45/1__mutant_2:33/0  0.6141587692413149
896 NA  NA  TraesCS1A03G0224700.1   NA  CCAAG   0.9250138233257315  0.9999969564924992  0.3286551634689277  full    2   WT_1:15/15__WT_2:30/13__mutant_1:29/17__mutant_2:27/6   -0.40365187098398375
897 NA  NA  TraesCS1A03G0224700.1   NA  CAAGG   nan 0.5916486177990066  0.5183734953514848  full    1   NC  NC
898 NA  NA  TraesCS1A03G0224700.1   NA  AAGGC   0.9079812410280157  0.9999969564924992  0.9964425319972754  full    2   WT_1:10/20__WT_2:15/27__mutant_1:9/38__mutant_2:11/22   0.4532469535634805
901 NA  NA  TraesCS1A03G0224700.1   NA  GCCTG   nan 0.9459273492344533  0.4516958290503259  full    1   NC  NC
908 NA  NA  TraesCS1A03G0224700.1   NA  TCTGA   0.9823659884224117  0.9999969564924992  0.5079364659616982  full    2   WT_1:3/27__WT_2:3/40__mutant_1:3/44__mutant_2:5/29  -0.1667570402528384

I'm wondering what pos means. I checked the transcriptome file and found that it is a position relative to the transcript. Does pos start from 0, as in Python? I also don't know whether it marks the start of the motif or the location of the modified base. And how can I tell which type of modification is most plausible? Nanocompore aims to detect modifications in general, not one specific kind of modification, but I want to classify them (e.g. m6A, 5mC). Is there any solution? Thanks!

lmulroney commented 1 year ago

Hi @Tomcxf,

If you're using the default context settings, pos is the position of the first nucleotide of the kmer (the ref_kmer column in the results TSV), in 0-based reference coordinates. If you change context to, say, 2, pos is instead the position of the middle nucleotide of the kmer, again in 0-based reference coordinates.
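To make the default-context case concrete, here is a minimal Python sketch that turns a pos value and its ref_kmer into coordinates and checks them against the transcript sequence. The function names kmer_span and kmer_matches_reference are hypothetical helpers written for this illustration, not part of nanocompore, and the sketch assumes pos marks the first nucleotide of the kmer as described above.

```python
def kmer_span(pos: int, kmer: str) -> dict:
    """Coordinates of a kmer whose first nucleotide sits at 0-based `pos`.

    Assumes the default context setting, where pos is the first
    nucleotide of ref_kmer (hypothetical helper, not a nanocompore API).
    """
    end_exclusive = pos + len(kmer)
    return {
        "start0": pos,                    # 0-based, inclusive (Python slicing)
        "end0_exclusive": end_exclusive,  # 0-based, exclusive
        "start1": pos + 1,                # 1-based, inclusive
        "end1": end_exclusive,            # 1-based, inclusive
    }


def kmer_matches_reference(ref_seq: str, pos: int, kmer: str) -> bool:
    """Check that the reference sequence carries `kmer` starting at `pos`."""
    return ref_seq[pos:pos + len(kmer)].upper() == kmer.upper()
```

For the first row of the table in the question (pos 873, ref_kmer GGATG), kmer_span(873, "GGATG") gives the 0-based slice 873:878, which is bases 874 to 878 in 1-based numbering. Running kmer_matches_reference against the transcript sequence from your own FASTA file is a quick way to confirm the convention on your data.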

As to your second question, it really depends on which experimental condition you used as your reference. If the sample you compared against is a METTL3 KO/KD, as in the manuscript, then you would find METTL3-dependent RNA modifications (m6A), and no other modifications would be detected. If you instead KO/KD NSUN2 or NSUN6, then you would find m5C and no other modification.
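In practice, this means the sites to interpret are those that differ significantly between the two conditions. A minimal sketch of pulling them out of a results TSV is below. The helper name significant_sites and the 0.01 threshold are assumptions made for illustration, not nanocompore defaults; the column names are the ones shown in the output above.

```python
import csv
import io
import math


def significant_sites(tsv_text, column="GMM_logit_pvalue", alpha=0.01):
    """Return (ref_id, pos, ref_kmer, p) for rows with p <= alpha.

    Rows whose p-value is missing, 'nan', or non-numeric are skipped.
    Hypothetical helper; alpha is an illustrative threshold.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    hits = []
    for row in reader:
        try:
            p = float(row[column])
        except (KeyError, TypeError, ValueError):
            continue  # column absent, short row, or a non-numeric entry
        if math.isnan(p):
            continue  # test not computed at this position
        if p <= alpha:
            hits.append((row["ref_id"], int(row["pos"]), row["ref_kmer"], p))
    return hits
```

With a results file on disk, you could call it as significant_sites(open(path).read()), where path is wherever your results TSV lives. Which modification the returned sites correspond to still depends on the reference condition, as explained above.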

I hope this helps, Logan