tleonardi / nanocompore

RNA modifications detection from Nanopore dRNA-Seq data
https://nanocompore.rna.rocks
GNU General Public License v3.0

failing to reproduce nanocompore IVT results #119

Closed pabloacera closed 4 years ago

pabloacera commented 4 years ago

Hi,

Thank you in advance for your time. I have tried a couple of times to reproduce the results from the paper with the synthetic oligo following this script:

The versions of the software that I am using are:

The results that I am getting from Nanocompore are:

pos     chr     genomicPos      ref_id  strand  ref_kmer        GMM_logit_pvalue        KS_dwell_pvalue KS_intensity_pvalue     GMM_cov_type    GMM_n_clust     cluster_counts  Logit_LOR
0       NA      NA      FLuc_Control_Plasmid:88-1805    NA      TGAGG   2.487108774100814e-11   2.7271975237022288e-14  1.982682853422405e-12   full    2       mod_1:42/14__unmod_1:202/613    2.159938863696
1       NA      NA      FLuc_Control_Plasmid:88-1805    NA      GAGGA   1.860250732411455e-06   0.0003952944224375441   9.396744652719618e-24   full    2       mod_1:53/9__unmod_1:205/228     1.792244788334
2       NA      NA      FLuc_Control_Plasmid:88-1805    NA      AGGAC   0.08592556734605863     0.34470921508863617     2.1883601950690137e-08  full    2       mod_1:2/64__unmod_1:29/202      -1.16376638384
3       NA      NA      FLuc_Control_Plasmid:88-1805    NA      GGACT   0.0009172417666299403   3.456527323646378e-09   1.9133753522255682e-08  full    2       mod_1:7/63__unmod_1:76/148      -1.41930065758
4       NA      NA      FLuc_Control_Plasmid:88-1805    NA      GACTG   0.009288744180066056    0.5232312316939101      5.624091168615292e-10   full    2       mod_1:71/0__unmod_1:170/40      2.848574629217
5       NA      NA      FLuc_Control_Plasmid:88-1805    NA      ACTGT   0.9374624112117623      0.22220553961104514     0.007869173562807382    full    2       mod_1:66/5__unmod_1:160/14      0.039578986280
6       NA      NA      FLuc_Control_Plasmid:88-1805    NA      CTGTA   0.2924158906352946      1.9897300739994197e-05  0.04005633003735989     full    2       mod_1:63/8__unmod_1:137/30      0.468392025351

I am not sure whether it's a version issue or something else that I am doing wrong. Looking at the cluster counts, I noticed that I have many more unmodified reads than you did. Could I be skipping a filtering step or something?
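To make the read imbalance easier to see, here is a minimal sketch that totals the counts in the `cluster_counts` column for each sample label. It assumes each entry has the form `label:count/count`, that entries are joined by `__`, and that the numbers are per-cluster read counts for that sample. The function names are hypothetical helpers, not part of Nanocompore.

```python
def parse_cluster_counts(field):
    """Split a cluster_counts string into {sample_label: [count, count, ...]}.

    Assumed layout (inferred from the table above, not from Nanocompore docs):
    'mod_1:42/14__unmod_1:202/613'
    """
    counts = {}
    for entry in field.split("__"):
        # rpartition keeps any ':' inside the label intact
        label, _, values = entry.rpartition(":")
        counts[label] = [int(v) for v in values.split("/")]
    return counts


def total_reads(field):
    """Sum the per-cluster counts for each sample label."""
    return {label: sum(v) for label, v in parse_cluster_counts(field).items()}


# cluster_counts value from position 0 in the table above
print(total_reads("mod_1:42/14__unmod_1:202/613"))
# {'mod_1': 56, 'unmod_1': 815}
```

Under that reading, position 0 has 56 reads for `mod_1` against 815 for `unmod_1`.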

Thanks again, and I hope this doesn't cause too much trouble. Cheers, Pablo.

tleonardi commented 4 years ago

Hi Pablo, thanks for getting in touch. The results that you get do make sense, but as you noticed, the p-values don't match the ones I got. As you said, you have far more reads than I did. My impression is that this could be due to the more recent versions of Guppy and Nanopolish that you are using (I used 3.1.5+781ed57 and 0.11.1, respectively). I'll rerun the analysis after updating the software and will let you know.

pabloacera commented 4 years ago

Thank you very much for your quick response!

a-slide commented 4 years ago

Assumed fixed