GoekeLab / xpore

Identification of differential RNA modifications from nanopore direct RNA sequencing
https://xpore.readthedocs.io/
MIT License

results interpretation #30

Closed Huanle closed 11 months ago

Huanle commented 4 years ago

Hi @ploy-np ,

Can you give more details on how to understand the final results in diffmod.table?

ploy-np commented 4 years ago

Hi @Huanle, we are currently updating the documentation, and hopefully more information is coming soon. In the meantime, you can refer to our preprint (https://www.biorxiv.org/content/10.1101/2020.06.18.160010v1).

lingolingolin commented 3 years ago

Hi @ploy-np ,

I have the same question. Your great software has been out for quite a while, but it would be nice to have detailed documentation of the outputs, e.g. those in diffmod.table. I can barely find clues in your manuscript to understand the columns of diffmod.table. I look forward to a detailed explanation of those columns.
Thanks a lot in advance.

ploy-np commented 3 years ago

Hi @Huanle and @lingolingolin,

Sorry for the delay. Here is a short description of diffmod.table. Hope this helps. You will be able to find it at https://xpore.readthedocs.io/en/latest/ soon.

Short description of the xPore output (diffmod.table)

id -- transcript or gene id
position -- transcript or gene position
kmer -- 5-mer where the modified base sits in the middle, if modified
diff_mod_rate_<condition1>_vs_<condition2> -- differential modification rate between condition1 and condition2 (modification rate of condition1 - modification rate of condition2)
z_score_<condition1>_vs_<condition2> -- z score obtained from the z-test of the differential modification rate
pval_<condition1>_vs_<condition2> -- significance level from the z-test of the differential modification rate
mod_rate_<condition>-<replicate> -- modification rate of a replicate in the condition
mu_unmod -- inferred mean of the unmodified RNAs distribution
mu_mod -- inferred mean of the modified RNAs distribution
sigma2_unmod -- inferred sigma^2 of the unmodified RNAs distribution
sigma2_mod -- inferred sigma^2 of the modified RNAs distribution
conf_mu_unmod -- confidence level of mu_unmod compared to the unmodified reference signal
conf_mu_mod -- confidence level of mu_unmod compared to the unmodified reference signal
mod_assignment -- lower if mu_mod < mu_unmod and higher if mu_mod > mu_unmod
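As a rough illustration of how these columns can be read, here is a minimal Python sketch that parses a table and keeps the significant sites. It assumes the file is comma-separated with a header row and that the comparison columns are spelled diff_mod_rate_<condition1>_vs_<condition2>, pval_... and z_score_...; the condition names KO and WT, the helper names load_sites and significant_sites, and all values are made up for the example.

```python
import csv
import io

# Hypothetical excerpt of a diffmod.table for conditions "KO" and "WT".
# The values are invented and the comma delimiter is an assumption.
SAMPLE = """\
id,position,kmer,diff_mod_rate_KO_vs_WT,pval_KO_vs_WT,z_score_KO_vs_WT,mod_assignment
ENST0001,120,GGACT,-0.62,0.0001,-3.9,lower
ENST0001,305,AGACC,0.05,0.61,0.5,higher
ENST0002,88,GGACA,-0.31,0.03,-2.2,lower
"""


def load_sites(handle, cond1, cond2):
    """Parse rows and convert the numeric comparison columns to numbers."""
    suffix = f"{cond1}_vs_{cond2}"
    sites = []
    for row in csv.DictReader(handle):
        sites.append({
            "id": row["id"],
            "position": int(row["position"]),
            "kmer": row["kmer"],
            "diff_mod_rate": float(row[f"diff_mod_rate_{suffix}"]),
            "pval": float(row[f"pval_{suffix}"]),
            "z_score": float(row[f"z_score_{suffix}"]),
            "mod_assignment": row["mod_assignment"],
        })
    return sites


def significant_sites(sites, alpha=0.05):
    """Keep sites with pval <= alpha, largest absolute rate difference first."""
    kept = [s for s in sites if s["pval"] <= alpha]
    return sorted(kept, key=lambda s: abs(s["diff_mod_rate"]), reverse=True)


sites = load_sites(io.StringIO(SAMPLE), "KO", "WT")
for s in significant_sites(sites):
    # A negative diff_mod_rate means the rate is lower in KO than in WT.
    print(s["id"], s["position"], s["kmer"], s["diff_mod_rate"])
```

To use it on a real file, replace io.StringIO(SAMPLE) with an open file handle and pass your own condition names.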

lingolingolin commented 3 years ago

Thanks a lot @ploy-np for your prompt reply.

So, if a site meets the conditions below:

z_score_unmodified_sample - z_score_modified_sample <0 
pval_unmodified_sample_vs_modified_sample <=0.05

Will it be a positive detection in the modified sample? Were mu_mod and mu_unmod computed with reads from both conditions? By differential modification rate, do you mean that the modification rate differs between conditions/samples? In your paper, it seems reads from both conditions/samples were mixed. Is that right? Was a t-test (selectable through the program flags) also performed to compare modification rates between conditions/samples?

Thanks very much!

ploy-np commented 3 years ago

Hi @lingolingolin,

Based on the two criteria you gave (z-score and pval), it should be a positive detection. Another criterion you can use to keep only a single type of modification is mod_assignment. You can count how many sites have the same direction of modified signal (lower or higher). For example, if 90% of the GGACT sites are lower, you can filter out the higher GGACT sites if you are interested in only a single modification type at GGACT, i.e. m6A.
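The direction-based filtering described above could be sketched as follows. This is only an illustration, not part of xpore: the helper names majority_direction and keep_majority, the 0.9 cutoff, and the site records are hypothetical, and the choice to leave k-mers without a dominant direction untouched is an assumption.

```python
from collections import Counter, defaultdict

# Hypothetical significant sites: nine "lower" GGACT sites, one "higher"
# GGACT site, and two AGACC sites with no dominant direction.
sites = (
    [{"kmer": "GGACT", "mod_assignment": "lower"} for _ in range(9)]
    + [{"kmer": "GGACT", "mod_assignment": "higher"}]
    + [{"kmer": "AGACC", "mod_assignment": "higher"},
       {"kmer": "AGACC", "mod_assignment": "lower"}]
)


def majority_direction(sites, min_fraction=0.9):
    """Map each k-mer to its dominant direction, if one reaches min_fraction."""
    counts = defaultdict(Counter)
    for s in sites:
        counts[s["kmer"]][s["mod_assignment"]] += 1
    dominant = {}
    for kmer, counter in counts.items():
        direction, n = counter.most_common(1)[0]
        if n / sum(counter.values()) >= min_fraction:
            dominant[kmer] = direction
    return dominant


def keep_majority(sites, min_fraction=0.9):
    """Drop sites that go against the dominant direction of their k-mer.

    Sites of k-mers without a dominant direction are kept unchanged.
    """
    dominant = majority_direction(sites, min_fraction)
    return [
        s for s in sites
        if s["kmer"] not in dominant or dominant[s["kmer"]] == s["mod_assignment"]
    ]


filtered = keep_majority(sites)
print(len(sites), "sites before,", len(filtered), "after filtering")
```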

You understand the definition of differential modification rate correctly. However, as described in the paper, reads from both conditions/samples are not mixed. Rather, xpore infers unmodified and modified distributions that are mathematically shared among all conditions/samples, while inferring sample-specific modification rates.
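To make the idea of shared distributions with sample-specific rates concrete, here is a toy Python sketch. It is not xpore's inference procedure: it holds the two Gaussians fixed and uses a plain EM update to estimate only the mixing weight (the modification rate) of each sample. The function names, the simulated signal values, and the sample labels WT and KO are all invented for the illustration.

```python
import math
import random


def normal_pdf(x, mu, sigma2):
    """Density of a Gaussian with mean mu and variance sigma2."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)


def estimate_mod_rate(signals, mu_unmod, mu_mod, sigma2_unmod, sigma2_mod, n_iter=50):
    """EM for the mixing weight only; both Gaussians are fixed and shared."""
    rate = 0.5
    for _ in range(n_iter):
        resp = []
        for x in signals:
            p_mod = rate * normal_pdf(x, mu_mod, sigma2_mod)
            p_unmod = (1 - rate) * normal_pdf(x, mu_unmod, sigma2_unmod)
            resp.append(p_mod / (p_mod + p_unmod))  # P(read is modified | x)
        rate = sum(resp) / len(resp)  # updated modification rate
    return rate


def simulate(rng, n_reads, true_rate, mu_unmod, mu_mod, sigma):
    """Draw reads from the two shared Gaussians with a sample-specific rate."""
    return [
        rng.gauss(mu_mod if rng.random() < true_rate else mu_unmod, sigma)
        for _ in range(n_reads)
    ]


# Shared parameters (made-up values), used for every sample.
MU_UNMOD, MU_MOD, SIGMA = 100.0, 110.0, 2.0
rng = random.Random(0)
wt_reads = simulate(rng, 500, 0.70, MU_UNMOD, MU_MOD, SIGMA)
ko_reads = simulate(rng, 500, 0.05, MU_UNMOD, MU_MOD, SIGMA)

# Each sample gets its own rate; the reads are never pooled.
rate_wt = estimate_mod_rate(wt_reads, MU_UNMOD, MU_MOD, SIGMA**2, SIGMA**2)
rate_ko = estimate_mod_rate(ko_reads, MU_UNMOD, MU_MOD, SIGMA**2, SIGMA**2)
print("WT rate:", round(rate_wt, 2), "KO rate:", round(rate_ko, 2))
print("difference (WT - KO):", round(rate_wt - rate_ko, 2))
```

The point of the sketch is that the two samples share mu and sigma^2 but each receives its own rate, and the difference between those rates plays the role of the differential modification rate.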

Which t-test do you mean?

lingolingolin commented 3 years ago

Hi @ploy-np ,

Thanks for your detailed explanation. By t-test, I mean what you have in your demo yml file:

method:
    prefiltering:
        method: t-test
        threshold: 0.1

ploy-np commented 3 years ago

Hi @lingolingolin,

With this config, a t-test is performed at the prefiltering step. That is, to speed up the overall process, xpore will not model positions where the mean intensity does not differ between conditions according to the t-test.

Without this config, xpore will model every position.

Please note that for the results in our preprint, we applied xpore without this pre-filtering step.
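The prefiltering idea can be sketched roughly as below. This is not xpore's code: it assumes that threshold is a p-value cutoff and that a position is modelled only when the p-value falls below it. It uses Welch's t statistic with a normal approximation for the p-value (which is only reasonable with many reads), so that the example needs nothing beyond the standard library. The function names and intensity values are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    se2 = variance(a) / len(a) + variance(b) / len(b)
    return (mean(a) - mean(b)) / math.sqrt(se2)


def approx_pvalue(t):
    """Two-sided p-value from a normal approximation to the t distribution."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


def should_model(a, b, threshold=0.1):
    """Assumed rule: model the position only if the means appear to differ."""
    return approx_pvalue(welch_t(a, b)) < threshold


# Made-up signal intensities at one position in two conditions.
cond1 = [100.1, 99.8, 100.3, 99.9, 100.0, 100.2, 99.7, 100.1]
cond2_same = [100.0, 100.2, 99.9, 100.1, 99.8, 100.3, 99.9, 100.0]
cond2_shifted = [x + 3.0 for x in cond1]

print("similar means, model?", should_model(cond1, cond2_same))
print("shifted means, model?", should_model(cond1, cond2_shifted))
```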

lingolingolin commented 3 years ago

Thanks a lot, @ploy-np. This is truly helpful.