Closed by mmiladi 4 years ago
Hi @mmiladi,
A single replicate will not give you reliable results. The data is very noisy, and replicating your experiment is the best way to get rid of a lot of this variance.
That said, if you wish to proceed anyway, in our hands the best method for m6A is the GMM logit with a sequence context of 2 (--logit --sequence_context 2).
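For reference, here is a minimal sketch of how the suggested flags might be assembled into a full sampcomp call. Only sampcomp, --comparison_methods, --sequence_context and --logit appear in this thread. The other flags (--file_list1, --file_list2, --label1, --label2, --fasta, --outpath), the GMM,KS value, all file paths, and the helper name build_sampcomp_command are assumptions for illustration, so please check them against `nanocompore sampcomp --help` for your installed version.

```python
import shlex


def build_sampcomp_command(file_list1, file_list2, fasta, outpath,
                           label1="WT", label2="IVT",
                           comparison_methods="GMM,KS",
                           sequence_context=2, logit=True):
    """Assemble a nanocompore sampcomp argument list (hypothetical helper)."""
    cmd = [
        "nanocompore", "sampcomp",
        # Input/output flags below are assumed, not taken from this thread.
        "--file_list1", ",".join(file_list1),
        "--file_list2", ",".join(file_list2),
        "--label1", label1,
        "--label2", label2,
        "--fasta", fasta,
        "--outpath", outpath,
        # Statistical testing options discussed in this thread.
        "--comparison_methods", comparison_methods,
        "--sequence_context", str(sequence_context),
    ]
    if logit:
        cmd.append("--logit")
    return cmd


# Hypothetical paths: one replicate per condition, as in the question.
cmd = build_sampcomp_command(
    file_list1=["./wt_rep1_eventalign_collapse.tsv"],
    file_list2=["./ivt_rep1_eventalign_collapse.tsv"],
    fasta="./transcripts.fa",
    outpath="./sampcomp_results",
)

# Print a shell-ready command line; pass cmd to subprocess.run to execute it.
print(shlex.join(cmd))
```

Building the command as a list keeps paths with spaces safe and makes it easy to rerun with a different sequence context when comparing settings.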
Now, I don't know which mod you are interested in, and it could be that other settings will be better for you. It is hard to know in advance to be honest.
Best Ad
Hi @a-slide ,
Thanks for the feedback. I'll follow the suggested parameters; in particular, my current sequence context of zero looks like an important one to change. I am not targeting a specific methylation at this stage and would like to identify potential sites of any modification.
Regarding the p-value metrics, is there a (strong) dependence on read coverage depth? Looking at the p-values (GMM_logit, KS_dwell and KS_intensity) alongside the coverage graph in my data, it "feels" like high p-values are reported wherever there is a drop in the IVT coverage. Of course, it could be a true signal at the transcript 3'/5' ends or a systematic bias of the ONT data.
Best, M
Yes, there is a dependence on read coverage, particularly for the KS methods; the GMM method does a better job. We benchmarked Nanocompore against other tools, including Tombo, and Nanocompore has (by far) much better control of p-value inflation. We have been thinking of "correcting" p-values by read coverage, but it feels wonky and we have not found a proper statistical method to do it so far. Any suggestions? As a compromise, we might include a peak-calling method in the next release of Nanocompore to de-noise and refine the positions of mods (see https://github.com/tleonardi/nanocompore/issues/95).
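To make the coverage dependence concrete, here is a minimal sketch of a two-sample Kolmogorov-Smirnov test in plain Python. It is not Nanocompore's implementation. The Gaussian signal model, the 0.5 SD shift, the read counts, and the function names are all assumptions for illustration, and the p-value uses the asymptotic Kolmogorov series with a small-sample correction, so it is only approximate at low coverage.

```python
import math
import random


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Advance both pointers past x so ties are handled consistently.
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_pvalue(d, n, m):
    """Approximate two-sided p-value from the asymptotic Kolmogorov series."""
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d  # small-sample correction
    if lam < 0.2:
        return 1.0  # the series is numerically 1 here
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                for k in range(1, 101))
    return min(1.0, max(0.0, 2.0 * total))


def simulate(coverage, shift=0.5, seed=0):
    """Compare two simulated signal samples with the same true shift."""
    rng = random.Random(seed)
    control = [rng.gauss(0.0, 1.0) for _ in range(coverage)]
    treated = [rng.gauss(shift, 1.0) for _ in range(coverage)]
    d = ks_statistic(control, treated)
    return d, ks_pvalue(d, coverage, coverage)


# Same underlying difference, different read depth per condition.
for coverage in (20, 200, 2000):
    d, p = simulate(coverage)
    print(f"coverage={coverage:5d}  D={d:.3f}  p={p:.3g}")
```

With the true shift held fixed, the p-value is driven largely by the number of reads, which is why p-values at positions with very different coverage are hard to compare directly. This sketch only shows that general effect; it does not by itself explain the pattern seen where IVT coverage drops.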
Hi,
I have data for two conditions with only one replicate each, and I am comparing them with sampcomp. Which test would you recommend for the comparison? Also, based on your experience, which of the metrics provides more reliable results? And as a last question, would you recommend changing the default settings for identifying the modified sites, specifically any of the "Statistical testing options" arguments (--comparison_methods, --sequence_context, --sequence_context_weights and --logit)?
Thanks and best, Milad