jts / nanopolish

Signal-level algorithms for MinION data
MIT License

methylated_frequency is 1.000 only #1123

Closed: sachingadakh closed this issue 5 months ago

sachingadakh commented 5 months ago

Hello Sir, I am running methylation calling on my 10-13X ONT datasets. I filtered the methylation calls with log_lik_ratio >= 2 (and also tried > 0), then calculated the methylated frequency on the filtered calls. I am getting 1.000 for all sites in the samples. I have two samples, a daughter and a father, and my goal is to perform differential methylation analysis, but since the methylation frequency is 1.000 for all sites in both samples there is no variation to compare.
I am sharing screenshots of the first 10 rows of the filtered methylation calls and of the methylation frequency for one of the samples. I can share the whole files as well if you prefer. Thank you

jts commented 5 months ago

Hi,

If you filter log_lik_ratio to only have positive values then all calls will be methylated. You should filter so that the absolute value of log_lik_ratio is greater than your threshold.
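
To illustrate the point above, here is a minimal sketch of why a positive-only filter pins the frequency at 1.000. The log_lik_ratio values are made up, `methylated_fraction` is a hypothetical helper, and the frequency is taken simply as methylated calls over retained calls, which is an assumption and not necessarily what any nanopolish script does internally.

```python
# Hypothetical log_lik_ratio values for reads covering one CpG site.
llrs = [5.1, 3.2, -4.0, 0.7, -2.6, -0.3, 2.4]
THRESHOLD = 2.0


def methylated_fraction(values):
    """Fraction of retained calls with a positive log_lik_ratio."""
    if not values:
        return None  # no confident calls left at this site
    return sum(1 for v in values if v > 0) / len(values)


# Positive-only filter: every surviving call is methylated by construction.
positive_only = [v for v in llrs if v >= THRESHOLD]

# Absolute-value filter: confident calls in both directions survive.
absolute = [v for v in llrs if abs(v) >= THRESHOLD]

print(methylated_fraction(positive_only))
print(methylated_fraction(absolute))
```

With the first filter the unmethylated reads are thrown away before the frequency is computed, so nothing can pull the value below 1.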

Jared

sachingadakh commented 5 months ago

Hello Sir, I am a little confused here, and I am sorry for the naive questions. I had also used a threshold of >= 2 (as well as > 0) to filter log_lik_ratio for both samples before calculating frequencies and running the differential methylation analysis. In some of the issues here you have suggested >= 2 as the filter threshold. I am still getting methylated frequencies of 1.000 for all samples.

jts commented 5 months ago

So, sites with log_lik_ratio > 2.0 are called as methylated. Sites with log_lik_ratio < -2.0 are called as unmethylated. If you threshold so that you only have calls with log_lik_ratio > 2.0 you will only have methylated calls. If you filter so that abs(log_lik_ratio) >= 2.0 then you will have both methylated and unmethylated calls.
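
The rule above could be applied per site roughly as follows. This is only a sketch: the rows are made up, the table is cut down to a few columns, `site_frequencies` is a hypothetical helper, and it ignores details such as calls that span several CpGs.

```python
import csv
import io
from collections import defaultdict

# Simplified, made-up rows; real call files carry more columns than this.
EXAMPLE_TSV = """chromosome\tstart\tend\tread_name\tlog_lik_ratio
chr1\t100\t100\tread1\t4.5
chr1\t100\t100\tread2\t-3.1
chr1\t100\t100\tread3\t0.8
chr1\t100\t100\tread4\t2.2
chr1\t250\t250\tread1\t-5.0
chr1\t250\t250\tread2\t-2.0
"""


def site_frequencies(handle, threshold=2.0):
    """Per-site methylated fraction, keeping calls with abs(llr) >= threshold."""
    counts = defaultdict(lambda: [0, 0])  # site -> [methylated, called]
    for row in csv.DictReader(handle, delimiter="\t"):
        llr = float(row["log_lik_ratio"])
        if abs(llr) < threshold:
            continue  # ambiguous call, neither methylated nor unmethylated
        site = (row["chromosome"], int(row["start"]), int(row["end"]))
        counts[site][1] += 1
        if llr > 0:
            counts[site][0] += 1
    return {site: m / called for site, (m, called) in counts.items()}


freqs = site_frequencies(io.StringIO(EXAMPLE_TSV))
for site, freq in sorted(freqs.items()):
    print(site, f"{freq:.3f}")
```

Because negative calls are kept, a site can land anywhere between 0 and 1, which is what a differential methylation comparison between two samples needs.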

sachingadakh commented 5 months ago

Yes sir, I understand and agree with what you explained. So I have to keep the unmethylated calls as well, using abs(log_lik_ratio) >= 2.0 as the threshold, to calculate methylation frequency and ultimately differential methylation, right?