Open mkpython3 opened 3 years ago
Hi,
Sorry for the delay in replying. NA implies that the status (hyper/hypo) of the DMRs could not be interpreted. I would suggest visualising a few DMRs. But if every status is "NA", I would say the data does not have enough power to identify the DMRs with confidence.
Thanks, Akanksha
Hey, thank you for your response.
So do I understand it correctly: HOME first discovers a potential DMR in that region, but then it does not find enough confidence to report it. The DMR is written to the output file anyway, but with all columns NA. I just think it is a little weird, since it only happened in the CHH context and a confidence score is still reported. Maybe the DMR should not appear in the output at all then?
Best regards Marius
Hi Marius,
Yes. For time-series data, HOME will try to find the differences between all the samples and will report the region of difference and its confidence (delta). It then checks for a consistent delta in the same direction (hyper/hypo) across the possible pairwise comparisons. If it's not able to find that, it will report NA. You are right, it's a bit weird to get 'NA' in all the samples. I guess it's just telling you that the data does not have enough coverage for CHH methylation identification. Have you tried to visualise these DMRs in IGV or the UCSC browser?
Thanks, Akanksha
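The check described above can be illustrated with a minimal sketch. This is not HOME's actual implementation: the function names, the `min_delta` and `min_agree` thresholds, the mapping of `Comb` column numbers to sample pairs, and the convention that "hyper" means the second sample is more methylated than the first are all assumptions made for illustration. It only shows how a region can carry a large delta while every pairwise comparison still ends up as NA.

```python
from itertools import combinations


def pair_status(a, b, min_delta=0.1, min_agree=0.8):
    """Classify one pairwise comparison over the positions of a region.

    a, b: per-position methylation levels (0..1), None where there is no coverage.
    Thresholds are hypothetical, not HOME parameters.
    """
    # Only positions covered in both samples can contribute a difference.
    diffs = [y - x for x, y in zip(a, b) if x is not None and y is not None]
    # Keep differences that are large enough to count.
    strong = [d for d in diffs if abs(d) >= min_delta]
    if not strong:
        return "NA"
    # Fraction of strong differences that point upwards.
    up = sum(d > 0 for d in strong) / len(strong)
    if up >= min_agree:
        return "hyper"
    if 1 - up >= min_agree:
        return "hypo"
    # Large differences, but no consistent direction.
    return "NA"


def region_status(samples, **kwargs):
    """Return a status per sample pair, keyed Comb1..Combn (assumed numbering)."""
    pairs = combinations(range(len(samples)), 2)
    return {
        f"Comb{n}": pair_status(samples[i], samples[j], **kwargs)
        for n, (i, j) in enumerate(pairs, 1)
    }
```

With this sketch, `region_status([[0.1, 0.9, 0.1, 0.9], [0.9, 0.1, 0.9, 0.1]])` gives `{"Comb1": "NA"}`: every position differs by 0.8, but half of the differences go up and half go down, so no direction can be assigned.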
Hey Akanksha
I have visualized a few DMRs now using a Python script. For some of the "All NA" DMRs it is quite obvious why they are NA, because there is hardly any data in the range of the DMR. But for others I think there should be enough data that coverage should not be an issue. I have attached a few plots below. Here the size of the dots represents the coverage and the color (blue to pink) the level of methylation.
Thanks a lot for your help! Marius
Hi Marius,
Thanks for sharing the plots. It's hard to visualise it like this. I think it would be better to have an IGV or a UCSC plot in the standard format, which will also show the direction of methylation. Please have a look at the HOME paper for reference.
Thanks, Akanksha
Hey Akanksha,
I managed to visualize the methylation levels in IGV. However, I don't know how to reproduce the plots with the methylation difference like in the HOME paper. And by the way, what do you mean by "direction of methylation"? Do you include an extra track with the DMR information? Because the methylation files only range from 0 to 1 and not from -1 to 1. Or is there an option in IGV to produce a methylation difference track from the others?
Best regards, Marius
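One way to get a signed track outside of IGV is to compute the difference between two bedGraph files beforehand. The following is only a sketch under several assumptions: both files use the four-column bedGraph layout (chrom, start, end, value) with levels between 0 and 1, they share identical intervals, and only intervals present in both samples are kept. The function names are hypothetical and not part of HOME or IGV.

```python
def read_bedgraph(lines):
    """Parse bedGraph lines into {(chrom, start, end): value}."""
    levels = {}
    for line in lines:
        # Skip blank lines and the optional header/comment lines.
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end, value = line.split()[:4]
        levels[(chrom, int(start), int(end))] = float(value)
    return levels


def diff_bedgraph(lines_a, lines_b):
    """Return bedGraph lines holding (sample B - sample A) per shared interval.

    Values range from -1 to 1: positive means B is more methylated than A.
    """
    a = read_bedgraph(lines_a)
    b = read_bedgraph(lines_b)
    shared = sorted(a.keys() & b.keys())
    return [
        f"{chrom}\t{start}\t{end}\t{b[(chrom, start, end)] - a[(chrom, start, end)]:.4f}"
        for chrom, start, end in shared
    ]
```

Both arguments can be open file handles, for example `diff_bedgraph(open("sample1.bedgraph"), open("sample2.bedgraph"))` with placeholder file names. The resulting lines can be written to a new bedGraph file and loaded as an additional track.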
Hi Marius,
You can upload the wig files from BSseeker2 for the samples and visualise the DMRs in IGV. I think this should give us an idea of what the issue is.
Thanks, Akanksha
Hey Akanksha,
Thank you for your response. I am using Bismark for my analysis and merged the + and - strands for all chromosomes, so I have bedGraph files that go from 0 to 1 instead of wig files that go from -1 to 1. It should not make a huge difference anyway. I converted those bedGraph files to .tdf files for IGV and made a screenshot of a 100% NA DMR. I hope this helps?
Best regards, Marius
Hey, I am using HOME-timeseries with 23 samples (no replicates) and I just noticed that most of the DMRs reported by HOME have "NA" in every Comb1-n column, even though max_delta and confidence_scores are reported.
An example looks like this:
Is this expected behavior? How can this happen?
I am happy to provide more information if necessary. In the meantime I will check whether this also happened for other contexts. The input files are about 273M (uncompressed), so let me know how to submit them if they are needed. Thanks a lot in advance.
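To quantify how many reported DMRs are affected, something like the following sketch could be used. It assumes the HOME-timeseries output is a tab-separated file with a header row and that the pairwise status columns all start with `Comb`; the exact layout is an assumption based on the column names mentioned above, and `count_all_na` is a hypothetical helper.

```python
import csv


def count_all_na(lines, prefix="Comb"):
    """Count DMR rows whose pairwise status columns are all 'NA'.

    lines: an iterable of tab-separated lines, header first (e.g. an open file).
    Returns (rows_with_all_na, total_rows).
    """
    reader = csv.DictReader(lines, delimiter="\t")
    # Assumed naming: every pairwise comparison column starts with the prefix.
    comb_cols = [c for c in (reader.fieldnames or []) if c.startswith(prefix)]
    total = 0
    all_na = 0
    for row in reader:
        total += 1
        if comb_cols and all(row[c] == "NA" for c in comb_cols):
            all_na += 1
    return all_na, total
```

Running it once per context (CG, CHG, CHH) would show whether the all-NA rows are limited to CHH.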