smithlabcode / methpipe

A pipeline for analyzing DNA methylation data from bisulfite sequencing.
http://smithlabresearch.org/methpipe

hmr fails on sorted CpG's #195

Closed. cb4github closed this issue 2 years ago

cb4github commented 2 years ago

Dear All,

Using methpipe version 5.0.0, I'm getting the following result from hmr.

$ hmr -v -o CM1-1.hmr CM1-1.CpG.unique.sorted.formatted.bam.meth
[reading methylation levels]
error: input is not properly sorted: CM1-1.CpG.unique.sorted.formatted.bam.meth

I've tried explicitly sorting the CpGs output from symmetric-cpgs, with the same result.

$ LC_ALL=C sort -k 1,1V -k 2,2n -o sorted.CM1-1.CpG.unique.sorted.formatted.bam.meth CM1-1.CpG.unique.sorted.formatted.bam.meth
$ diff sorted.CM1-1.CpG.unique.sorted.formatted.bam.meth CM1-1.CpG.unique.sorted.formatted.bam.meth
$ <diff is empty>
$ hmr -v -o CM1-1.hmr sorted.CM1-1.CpG.unique.sorted.formatted.bam.meth
[reading methylation levels]
error: input is not properly sorted: sorted.CM1-1.CpG.unique.sorted.formatted.bam.meth

Please advise, thanks. Best, CB

guilhermesena1 commented 2 years ago

Hello,

Thank you for sharing the error. We are working on relaxing the condition that chromosome names need to be sorted alphabetically.

That said, can you try sorting it as follows and see if it fixes the issue? Thank you!

$ LC_ALL=C sort -k 1,1 -k 2,2n -k3,3 -o sorted.CM1-1.CpG.unique.sorted.formatted.bam.meth CM1-1.CpG.unique.sorted.formatted.bam.meth
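To illustrate why the two sort commands in this thread can disagree, here is a minimal sketch on a made-up four-line input (the chromosome names and positions are hypothetical, not taken from the reporter's file). It assumes GNU sort, since `-V` is a GNU extension, and it does not reproduce hmr's actual check; it only shows that a file ordered by version sort (`-k 1,1V`) is not necessarily in plain byte order (`-k 1,1`), which `sort -c` can confirm.

```shell
# Hypothetical toy input: chromosome <TAB> position (not real data).
tmp=$(mktemp)
printf 'chr2\t100\nchr10\t50\nchr1\t200\nchr1\t30\n' > "$tmp"

# Version sort on column 1 (GNU sort -V) treats the digits as numbers.
version_order=$(LC_ALL=C sort -k 1,1V -k 2,2n "$tmp" | cut -f 1 | uniq | paste -sd ' ' -)

# Plain byte-order sort on column 1 compares the names character by character.
byte_order=$(LC_ALL=C sort -k 1,1 -k 2,2n "$tmp" | cut -f 1 | uniq | paste -sd ' ' -)

echo "version order: $version_order"
echo "byte order:    $byte_order"

# sort -c exits non-zero if the input is not already in the requested order,
# so it can be used to test a file before passing it to a downstream tool.
LC_ALL=C sort -k 1,1V -k 2,2n "$tmp" > "$tmp.v"
if LC_ALL=C sort -c -k 1,1 -k 2,2n "$tmp.v" 2>/dev/null; then
  check="sorted"
else
  check="not sorted"
fi
echo "version-sorted file under byte-order check: $check"

rm -f "$tmp" "$tmp.v"
```

On this toy input, version sort places chr2 before chr10, while byte order places chr10 before chr2. That would be consistent with the empty diff reported above: a file can already be in version order and still fail a check that expects alphabetical chromosome names.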

cb4github commented 2 years ago

Dear @guilhermesena1,

Many thanks for your prompt reply. hmr ran to completion with alphabetically sorted chromosomes.

For future benefit, it would help to update the manual (dated July 26, 2021 as of this writing) accordingly. Thanks.

Best, CB

moqri commented 2 years ago

Thanks for the hint. Is this condition relaxed in version 5.0.1 ?

guilhermesena1 commented 2 years ago

It should be, yes.

Also, please note that all of our development of methpipe (including the past handling of this bug) is now on dnmtools.