Closed by moqri 1 year ago
Digging deeper, I think I found the source of the issue:
It seems that the roi command counts the CpGs in each region from the sites reported in the count file, not from the reference. So if the count file does not include a CpG, roi does not count it in column 10. Maybe a clarification in the docs would be helpful?
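To make the effect concrete, here is a minimal Python sketch of counting sites per region from the rows of a count file. This is not the dnmtools implementation: the function name `count_sites_in_region`, the tuple layouts, and the coordinates are all made up for illustration. It only shows why a count file that omits uncovered CpGs gives a smaller per-region count than one that lists every CpG.

```python
from typing import Iterable, Tuple

Site = Tuple[str, int]         # hypothetical layout: (chromosome, 0-based position)
Region = Tuple[str, int, int]  # hypothetical layout: (chromosome, start, end), half-open


def count_sites_in_region(sites: Iterable[Site], region: Region) -> int:
    """Count the sites listed in a count file that fall inside the region.

    Only rows present in `sites` are counted; the reference genome is
    never consulted, so sites missing from the file are invisible here.
    """
    chrom, start, end = region
    return sum(1 for c, pos in sites if c == chrom and start <= pos < end)


region = ("chr1", 100, 200)

# Count file that lists every CpG, including ones with no coverage.
full = [("chr1", 110), ("chr1", 150), ("chr1", 190), ("chr1", 250)]

# Count file from a tool that dropped the uncovered CpG at position 150.
sparse = [("chr1", 110), ("chr1", 190), ("chr1", 250)]

full_count = count_sites_in_region(full, region)
sparse_count = count_sites_in_region(sparse, region)
print(full_count, sparse_count)
```

The region is the same in both calls, yet the two counts differ because the second file has one fewer row inside it.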
Solution for others who might encounter the same issue when using count data from other sources:
If you are creating your count data using wgbs_tools beta2bed, use "--keep_na" for consistency with dnmtools.
Thanks for this, @moqri. It is indeed something we would want to clarify. We will also try to put safeguards in place for such things.
Describe the bug roi -M returns different values for "number of CpGs in the region" for different count files
To Reproduce run roi -M on the same HMR file (with one HMR region) using two different count files
Expected behavior The same column 10 values (the number of CpGs in a region should depend only on the HMR region, if I understand correctly)
Desktop (please complete the following information): Linux 7
Additional context dnmtools/1.2.2