al2na / methylKit

R package for DNA methylation analysis
https://bioconductor.org/packages/release/bioc/html/methylKit.html
203 stars 96 forks source link

regioncounts lost some regions #321

Closed leon945945 closed 3 months ago

leon945945 commented 4 months ago

Hi, I took use of regioncounts to calculate methylation degree at some regions:

   seqnames            ranges strand
     <Rle>         <IRanges>  <Rle>

11 V-P1 194583-194983 + 12 V-P1 194587-194987 + 97 V-P1 5826342-5826742 + 147 V-P1 9902204-9902604 + 151 V-P1 10642778-10643178 +

but after regioncounts some regions were lost:

methylBase object with 150 rows

chr start end strand coverage1 numCs1 numTs1 coverage2 numCs2 numTs2 1 V-P1 5826342 5826742 + 547 522 25 378 368 10 2 V-P1 10642778 10643178 + 22 10 12 40 1 39

As you can see, the first, second and fourth region were lost in methylBase object. I want to know what is the probable reason, and can the lost regions be restored? Thanks very much.

alexg9010 commented 4 months ago

HI @leon945945 ,

regionCounts will drop regions not overlapping with any site from your methylBase object.

You could try to figure out which regions are not kept by converting the summarized methylBase to GRanges and comparing it to your original GRanges with setdiff.

Best, Alex

leon945945 commented 3 months ago

Thank you for your reply.

I have another question about calculating methylation precentage of one region.

Let's say there is a 100bp region, each nucleotide has coverage 4, the region has 10 Cytosine sites, and 3 of which have 3,2,4 reads were methylated respectively. Then the methylation percentage is sum(3,2,4)/(4*10) or sum(3,2,4)/(4*3) ? Or other calculation methods?

Hope for your reply.

alexg9010 commented 3 months ago

You can actually check the code. It will be the first calculation, taking the sum of all methylated reads divided by all reads for that region.

Best, Alex

leon945945 commented 3 months ago

Thanks, the code is clear.