PacificBiosciences / pbbioconda

PacBio Secondary Analysis Tools on Bioconda. Contains list of PacBio packages available via conda.
BSD 3-Clause Clear License

Certain reads with missing methylation modification tags #704

Open tobybaker opened 1 month ago

tobybaker commented 1 month ago

I am looking at CpG methylation probabilities from the jasmine-processed example data HG002.GRCh38.haplotagged.bam (downloaded from here).

When looking at the reads that overlap a given CpG site, I see that a minority of reads have no methylation ML/MM tag information for any CpGs on the read.

It appears that pb-CpG-tools counts every CpG on these reads as unmethylated when using the count pileup mode, but I am unsure how to treat these reads without the associated probabilities.

What is the cause of these reads and how should they be handled?

armintoepfer commented 1 month ago
  Empty MM/ML tag            No methylation sites.
  Missing MM/ML tag, np = 0  No kinetic input.
  Missing MM/ML tag, np > 0  Insufficient passes.

That means we couldn't process the read.
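
The three cases above can be told apart per read with a small check. The following is a minimal sketch, not part of jasmine or pb-CpG-tools: the function name `classify_methylation_tags` and its return labels are hypothetical, and it assumes the read's tags have already been collected into a plain dict (for example with `dict(read.get_tags())` if you use pysam). It also assumes that an "empty" tag means an empty `MM` string or an empty `ML` array, which may not match how every file encodes it.

```python
from typing import Any, Mapping


def classify_methylation_tags(tags: Mapping[str, Any]) -> str:
    """Map one read's tags onto the cases listed in the table above.

    `tags` is a plain dict of BAM tag name -> value (hypothetical input
    format; build it however your BAM reader exposes tags).
    """
    if "MM" in tags and "ML" in tags:
        # Assumption: "empty" means an empty MM string or an empty ML array.
        if not tags["MM"] or len(tags["ML"]) == 0:
            return "no methylation sites"
        return "methylation calls present"

    # MM/ML missing (or only one of the pair present): fall back to the
    # number of passes stored in the PacBio `np` tag.
    num_passes = tags.get("np")
    if num_passes is None:
        return "unknown (no np tag)"
    if num_passes == 0:
        return "no kinetic input"
    return "insufficient passes"


# Toy tag dictionaries for illustration only; these are not real reads.
example_reads = {
    "read_a": {"MM": "C+m,3,1;", "ML": [250, 12], "np": 9},
    "read_b": {"MM": "", "ML": [], "np": 8},
    "read_c": {"np": 0},
    "read_d": {"np": 2},
}

for name, read_tags in example_reads.items():
    print(name, "->", classify_methylation_tags(read_tags))
```

Reads that land in either missing-tag category carry no methylation call at all, so a downstream analysis can use a check like this to set them aside instead of interpreting them as unmethylated.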

tobybaker commented 1 month ago

Thank you for the clarification.

Could you please also clarify why reads with a low number of passes and missing MM/ML tags appear to be added to the unmethylated counts when running pb-CpG-tools in count mode?