PacificBiosciences / MethBat

A battery of methylation tools for PacBio HiFi reads
BSD 3-Clause Clear License

input regions #9

Closed wittney-m closed 3 months ago

wittney-m commented 3 months ago

methbat profile \
    --input-prefix {IN_PREFIX} \
    --input-regions {IN_REGIONS} \
    --output-region-profile {OUT_PROFILE}

How would I get the input regions for a non-model organism that is not available in the UCSC browser tables?

holtjma commented 3 months ago

Hello,

You can provide any regions you want to the tool and it will extract those regions accordingly. Given that you don't have defined regions of interest, there are two options I would probably recommend:

  1. Attempt to create your own CpG island set - I believe UCSC has guidance on how the model organism sets were created. I do not remember the details, but I believe the method looked at CpG density over a window, likely with some other heuristics. As long as you have a reference genome, you should be able to create your own set following that same approach. There may also be organism-specific filters/approaches for defining these that I'm unaware of.
  2. Take an agnostic approach
    • You could generate fixed-size windows/tiles across the whole genome and then give those to methbat profile. These would not necessarily be tied to a genomic feature, but you could post-process the output to find windows/tiles of interest.
    • You could also use segmentation instead via methbat segment. This would just tell you which regions were (un)methylated without pre-defined regions.
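
For anyone who wants to try the first two ideas, here is a minimal Python sketch, not part of MethBat itself. It assumes chromosome sizes are already known (for example, read from a reference .fai index) and that the regions file can be BED-style, tab-separated chrom/start/end; the exact format that --input-regions expects should be checked against the MethBat documentation. The function names (tile_genome, cpg_stats, to_bed) are hypothetical. cpg_stats computes the GC fraction and the observed/expected CpG ratio for a window; the commonly cited CpG island thresholds (GC content of at least 50%, observed/expected above 0.6, length above 200 bp) are an assumption here and should be checked against the UCSC track description.

```python
from typing import Dict, Iterable, Iterator, Tuple

Region = Tuple[str, int, int]


def tile_genome(chrom_sizes: Dict[str, int], window: int) -> Iterator[Region]:
    """Yield (chrom, start, end) tiles in 0-based, half-open coordinates."""
    if window <= 0:
        raise ValueError("window must be positive")
    for chrom, size in chrom_sizes.items():
        for start in range(0, size, window):
            # The last tile on each chromosome is clipped to the chromosome end.
            yield chrom, start, min(start + window, size)


def cpg_stats(seq: str) -> Tuple[float, float]:
    """Return (GC fraction, observed/expected CpG ratio) for one window."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c, g = seq.count("C"), seq.count("G")
    observed = seq.count("CG")
    # Expected CpG count given the number of Cs and Gs in the window.
    expected = c * g / n
    ratio = observed / expected if expected > 0 else 0.0
    return (c + g) / n, ratio


def to_bed(tiles: Iterable[Region]) -> str:
    """Format regions as BED-style lines (assumed input format, see above)."""
    return "".join(f"{chrom}\t{start}\t{end}\n" for chrom, start, end in tiles)


if __name__ == "__main__":
    # Hypothetical chromosome sizes, for illustration only.
    sizes = {"chr1": 2500, "chr2": 1000}
    print(to_bed(tile_genome(sizes, 1000)), end="")
```

The tiles could be passed to methbat profile as-is, or first filtered with cpg_stats against the reference sequence to keep only CpG-dense windows.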

Hope this helps!
Matt

holtjma commented 3 months ago

Closing this for now; feel free to re-open if this did not resolve your issue.