Closed yplee614 closed 1 year ago
Hi @yplee614,
there is no function to merge dmcs to dmrs within a certain range, but you could look at the function described here: https://github.com/al2na/methylKit/issues/193 , which clusters dmcs into dmrs by the same unsupervised segmentation algorithm we use for methSeg().
Best, Alex
Hi Alex,
Could you add this function into the methykit package? Many thanks.
Best, yplee
I mean to merge two DMRs into one DMR, when they are with 100bp in genome. No merge dmcs to dmrs. Thank you.
Ah Okay,
there is also no function to merge dmrs within a certain range, but I guess one could do this with findOverlaps() and some checks to only merge dmrs with similar meth.diff .
@katwre do you maybe already have a function to do this?
hi @yplee614 and @alexg9010 , Sorry, I don't have a function for that. What I would do is:
your.granges.object=as(yourmethdiffobject, "GRanges")
reduce
function with min.gapwidth=100 to merge your DMRs that are in 100 bp proximity (my.red=reduce(your.granges.object, min.gapwidth=100)
)meth.diff
column) or other meta columns that you are interested in from your DMRs that got merged in my.red
, first use findOverlaps(your.granges.object, my.red)
and then stats::aggregate
function (or any other function to aggregate a column by a group from e.g. dplyr or data.table R packages)Hope it helps, Kasia
hi @yplee614 and @alexg9010 , Sorry, I don't have a function for that. What I would do is:
- convert your methylDiff object to GRanges
your.granges.object=as(yourmethdiffobject, "GRanges")
- use
reduce
function with min.gapwidth=100 to merge your DMRs that are in 100 bp proximity (my.red=reduce(your.granges.object, min.gapwidth=100)
)- if you'd like to aggregate average methylation differences (
meth.diff
column) or other meta columns that you are interested in from your DMRs that got merged inmy.red
, first usefindOverlaps(your.granges.object, my.red)
and thenstats::aggregate
function (or any other function to aggregate a column by a group from e.g. dplyr or data.table R packages)Hope it helps, Kasia
Thank you for your great contribution.
I have tried the method you metioned above. However, I got the my.red
without the information of differential methylation which is contained in your.granges.object
. Could you please tell me how to solve it. Thank you very much!
Hi @Biomamba,
Kasia's third point outlines how to get this done:
overlaps <- findOverlaps(your.granges.object, my.red)
signal <- your.granges.object[overlaps@queryHits]
my.red$mean.meth.diff <- aggregate(your.granges.object$meth.diff, list(overlaps@subjectHits), mean)
Best, Alex
If this help, I created a R function that merged overlapping DMRs and average methyl.diff
values to return a tibble in bed format. You can also tweak it to get a GRanges object at the end (just add the methyl.diff value in a metacolumn).
merge_DMRs <- function(methylDiff_object){
require("valr")
require("tidyr")
require("GenomicRanges")
# Convert methylDiff into GRange
gr <- as(methylDiff_object, "GRanges")
# Convert GR to dataframe in bed format
# Rename seqnames to chrom and convert to char otherwise error from valr
# Note Grange is 1-based and bed is 0-based so one needs to subtract 1 to start position
df <- data.frame(chrom=as.character(seqnames(gr)),
start=as.integer(start(gr)-1),
end=as.integer(end(gr)),
name=c(rep(".", length(gr))),
score=as.double(gr$meth.diff))
# Merge first DMRs and keep average methyl.diff value.
df_merged <- df %>% valr::bed_merge(score=mean(score))
# Now add name to each DMR
len <- dim(df_merged)[[1]]
name_DMRs <- paste("DMR", seq(1, len, 1), sep="")
df_merged$name <- name_DMRs
# Reorder columns
df_merged_ordered <- df_merged %>%
dplyr::select(chrom, start, end, name, score)
return(df_merged_ordered)
}
How to merge the DMR within 100bp or other size?