LTLA / InteractionSet

Clone of the Bioconductor repository for the InteractionSet package; see https://bioconductor.org/packages/devel/bioc/html/InteractionSet.html for the official development version.

Reduce() equivalent in InteractionSet? #7

Closed rdali closed 5 years ago

rdali commented 5 years ago

Hello,

I would like to merge overlapping InteractionSet rows to get a minimum number of interacting pairs; essentially, the equivalent of the reduce() function in GenomicRanges. Is there a way to do this for InteractionSets? For example:

    seqnames1   ranges1     seqnames2   ranges2
        <Rle> <IRanges>         <Rle> <IRanges>
[1]      chrB  [10, 15] ---      chrA  [20, 25]
[2]      chrB  [10, 28] ---      chrA  [15, 25]
[3]      chrA  [ 9, 18] ---      chrA  [77, 94]

Would result in:

    seqnames1   ranges1     seqnames2   ranges2
[1]      chrB  [10, 28] ---      chrA  [15, 25]
[2]      chrA  [ 9, 18] ---      chrA  [77, 94]

Thanks!

LTLA commented 5 years ago

Yes. Not as easily as one might expect, because it's a bit tricky to think in two-dimensional terms. But it can be done without too many tears. Let's set up an example:

library(InteractionSet)
gr1 <- GRanges(c("chrB:10-15", "chrB:10-28", "chrA:9-18"))
gr2 <- GRanges(c("chrA:20-25", "chrA:15-25", "chrA:77-94"))
gi <- GInteractions(gr1, gr2)

Now to do it - the purpose of each step is left as an exercise for the reader:

olap <- findOverlaps(gi)
edges <- as.vector(t(as.matrix(olap)))
g <- igraph::make_graph(edges)
comp <- igraph::components(g)$membership
boundingBox(gi, comp)
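
For anyone who would rather skip the exercise: as far as I can tell, the snippet treats each interaction as a graph node and each two-dimensional overlap as an edge, then uses connected components as the merge groups. Here is a minimal sketch of just the graph step. The hit pairs are written by hand for illustration (an assumption, not actual findOverlaps() output), so it only needs igraph:

```r
# Hypothetical overlap hits for three interactions, written by hand:
# interactions 1 and 2 overlap each other, 3 overlaps only itself.
hits <- cbind(queryHits   = c(1, 1, 2, 2, 3),
              subjectHits = c(1, 2, 1, 2, 3))

# make_graph() takes a flat vector c(from1, to1, from2, to2, ...),
# so transpose before flattening to keep each pair adjacent.
edges <- as.vector(t(hits))

# Overlap is symmetric, so an undirected graph is enough.
g <- igraph::make_graph(edges, directed = FALSE)

# Interactions in the same connected component get the same group ID.
comp <- igraph::components(g)$membership
print(comp)
```

Interactions 1 and 2 end up in one group and interaction 3 in another. That grouping vector is what gets passed to boundingBox() as its second argument.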

I suppose I could formally add this to InteractionSet somewhere, but reduce() by itself seems a bit useless without also getting comp to tell you which entries of gi go into the reduced interactions.
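
In the meantime, a rough sketch of what such a wrapper might look like is below. The name reduceInteractions is hypothetical and not part of InteractionSet; it returns the grouping vector alongside the merged interactions. I have not checked how boundingBox() behaves when a group mixes interactions whose anchors are on different chromosome pairs or in swapped order, so treat it as a starting point only:

```r
library(InteractionSet)

# Hypothetical helper, NOT part of the InteractionSet API.
reduceInteractions <- function(gi) {
    # Self-overlaps: which interactions overlap in both anchors.
    olap <- findOverlaps(gi)
    edges <- as.vector(t(as.matrix(olap)))

    # n = length(gi) keeps one node per interaction, even if an
    # interaction somehow had no hits at all.
    g <- igraph::make_graph(edges, n = length(gi), directed = FALSE)
    comp <- igraph::components(g)$membership

    # Return the merged boxes plus the group of each original entry.
    list(reduced = boundingBox(gi, comp), membership = comp)
}

gr1 <- GRanges(c("chrB:10-15", "chrB:10-28", "chrA:9-18"))
gr2 <- GRanges(c("chrA:20-25", "chrA:15-25", "chrA:77-94"))
out <- reduceInteractions(GInteractions(gr1, gr2))
out$reduced
out$membership
```

Here out$membership[i] gives the group that entry i of the input went into, which in turn identifies the entry of out$reduced it was merged into.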

In any case, I have big plans for the entire Hi-C-related infrastructure, so it will have to wait.

rdali commented 5 years ago

Thanks Aaron! I would not have cracked it alone. It would certainly be useful to wrap this up as a function at some point and add it to InteractionSet.