LTLA / archive-InteractionSet

An archived version of the InteractionSet repository, see https://github.com/LTLA/InteractionSet for the active version.

findInteractions() #5

Closed lawremi closed 8 years ago

lawremi commented 8 years ago

Is there a convenience function for constructing a GInteractions object from a findOverlaps() call? It would be a simple wrapper.

LTLA commented 8 years ago

No such constructor exists at the moment. Would it be a common use case? Forming a GInteractions object out of a findOverlaps call seems a bit odd; you'll only be dealing with very local interactions where the anchor regions are directly overlapping each other. More abstractly, the concept of converting from Hits to a GInteractions object might be useful, but I don't know how many people would store their interaction pairings as a Hits in the first place (and if they did, it would be easy enough to go through the standard constructors using query/subject indices).
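For reference, going through the standard constructors might look like the sketch below. It assumes the GenomicRanges and InteractionSet packages are installed, uses toy ranges `a` and `b` made up for illustration, and subsets each input by the query/subject indices before handing the two parallel GRanges to GInteractions().

```r
library(GenomicRanges)
library(InteractionSet)

# Toy inputs, invented for illustration only.
a <- GRanges("chr1", IRanges(c(1, 20, 50), width = 10))
b <- GRanges("chr1", IRanges(c(5, 55), width = 10))

# Find overlapping (query, subject) index pairs.
hits <- findOverlaps(a, b)

# Build a GInteractions from two parallel GRanges, one element per hit.
gi <- GInteractions(a[queryHits(hits)], b[subjectHits(hits)])
gi
```

Every resulting "interaction" pairs two overlapping anchors, which is the very local case described above.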

lawremi commented 8 years ago

There are many use cases for GInteractions. People often ask us how to find the intersection for every overlap between two sets of ranges. I think GInteractions could give a simple syntax: have findInteractions() or findOverlapPairs() yield a GInteractions, then compute the intersection with pintersect(). That gives:

pintersect(findOverlapInteractions(a, b))

Instead of:

hits <- findOverlaps(a, b)
pintersect(a[queryHits(hits)], b[subjectHits(hits)])

I had to edit the second block of code after posting, which shows how error-prone it is.

liz-is commented 8 years ago

I can see the use case for an easy constructor for two parallel GRanges from a Hits object, but I agree with Aaron that this would be a rare use case when dealing with chromatin interaction data.

I think this depends on the scope of the package -- do we want to focus on interaction data, or expand to consider that a GInteractions object could be used to store non-interaction data? If the latter, the class name might not be appropriate...

lawremi commented 8 years ago

The class is definitely generally useful. I have been thinking about a GRangesPairs class for a while. Maybe introduce a parent class named GPairs above GInteractions? Most of the basic functionality would move to that class.

LTLA commented 8 years ago

I agree with Liz, in that I'd prefer to keep GInteractions focused on handling interaction data. Otherwise, users would find themselves taking a detour through "genomic interaction land" even if their analyses had nothing to do with analyzing genomic interactions. While the syntax might be cleaner, I can imagine that there would be some potential for conceptual confusion as to what's going on.

That said, once we work out what to do, I'm open to having code shuffled around, e.g., into a GRangesPairs superclass. Or maybe an explicit wrapper method operating on Hits to identify the union/intersection GRanges would be a more direct solution to the current problem.
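As a rough sketch of what such an explicit wrapper could look like: `pintersectHits()` below is a hypothetical name, not a function in any package, and the toy ranges are invented for illustration. It only bundles the existing queryHits()/subjectHits()/pintersect() calls.

```r
library(GenomicRanges)

# Hypothetical helper (not part of InteractionSet or GenomicRanges):
# returns the intersection of each overlapping query/subject pair.
pintersectHits <- function(hits, query, subject) {
    pintersect(query[queryHits(hits)], subject[subjectHits(hits)])
}

# Toy inputs, invented for illustration only.
a <- GRanges("chr1", IRanges(c(1, 20, 50), width = 10))
b <- GRanges("chr1", IRanges(c(5, 55), width = 10))

hits <- findOverlaps(a, b)
ov <- pintersectHits(hits, a, b)
ov
```

This keeps the index bookkeeping in one place without routing the user through a GInteractions object.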

lawremi commented 8 years ago

I didn't realize that the package was specific to physical interactions; I was thinking more abstractly. Thinking more deeply about it, I'm not sure GInteractions is exactly what I want, because in general one would want to pair two distinct GRanges, potentially with different mcols() and maybe from different genomes. GInteractions is essentially a graph model, where the nodes are all in the same GRanges. That's too constrained for my use cases. We could add a Pairs object to S4Vectors that just pairs up two Vectors. Via delegation, many operations could be supported without specialization. Closing this; sorry for the noise.
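For readers arriving at this thread later, here is a minimal sketch of the pairing idea. It assumes a Bioconductor installation in which S4Vectors exports a Pairs() constructor with first() and second() accessors; check this against your installed version. The toy ranges and their metadata columns are invented for illustration.

```r
library(S4Vectors)
library(GenomicRanges)

# Two distinct GRanges with different metadata columns (toy data).
a <- GRanges("chr1", IRanges(c(1, 20, 50), width = 10),
             score = c(0.1, 0.5, 0.9))
b <- GRanges("chr1", IRanges(c(5, 55), width = 10),
             name = c("x", "y"))

hits <- findOverlaps(a, b)

# Pair up the overlapping elements; each side keeps its own mcols().
p <- Pairs(a[queryHits(hits)], b[subjectHits(hits)])

# Operate on the two parallel sides of the pairs.
ov <- pintersect(first(p), second(p))
ov
```

Unlike GInteractions, the two sides here are stored as separate vectors rather than as indices into one shared set of regions.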