chaimleib / intervaltree

A mutable, self-balancing interval tree. Queries may be by point, by range overlap, or by range containment.
Apache License 2.0
636 stars 108 forks source link

Finding all intervals of a certain length #95

Open kdorsel opened 4 years ago

kdorsel commented 4 years ago

Is there a way to find all intervals that are at least x in length. For example t.findIntervals(minLength=None, maxLength=None)

t = IntervalTree()
t[0:10] = {}
t[20:25] = {}
t[30:50] = {}
t[60:71] = {}
t.findIntervals(15) => [Interval(30, 50)]
t.findIntervals(10) => [Interval(0, 10), Interval(30, 50), Interval(60, 71)]
t.findIntervals(10, 10) => [Interval(0, 10)]
t.findIntervals(10, 15) => [Interval(0, 10), Interval(60, 71)]
kdorsel commented 4 years ago

After some looking around I found I'm able to get using the built in filter. For my use case, I'd sacrifice memory for quick lookup of Interval length. I will play around with this some more.

def findInterval(tree, minLength=None, maxLength=None):
    if minLength and maxLength:
        l = lambda x: minLength <= x.length() <= maxLength
    elif minLength:
        l = lambda x: minLength <= x.length()
    elif maxLength:
        l = lambda x: x.length() <= maxLength
    else:
        raise('')
    return list(filter(l, tree))
chaimleib commented 4 years ago

Every data structure has pros and cons; there is no single data structure that can answer all possible queries in optimum time. Unfortunately, intervaltree cannot answer your length filter query in faster than linear time. You might want to use a sorted list to get answers in log(n) time. Or if linear time is ok for you, a comprehension like this can help save memory:

[i for i in t if minLength <= i.length() <= maxLength]