Superbird11 / ranges

Continuous Range, RangeSet, and RangeDict data structures for Python
MIT License
101 stars 10 forks source link

POC for range intersection and isdisjoint #17

Closed root-11 closed 2 years ago

root-11 commented 2 years ago

I took some inspiration from your repo and thought it would be kind to return the favour ;-)

My use case:

I have a lot data in paginated memory blocks. Each page is generally of a block size but can vary due to inserts and deletes. My challenge: Determine whether the contents of a page needs to be loaded to answer a users query for slice.

Simplistic example:

page = range(500,700,1)
user = range(10,1000,30)
userdata = []
for row in intercept(page,user):
      userdata.append(row)

Here is the solution:

def intercept(A,B):
    """
    A: range
    B: range
    returns range as intercept of ranges A and B.
    """
    assert isinstance(A, range)
    if A.step < 0: # turn the range around
        A = range(A.stop, A.start, abs(A.step))
    assert isinstance(B, range)
    if B.step < 0:  # turn the range around
        B = range(B.stop, B.start, abs(B.step))

    boundaries = [A.start, A.stop, B.start, B.stop]
    boundaries.sort()
    a,b,c,d = boundaries
    if [A.start, A.stop] in [[a,b],[c,d]]:
        return range(0) # then there is no intercept
    # else: The inner range (subset) is b,c, limited by the first shared step.
    A_start_steps = math.ceil((b - A.start) / A.step)
    A_start = A_start_steps * A.step + A.start

    B_start_steps = math.ceil((b - B.start) / B.step)
    B_start = B_start_steps * B.step + B.start

    if A.step == 1 or B.step == 1:
        start = max(A_start,B_start)
        step = B.step if A.step==1 else A.step
        end = c
    else:
        intersection = set(range(A_start, c, A.step)).intersection(set(range(B_start, c, B.step)))
        if not intersection:
            return range(0)
        start = min(intersection)
        end = max(c, max(intersection))
        intersection.remove(start)
        step = min(intersection) - start

    return range(start, end, step)

tests

def test_range_intercept():
    A = range(500,700,3)
    B = range(520,700,3)
    C = range(10,1000,30)

    assert intercept(A,C) == range(0)
    assert set(intercept(B,C)) == set(B).intersection(set(C))

    A = range(500_000, 700_000, 1)
    B = range(10, 10_000_000, 1000)

    assert set(intercept(A,B)) == set(A).intersection(set(B))

    A = range(500_000, 700_000, 1)
    B = range(10, 10_000_000, 1)

    assert set(intercept(A,B)) == set(A).intersection(set(B))

Speak later ;-)

root-11 commented 2 years ago

published here in any case: https://github.com/root-11/root-11.github.io/blob/master/content/comparing_ranges.ipynb

Superbird11 commented 2 years ago

Thanks for sharing, but I don't especially see how this is useful for this module.

Range calculations are a lot easier when using continuous ranges as this module provides, as compared to discrete ranges like python's range - all ranges.Range.intersection() and ranges.RangeSet.intersection() need to do is compare endpoints, whereas with discrete ranges you do actually need to find all individual overlapping elements.

root-11 commented 2 years ago

Right. Sorry for not being explicit. My mistake. Let's say we have 10s or 100s of disconnected RangeSets that may or may not intersect with another collection of RangeSets. To answer the problem of whether these sets intersect you can use the code I wrote. Kind regards B

On Sat, 19 Mar 2022 at 02:04, Louis Jacobowitz @.***> wrote:

Thanks for sharing, but I don't especially see how this is useful for this module.

Range calculations are a lot easier when using continuous ranges as this module provides, as compared to discrete ranges like python's range - all ranges.Range.intersection() and ranges.RangeSet.intersection() need to do is compare endpoints, whereas with discrete ranges you do actually need to find all individual overlapping elements.

— Reply to this email directly, view it on GitHub https://github.com/Superbird11/ranges/issues/17#issuecomment-1072918495, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA64MJXBQO6WNFSDC6T64ULVAUY3HANCNFSM5QVO5KHA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you authored the thread.Message ID: @.***>

-- Bjorn Madsen, Ph.D., MIET @.*** Ph.: (+44) 0 7792 030 720 Flat 7, 49B Spylaw Street, Colinton, Edinburgh, EH13 0JT