Open jeffwu78 opened 2 years ago
The datafunc
exists to return what the data
field of the resulting interval(s) should be. The result interval has not been finalized yet at the time the datafunc
is called, and so it is not provided. However, its bounds can be determined by combining the arguments given with knowledge of the boundary values provided to chop
(via lexical scoping).
iv
as provided in your quoted snippet is one of the original intervals that are getting sliced, in case this is needed to build the return value.
islower
is a boolean that says whether the resulting interval is less than lower bound provided to chop
. If so, it is True
; otherwise, islower
is False
and the resulting interval is greater than the upper bound provided to chop
. In the snippet, I used it to help select the boundary value of iv
that will be obliterated in the chop in oldlimit = iv[islower]
.
Note that if the chop
boundaries both lie within an interval to be chopped, datafunc
will be called twice with the same iv
: once with islower = True
and once with islower = False
.
The interval data field in my application is a time series, and in my datafunc function, I want to cut out the part of the time series not belonging to chop interval. The problem is the datafunc function does not have access to the endpoints of the chop interval needed to do this.
Although now that I think of it, there is a work-around to the problem. I have an "extended" datafunc function also takes in the chop interval endpoints, and then use a lambda function to pass in these endpoints and create the desired datafunc for the chop function.
Found this when I had the same question for .slice()
, so I'm spelling out the above solutions by @chaimleib and @jeffwu78 which work the same way for .chop()
by integrating the second parameter.
Problem
def datafunc(sliced_interval, islower):
a, b, _ = sliced_interval
if islower:
return f"new data = remove ['point', {b}) from [{a}, {b})"
return f"new data = remove [{a}, 'point') from [{a}, {b})"
t = IntervalTree([Interval(0, 10), Interval(5, 15)])
t.slice(6, datafunc)
```python [Interval(0, 6, "new data = remove ['point', 10) from [0, 10)"), Interval(5, 6, "new data = remove ['point', 15) from [5, 15)"), Interval(6, 10, "new data = remove [0, 'point') from [0, 10)"), Interval(6, 15, "new data = remove [5, 'point') from [5, 15)")] ```
Solution inspired by @chaimleib
def slice_the_tree(t, point):
def datafunc(sliced_interval, islower):
a, b, _ = sliced_interval
if islower:
return f"new data = remove [{point}, {b}) from [{a}, {b})"
return f"new data = remove [{a}, {point}) from [{a}, {b})"
t.slice(point, datafunc)
t = IntervalTree([Interval(0, 10), Interval(5, 15)])
slice_the_tree(t, 6)
```python [Interval(0, 6, 'new data = remove [6, 10) from [0, 10)'), Interval(5, 6, 'new data = remove [6, 15) from [5, 15)'), Interval(6, 10, 'new data = remove [0, 6) from [0, 10)'), Interval(6, 15, 'new data = remove [5, 6) from [5, 15)')] ```
Solution inspired by @jeffwu78
def extended_datafunc(sliced_interval, islower, point):
a, b, _ = sliced_interval
if islower:
return f"new data = remove [{point}, {b}) from [{a}, {b})"
return f"new data = remove [{a}, {point}) from [{a}, {b})"
t = IntervalTree([Interval(0, 10), Interval(5, 15)])
t.slice(6, lambda sliced_interval, islower: extended_datafunc(sliced_interval, islower, 6))
```python [Interval(0, 6, 'new data = remove [6, 10) from [0, 10)'), Interval(5, 6, 'new data = remove [6, 15) from [5, 15)'), Interval(6, 10, 'new data = remove [0, 6) from [0, 10)'), Interval(6, 15, 'new data = remove [5, 6) from [5, 15)')] ```
More legible adaptation
This solution uses the extended_datafunc()
from the previous solution but uses partial initialization instead of the lambda:
from functools import partial
t = IntervalTree([Interval(0, 10), Interval(5, 15)])
t.slice(6, partial(extended_datafunc, point=6))
The chop datafunc has access to the old interval and whether it is a lower or upper chop, but it does not have access to the requested chop interval. This seems like a major oversight. Am I missing something?
To clarify, in the example given the documentation
datafunc is given access to the old interval (0, 10), but not the actual chop interval (3, 7).