JuliaMath / IntervalSets.jl

Interval Sets for Julia
https://juliamath.github.io/IntervalSets.jl/
Other
99 stars 26 forks source link

Support union of multiple intervals #156

Open putianyi889 opened 1 year ago

putianyi889 commented 1 year ago

This also provides a way to simplify DomainSets.UnionDomain of intervals.

codecov[bot] commented 1 year ago

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (13c76e1) 99.11% compared to head (a099521) 98.62%. Report is 13 commits behind head on master.

Files Patch % Lines
src/interval.jl 90.00% 1 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #156 +/- ## ========================================== - Coverage 99.11% 98.62% -0.49% ========================================== Files 6 7 +1 Lines 225 291 +66 ========================================== + Hits 223 287 +64 - Misses 2 4 +2 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

dlfivefifty commented 1 year ago

Is the restriction to TypedEndpointsInterval needed anymore?

putianyi889 commented 1 year ago

The function calls _union. Is that supposed to be a generic method?

putianyi889 commented 7 months ago

The implementation is now to use an iterator-based method to merge sorted intervals. The bottleneck is sorting.

SortingNetworks.jl is fast but doesn't support sorting different types, so it's abandoned.

TupleTools.jl is faster when there are <=6 elements. Then the only available method is Base.sort on vectors. I choose static vector after some benchmarking.

For the union of 2 intervals, the old method is still 25% faster, so it's reserved.

putianyi889 commented 6 months ago

Important changes:

There is one uncovered line.

union(d::TypedEndpointsInterval) = d # 1 interval

It's to avoid calling

union(I::TypedEndpointsInterval...) = iterunion(sort!(collect(I); lt = leftof)) # ≥21 intervals

in that case, so I don't think it's worth testing.

aplavin commented 6 months ago

Splatting is reasonably used only for small numbers of arguments in Julia. Then, the most straightforward dependency-free implementation is even more performant than this PR:

julia> ints = Tuple(i..(i+1) for i in 1:10)
(1 .. 2, 2 .. 3, 3 .. 4, 4 .. 5, 5 .. 6, 6 .. 7, 7 .. 8, 8 .. 9, 9 .. 10, 10 .. 11)

# this PR:
julia> @btime union($ints...)
  255.479 ns (23 allocations: 1.52 KiB)
1 .. 11

# naive implementation:
julia> myunion(ints...) = reduce(union, ints)

julia> @btime myunion($ints...)
  18.556 ns (0 allocations: 0 bytes)
1 .. 11

UPD: ah, I see, it does require sorting... I heard Julia is adding tuple sorting soon though?

aplavin commented 6 months ago

Yeah, don't know the current state of the Julia PR https://github.com/JuliaLang/julia/pull/46104, but copying tuple sorting from there

```julia function tsort(x::NTuple{N}; lt::Function=isless, by::Function=identity, rev::Union{Bool,Nothing}=nothing, order::Base.Ordering=Base.Forward) where N o = Base.ord(lt,by,rev,order) issorted(x, o) ? x : _sort(x, o) end _sort(x::Union{NTuple{0}, NTuple{1}}, o::Base.Ordering) = x function _sort(x::NTuple, o::Base.Ordering) a, b = Base.IteratorsMD.split(x, Val(length(x)>>1)) merge(_sort(a, o), _sort(b, o), o) end merge(x::NTuple, y::NTuple{0}, o::Base.Ordering) = x merge(x::NTuple{0}, y::NTuple, o::Base.Ordering) = y merge(x::NTuple{0}, y::NTuple{0}, o::Base.Ordering) = x # Method ambiguity merge(x::NTuple, y::NTuple, o::Base.Ordering) = (Base.lt(o, y[1], x[1]) ? (y[1], merge(x, Base.tail(y), o)...) : (x[1], merge(Base.tail(x), y, o)...)) ```

I get:

julia> myunion(ints...) = reduce(union, tsort(ints; by=i -> (leftendpoint(i), isleftopen(i))))
julia> @btime myunion($ints...)
  59.723 ns (0 allocations: 0 bytes)
putianyi889 commented 6 months ago

Yeah, don't know the current state of the Julia PR JuliaLang/julia#46104, but copying tuple sorting from there

I get:

julia> myunion(ints...) = reduce(union, tsort(ints; by=i -> (leftendpoint(i), isleftopen(i))))
julia> @btime myunion($ints...)
  59.723 ns (0 allocations: 0 bytes)

If I understand correctly, it only supports type-stable tuple sorting, that is, all items must be of the exact same type.

Edit: https://github.com/JuliaLang/julia/pull/52010

By the way, that implementation is exactly the same as TupleTools.jl.