Open dpsanders opened 4 years ago
There should be a unify=true
keyword argument (true by default)
Hi! I decided to take a chance into this and found there is something weird with the implementation described above for 2 dimensions and above. The weird case is easily described with the next example
┌───────────┐
│ │
│ │
│ B₁ │
┌──────┼───┐ │
│ │ │ │
│ └───┼───────┘
│ │
│ B₂ │ ┌────┐
│ │ │ B₃ │
│ │ └────┘
└──────────┘
with the above algorithm the boxes are reduced to this
┌──────────────────┐
│ │
│ │
│ │
│ │
│ │
│ B₁∪B₂ │
│ │
│ ┌────┐ │
│ │ B₃ │ │
│ └────┘ │
└──────────────────┘
But I think thay should be reduced to just B₁∪B₂∪B₃
.
In #64 I add the unify keyword with a small but fast implementation of union-find (weighted and with path compression, reference on the algorithm here).
The implementation is quite fast, two examples are the next ones (first from the tutorial)
julia> using IntervalArithmetic, IntervalOptimisation, BenchmarkTools
julia> f(x) = x^2 - 2x + 1
f (generic function with 1 method)
julia> @time minval, minimisers = minimise(f, 0..2; tol=1e-6, unify=false);
3.626429 seconds (15.91 M allocations: 1.621 GiB, 8.09% gc time)
julia> @show length(minimisers)
length(minimisers) = 2990
2990
julia> @btime unified_minimisers = IntervalOptimisation.unify_boxes(minimisers)
29.142 ms (75 allocations: 117.45 KiB)
14-element Vector{Interval{Float64}}:
[0.998687, 1.00136]
[0.998642, 0.998687]
[0.998635, 0.998642]
[0.998623, 0.998627]
[1.00136, 1.00137]
[1.00135, 1.00137]
[1.00138, 1.00139]
[1.00137, 1.00138]
[1.00135, 1.00136]
[0.998632, 0.998634]
[0.998621, 0.998623]
[1.00137, 1.00138]
[1.00137, 1.00138]
[0.998617, 0.998619]
the other mine
julia> g(x) = abs(x[1]-x[2]) + (x[1]+x[2])^2
g (generic function with 1 method)
julia> @time minval, minimisers = minimise(g, (-1+1e-10..1)×(-1..1); tol=1e-10, unify=false); # initialize with these give a special bad behaviour
1.815546 seconds (8.21 M allocations: 747.165 MiB, 18.85% gc time)
julia> @show length(minimisers)
length(minimisers) = 27670
27670
julia> @btime unified_minimisers = IntervalOptimisation.unify_boxes(minimisers)
2.666 s (28 allocations: 946.11 KiB)
1-element Vector{IntervalBox{2, Float64}}:
[-3.80622e-07, 3.80622e-07] × [-3.80672e-07, 3.80625e-07]
One thing is that if the implementation is changed so that the example showed at the begginig returns B₁∪B₂∪B₃
, unify_boxes
can be written to be faster (now is cuadratic on the number of minimisers
, but it can be O(n log(n))
) and avoid the union-find algorithm altogether. (edit: typos and wording)
I'm open to thoughts and ideas. Hope is useful.
P.D. Kudos to asciiflow for the ascii drawing. :stuck_out_tongue:
I don't understand why the algorithm doesn't reduce your example to B₁ ∪ B₂ ∪ B₃
.
Is because the B₁∩B₃
and B₂∩B₃
is empty, even if (B₁∪B₂)∩B₃
is not, and the algorithm is just testing pairwise the intersections.
Nice catch! Does your version give the correct result?
A solution seems to be to modify the list of intervals as you go by taking the union (hull) when two intervals are combined, and checking the intersection with this modified list instead of the original list.
Hmmm. But actually in your original example we would prefer not to unify (B1 ∪ B2) with B3, since they are actually disjoint. Maybe taking the hull is actually not a good idea.
We can just return separate lists for each connected component and the user can take the hull if they prefer; or we can give a convenience function to do so.
Maybe we can add three options for unify, :hull
, :connected
and :none
?
I can add the algorithm for the hull too. I'll add it to the PR and make the changes on the documentation and add tests.
@Suavesito-Olimpiada: Sorry for the delay in replying. I think it would be great to have these different options. Is there any progress on this?
Note that there is also an implementation of disjoint sets in DataStructures.jl. I'm not sure how it compares to yours.
I can check the differences between my inplementation and theirs, also see performan differences and reply back later.
On the hull algorithm, I still don't have progress. But once I finish my semester I'll do it. :)
I think there should be a way to use interval trees for this. There is an implementation in Julia.
I'll check out both R-trees and Interval Trees, so I can understand better both structures. But as I see (just over-looking both articles) I think a variant of R-trees is closer to the hull idea (that is, so we can get the maximum number of non-intersecting hulls of intersecting intervals).
I think the idea of R-trees is to make it more efficient to check which other boxes a given box intersects.
From the R-Tree article
The key idea of the data structure is to group nearby objects and represent them with their minimum bounding rectangle in the next higher level of the tree
and from the Interval Tree article
...it allows one to efficiently find all intervals that overlap with any given interval or point
And then
The key idea is to use the bounding boxes to decide whether or not to search inside a subtree. In this way, most of the nodes in the tree are never read during a search
The bounding boxes are used to narrow down the set of existing boxes that a new box must be checked against, as far as I understand.
See Samet, Foundations of Multidimensional and Metric Data Structures, chap. 3
Ok, ok I think both structures are used to optimise the search of intervals that overlaps other intervals, but I need to check the differences. Thanks for the information and all the references, I'll check them out. :smiley:
There's an implementation using LightGraphs in https://github.com/JuliaIntervals/IntervalOptimisation.jl/pull/5
It doesn't seem so easy to write an implementation by hand. It's basically a union-find algorithm