Closed Huite closed 1 year ago
I have a feeling the issue is found in the construction.
Maybe something with special case here: https://github.com/Deltares/numba_celltree/blob/main/numba_celltree/creation.py#L309
Or the merging thereafter.
The problem is indeed a small error in the construction.
See the visual example from the paper (here with 5 buckets):
Faulty code:
# plane is the separation line to split on:
# 0 [bucket0] 1 [bucket1] 2 [bucket2] 3 [bucket3]
plane = split_plane(buckets, root, range_Lmax, range_Rmin, bucket_length)
right_index = buckets[plane].index
right_size = root.ptr + root.size - right_index
left_index = root.ptr
left_size = root.size - right_size
nodes[root_index]["Lmax"] = buckets[plane - 1].Lmax
nodes[root_index]["Rmin"] = buckets[plane].Rmin
The error lies in the last two lines. It assumes that the left and right boundaries of the bounding boxes contained in the buckets (Lmax, Rmin) are monotonic with the buckets. This is not guaranteed: the bounding boxes are placed in the buckets based on centroid of the bounding box. When the grid contains both small and large cells (and thus bounding boxes), it's possible that the leftmost bucket contains an Lmax that is to the left of the second bucket.
That is exactly what happens when n_buckets=4
, in this case for node 71:
Bucket(Max=-0.0307499975, Min=-0.04633333, Rmin=-0.04633333, Lmax=-0.01633333, index=53, size=1)
Bucket(Max=-0.015166665, Min=-0.0307499975, Rmin=-0.03733333, Lmax=-0.02319655, index=54, size=1)
Bucket(Max=0.00041666750000000224, Min=-0.015166665, Rmin=-0.02366667, Lmax=0.00733333, index=55, size=2)
Bucket(Max=0.016, Min=0.00041666750000000224, Rmin=-0.0065451, Lmax=0.016, index=57, size=2)
The second Lmax (-0.023) lies left of the first Lmax (-0.016). As the splitting plane is 2, the implementation above picks the value of -0.023 while -0.016 should be used instead.
Visually:
The bounding box of 1 is right of the bounding box of 68. But in terms of centroids, 1 is left of 68.
This is easily resolved by taking the maximum value of Lmax left of the splitting plane, and the minimum value of Rmin right of the splitting plane:
def split_plane(
buckets: List[Bucket],
root: np.void,
range_Lmax: float,
range_Rmin: float,
bucket_length: float,
):
plane_min_cost = FLOAT_MAX
plane = INT_MAX
bbs_in_left = 0
bbs_in_right = 0
# if we split here, lmax is from bucket 0, and rmin is from bucket 1 after
# computing those, we can compute the cost to split here, and if this is the
# minimum, we split here.
for i in range(1, len(buckets)):
current_bucket = buckets[i - 1]
next_bucket = buckets[i]
bbs_in_left += current_bucket.size
bbs_in_right = root.size - bbs_in_left
left_volume = (current_bucket.Lmax - range_Rmin) / bucket_length
right_volume = (range_Lmax - next_bucket.Rmin) / bucket_length
plane_cost = left_volume * bbs_in_left + right_volume * bbs_in_right
if plane_cost < plane_min_cost:
plane_min_cost = plane_cost
plane = i
Lmax = FLOAT_MIN
Rmin = FLOAT_MAX
for i in range(plane):
bLmax = buckets[i].Lmax
if bLmax > Lmax:
Lmax = bLmax
for i in range(plane, len(buckets)):
bRmin = buckets[i].Rmin
if bRmin < Rmin:
Rmin = bRmin
return plane, Lmax, Rmin
See also VTK implementation, which does the same:
The voronoi tesselation of this grid:
Produced with:
Or statically:
This produces an incorrect value of -1 for:
While it is clearly located in the grid.
Triangulating the grid seems to resolve the issue, but it raises the question why it fails. From walking through the tree interactively with the debugger, it takes a wrong left/right decision and ends up in a wrong leaf node. The point_in_polygon algorithms does give the correct result, but it's never reached.