Ralith / sieve-tree

6 stars 1 forks source link

Some intersections are missed (part 2) #11

Closed grovesNL closed 4 days ago

grovesNL commented 5 days ago
#[test]
fn missed() {
    let mut t = SieveTree::<1, 2, Bounds<1>>::with_granularity(1.);
    let b1 = Bounds {
        min: [5.],
        max: [10.],
    };
    let b2 = Bounds {
        min: [20.],
        max: [25.],
    };
    t.insert(b1, b1);
    t.insert(b2, b2);

    let intersections = t
        .intersections(Bounds {
            min: [15.],
            max: [45.],
        })
        .map(|(_, bounds)| *bounds)
        .collect::<alloc::vec::Vec<_>>();

    assert_eq!(
        intersections,
        alloc::vec![Bounds {
            min: [20.],
            max: [25.],
        }],
    );
}

Intersections for bounds of 15..45 should hit 20..25

grovesNL commented 5 days ago

I noticed that we don't seem to grow the bounds of the root node around the second b2. self.coords remains at world bounds of Bounds { min: [5.0], max: [21.0] } after inserting b1, but I'd expect it to grow so that the maximum is bigger or equal to 25 (using whatever the multiple ends up being).

In ensure_contains we check bounds.min.iter().zip(&current.min).any(|(x, y)| x < y) and grow it to include the minimum, but I don't see a place to handle the maximum (I expected it to happen in either ensure_contains or find_smallest_parent). I guess for this case we'd want to be able to add another level before the existing root.

grovesNL commented 5 days ago

If I swap around the inserts in the test case then it works fine. The failing case places the root at level 4 while the working case places the root at level 5.

Here's the tree for the failing test case (b1 inserted first, then b2):

SieveTree {
    granularity: 1.0,
    root: Some(
        Root {
            embedding: Embedding {
                origin: [
                    5.0,
                ],
            },
            coords: CellCoords {
                min: [
                    0,
                ],
                level: 4,
            },
            node: Node {
                state: Leaf {
                    unsieved_elements: 0,
                },
                elements: {
                    [
                        0,
                    ]: 1,
                },
            },
        },
    ),
    elements: {
        0: Element {
            value: Bounds {
                min: [
                    5.0,
                ],
                max: [
                    10.0,
                ],
            },
            next: None,
        },
        1: Element {
            value: Bounds {
                min: [
                    20.0,
                ],
                max: [
                    25.0,
                ],
            },
            next: Some(
                0,
            ),
        },
    },
}

and the working test case (b2 inserted first, then b2):

SieveTree {
    granularity: 1.0,
    root: Some(
        Root {
            embedding: Embedding {
                origin: [
                    4.0,
                ],
            },
            coords: CellCoords {
                min: [
                    0,
                ],
                level: 5,
            },
            node: Node {
                state: Internal {
                    children: [
                        Node {
                            state: Leaf {
                                unsieved_elements: 0,
                            },
                            elements: {
                                [
                                    0,
                                ]: 1,
                            },
                        },
                        Node {
                            state: Leaf {
                                unsieved_elements: 0,
                            },
                            elements: {
                                [
                                    0,
                                ]: 0,
                            },
                        },
                    ],
                },
                elements: {},
            },
        },
    ),
    elements: {
        0: Element {
            value: Bounds {
                min: [
                    20.0,
                ],
                max: [
                    25.0,
                ],
            },
            next: None,
        },
        1: Element {
            value: Bounds {
                min: [
                    5.0,
                ],
                max: [
                    10.0,
                ],
            },
            next: None,
        },
    },
}
grovesNL commented 5 days ago

Something else I noticed is that ensure_contains is pretty careful to decrease embedding.origin and offset coords.min by the same amount to avoid reindexing when growing the root's minimum bounds (it looks like { origin: 20, min: 0 } then { origin: 4, min: 16 } in the working test case).

In contrast, when we add a new level in find_smallest_parent we keep the origin but don't persist the minimum that we were using ({ origin: 4, min: 16 } at level 4 becomes { origin: 4, min: 0 } at level 5). We do pass the minimum to child_index_at_level, but only using that minimum to find the correct child index at the new root.

Edit: This seems ok after trying to understand this some more. The new minimum of 0 at level 5 came from the bounds being passed to insert, and the old minimum of 16 shouldn't matter as long as the new child index is right.

Ralith commented 4 days ago

Expansion of the tree's area is handled by find_smallest_parent. It's documented as:

/// Look up the smallest existing parent of `target`, uprooting the tree if necessary

"Uprooting" here means "introducing a new root that is the smallest common ancestor of the current root and the target, then reattaching the current root at the proper location under the new root". That's what happens when the if ancestor == root.coords { early exit doesn't get taken. In particular:

let old_root_coords = mem::replace(&mut root.coords, ancestor);

is responsible for updating coords to reflect the larger area.

Ralith commented 4 days ago

Also note that different insertion order is expected to lead to differently structured trees, since the tree is always both as small as possible to contain the current elements, and never moves existing elements on insert, to avoid doing a bunch of redundant work when bulk-inserting. The difference should be very small once you've inserted more than a handful of elements and balanced the tree once.

grovesNL commented 4 days ago

Ok great, that makes sense. ensure_contains is mostly about shifting the minimum bounds as needed, and insertion will handle any expansion necessary.

Ralith commented 4 days ago

Fixed in eedee1808b7cc6123246fa3df10762a0cf730dc9.