Unsoundness in dynamic queries: sparse filters are ignored if the typed query is dense

SkiFire13 commented 1 month ago

Bevy version

Both 0.14.0 and af865e76a323fc87dd16157e7f4d7593d02c2b95

What you did

#[derive(Component)]
struct Dense;

#[derive(Component)]
#[component(storage = "SparseSet")]
struct Sparse;

let mut world = World::new();

world.spawn(Dense);
world.spawn((Dense, Sparse));

// How this is not working as intended:
let mut query = QueryBuilder::<&Dense>::new(&mut world)
    .without::<Sparse>()
    .build();
// Prints 2 but only 1 entity should be matched.
dbg!(query.iter(&world).count());

// How this is unsound:
fn sys(q1: Query<&mut Dense, With<Sparse>>, q2: Query<(Entity, &mut Dense)>) {
    for (entity, _) in &q2 {
        assert!(!q1.contains(entity));
    }
}
SystemBuilder::<(Query<&mut Dense, With<Sparse>>,)>::new(&mut world)
    .builder::<Query<(Entity, &mut Dense)>>(|builder| {
        builder.without::<Sparse>();
    })
    .build(sys)
    .run((), &mut world);

What went wrong

In the simple demonstration the query contains 2 entities, but only 1 should match all the filters. The reason both are yielded is because the static part of the query is dense and so the dense iteration optimization kick in and iterates over the table matching it. But doing so completly ignores the filter on the sparse component.

In the soundness issue this is exploited in a dynamic system where the sparse filter is used to make the two queries disjoint, which is however broken by the bug just showed.

alice-i-cecile commented 1 month ago

As mentioned on Discord, this is effectively https://github.com/bevyengine/bevy/pull/13120 again.

SkiFire13 commented 1 month ago

As mentioned on Discord, this is effectively #13120 again.

The difference is that for #13120 it was just one of the blockers for merging, while this issue is present on an already released version.

cBournhonesque commented 1 month ago

So the problem is that queries use static analysis on the type of the query to check if all components are dense, in which case it iterates through the table instead of the archetype?

Would the solution be to check if the query is dense by looking at the Access dynamically instead of looking at the types of D and F?

SkiFire13 commented 1 month ago

So the problem is that queries use static analysis on the type of the query to check if all components are dense, in which case it iterates through the table instead of the archetype?

Yes. Basically the (wrong) assumption is that if all the static types and filters are dense then the query will actually access all and only those entities in the corresponding tables (and thus they can directly iterate over them, gaining better cache locality and even potential for autovectorization), but that assumption is violated by the query not having access to some of the archetypes in those tables when built dynamically (or when transmute/transmute_lens/etc etc are used, see #14528).

Would the solution be to check if the query is dense by looking at the Access dynamically instead of looking at the types of D and F?

It would solve the soundness issue, but it would also require introducing a branch in every call to the very hot QueryIter::next which would likely be devastating for performance.

cBournhonesque commented 1 month ago

It would solve the soundness issue, but it would also require introducing a branch in every call to the very hot QueryIter::next which would likely be devastating for performance.

Could something like:

at QueryIter creation time (QueryIter::new()) we create either a QueryArchetypeIterationCursor or QueryTableIterationCursor depending on the access, which is checked dynamically
then the hot path of Cursor::next() wouldn't have branching

work?

SkiFire13 commented 1 month ago

No, Cursor::next() would still have a branch due to checking which cursor it actually is. Ultimately the two types of iteration do different things, so if you want to make the selection dynamic you will have a branch in next.

cBournhonesque commented 1 month ago

No, Cursor::next() would still have a branch due to checking which cursor it actually is. Ultimately the two types of iteration do different things, so if you want to make the selection dynamic you will have a branch in next.

I wouldn't have any dynamic checks in the cursor because I have 2 different types of Cursors. DenseCursor vs ArchetypeCursor. Maybe I can try it.

EDIT: ah i see, you would have to check which cursor to advance..

SkiFire13 commented 1 month ago

A solution which would keep the dense iteration optimization may be to add a new marker parameter to Query and QueryState which tells it whether it can trust the static filters or not. By default it would be to trust the static filters, and thus the dense iteration optimization would be possible, but any operation that may introduce dynamic ones will have to return a Query/QueryState with the other marker where it is not.

The downside of this approach is that now we have two incompatible kind of Query, though we could add a zero-cost way to convert a static one to a dynamic one to ease the reduction in usability.

cBournhonesque commented 1 month ago

And doing something like

enum Cursor {
   Dense(DenseCursor),
   Archetype(ArchetypeCursor)
}

would not work because the check of which type of cursor we are is still done dynamically?

I was hoping that after the Cursor creation, the other functions would just be called statically. We only need to check the is_dense at cursor creation, not after

SkiFire13 commented 1 month ago

would not work because the check of which type of cursor we are is still done dynamically?

Yes

I was hoping that after the Cursor creation, the other functions would just be called statically. We only need to check the is_dense at cursor creation, not after

In the end you want to do different things in next depending on whether the query is actually dense or not, so you need a branch there, there's no way around that. Moving the dense check at cursor creation just changes what the branch in next will be, but ultimately you need a branch or to let go of the dense iteration optimization.

bevyengine / bevy