Ralith / hecs

A handy ECS
Apache License 2.0
967 stars 82 forks source link

Introduce QueryBorrow::iter_combinations #368

Closed Moulberry closed 6 months ago

Moulberry commented 6 months ago

Hey there, very nice project. As far as I'm aware, there currently isn't a way to safely and efficiently iterate over combinations of entities. This PR introduces QueryBorrow::iter_combinations, allowing for just that.

The operation is useful in situations where you need entities to interact with each other eg. collision between entities.

This implementation is based on the description of the same operation in bevy_ecs: https://github.com/bevyengine/bevy/blob/31d91466b4fb99786a8c61b82417372a42a3f531/crates/bevy_ecs/src/query/iter.rs#L546.

The API currently returns [(Entity, Q::Item); K] as I felt it was consistent with the other query iterator. This can easily be changed to ([Entity; K], [Q::Item; K]) for consistency with the query_many family of functions

Please let me know if you have any questions or changes.

I have signed the Google CLA linked in the CONTRIBUTING.md

Cheers, Moulberry

Ralith commented 6 months ago

Thanks for the PR! This pattern is already possible by combining queries like With<(), Q>, which don't borrow anything and can hence be freely nested, and using the resulting Entity handles to index into a View. Does that satisfy your use case?

I'm also reluctant to promote this sort of pattern because O(n^K) time complexity can get out of hand very quickly. There's usually a way to construct or maintain a much smaller list of possible interactions. Outside of very small games, you should prefer to keep a spatial index on the side for collision detection, for example.

Moulberry commented 6 months ago

Hi, thank you for your response. Hopefully I have understood your suggestion as highlighted below.

For pairs of entities (which I imagine to be the most common case), iterating over all permutations using nested With<(), Q> loops, filtering out the duplicates by ensuring a < b, then using get_many_mut([a, b]) ends up being 1.64x slower than iter_combinations. However dropping down to unsafe and using get_unchecked (as below) results in the manual approach being 1.59x faster.

For triples of entities, iter_combinations is always faster, but this case shouldn't be too common.

As I understand one of the design goals of hecs is to be as simple as possible, and considering there are alternatives with similar performance characteristics, I'd say this PR can be closed if there's nothing else to look into. Thank you for your time.

fn iterate_manual(b: &mut Bencher) {
    let mut world = World::new();
    for i in 0..1000 {
        world.spawn((Position(i as f32),));
    }

    let mut query = PreparedQuery::<&mut Position>::new();
    let mut query = query.query(&world);
    let view = query.view();

    b.iter(|| {
        let mut sum = 0.0;
        for (a, _) in world.query::<With<(), &Position>>().iter() {
            for (b, _) in world.query::<With<(), &Position>>().iter() {
                if a >= b {
                    continue;
                }

                let entity_a = unsafe { view.get_unchecked(a) }.unwrap();
                let entity_b = unsafe { view.get_unchecked(b) }.unwrap();
                sum += entity_a.0 * entity_b.0;
            }
        }
        black_box(sum);
    });
}

fn iterate_combinations(b: &mut Bencher) {
    let mut world = World::new();
    for i in 0..1000 {
        world.spawn((Position(i as f32),));
    }

    b.iter(|| {
        let mut query = world.query::<&mut Position>();
        let mut combinations = query.iter_combinations::<2>();

        let mut sum = 0.0;
        while let Some([(_, pos1), (_, pos2)]) = combinations.next() {
            sum += pos1.0 * pos2.0;
        }
        black_box(sum);
    });
}
Ralith commented 6 months ago

Thanks for the analysis! I didn't realize the code here was skipping redundant tuples. It's interesting that a full cross-product is comparably performant, let alone faster, in that light. Maybe LLVM's cleverly optimizing the continue. We could explore ways to optimize this further, but given that the best optimization will always be to avoid cross products entirely, I'll hold off on that in the absence of a compelling use case.