Open rj00a opened 9 months ago
Can you reiterate your thoughts on batching for non-insert/remove events? I had to add an event called `RecvDataBulk` so that I could iterate in parallel. I am thinking event batching and being able to iterate over all events might fix this issue to some extent. Alternatively, do you have any other opinions?
@andrewgazelka If you're asking for the ability to execute a set of handlers in parallel given a list of events, then that's not really possible for a few reasons:

- Handlers aren't `Sync`.
- Handlers need `&mut self` to run.
- Invoking a handler has some dynamic dispatch overhead + params like `Single` will need to do an entity lookup on every invocation.

So I think the best approach is to store your events in a collection and process them in parallel as you are currently doing.
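A minimal sketch of that suggestion, assuming the batched data is processed in parallel with `rayon` inside a single serial handler (`RecvDataBulk`, `ParsedPacket`, and `process_packet` are placeholders here, not part of `evenio`'s API):

```rust
use rayon::prelude::*;

// Placeholder: one event carrying a whole batch of received packets,
// instead of one event per packet.
struct RecvDataBulk {
    packets: Vec<Vec<u8>>,
}

struct ParsedPacket;

// Placeholder per-packet work that needs no mutable world access.
fn process_packet(bytes: &[u8]) -> Option<ParsedPacket> {
    if bytes.is_empty() { None } else { Some(ParsedPacket) }
}

// Inside one ordinary (serial) handler: fan out over the batch with rayon,
// collect the results, then apply them to the world serially afterwards.
fn handle_bulk(event: &RecvDataBulk) {
    let parsed: Vec<ParsedPacket> = event
        .packets
        .par_iter()
        .filter_map(|bytes| process_packet(bytes))
        .collect();

    for _packet in parsed {
        // Serial section: mutate world state / send follow-up events here.
    }
}
```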
Hmm, this is really disappointing to hear. I feel like this will make consuming the ECS I am laying out in Hyperion a lot more complicated.

> Invoking a handler has some dynamic dispatch overhead + params like `Single` will need to do an entity lookup on every invocation.

Makes sense.
Another more glaring issue is that you wouldn't be allowed to access any data mutably through those handlers because a handler could be running in parallel with itself. The accessed world data would need atomics and/or locks.
My thought was rather that you could have some type of `ParSender` that uses thread-local storage (or similar), and then right when the system is about to return it synchronously sends all the events. Perhaps this doesn't make sense to do though? I might just use `map` and `collect` in `rayon`.
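For what it's worth, the `map`/`collect` variant could look something like this sketch (all names are illustrative; only the rayon calls are real):

```rust
use rayon::prelude::*;

// Illustrative event type produced by the parallel phase.
struct Moved {
    entity_id: u64,
    dx: f32,
}

struct Input;

// Illustrative pure computation: no world access, so it is safe to run in parallel.
fn compute_moves(_item: &Input) -> Vec<Moved> {
    Vec::new()
}

fn run_parallel_then_send(items: &[Input]) {
    // Parallel phase: each input may yield zero or more events.
    let events: Vec<Moved> = items
        .par_iter()
        .flat_map_iter(|item| compute_moves(item))
        .collect();

    // Serial phase: right before the system returns, send everything
    // through the normal single-threaded event-sending API.
    for _ev in events {
        // sender.send(_ev);
    }
}
```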
Maybe that could work to some extent but then what happens after? Would serial handlers be able to handle events originating from parallel handlers? This also introduces nondeterminism into the control flow. You would also need some notion of sync points to do structural world changes since that can't be done in parallel.
Hmmm, I am not sure exactly how it would work. I think it is important to think about how events should work, though, because I am at the point where I am considering making every event a bundled event. Perhaps this is the best strategy for now.
Wait, would it be possible to have a `ReceiverMany` and `ReceiverManyMut` or something similar, which would kinda be like `Fetcher` but for events (allow multiple)? Would this speed things up? I might be misunderstanding what you said before.
A multi receiver might be a nice convenience, but it wouldn't help performance. This issue is only concerned with eliminating the `O(N^2)` behavior of repeated `Insert` and `Remove`, so that should probably be discussed elsewhere.
Whenever a component is added or removed from an entity, all of its components must move to a different archetype. This quickly becomes a performance issue when large numbers of components are added/removed in sequence, such as during entity initialization. To create an entity with N components, we must do N * (N - 1) / 2 needless component moves to add everything. Yikes!
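(To spell out the count: adding the k-th component forces the k - 1 components already on the entity to move to the new archetype, and summing k - 1 for k = 1..N gives N * (N - 1) / 2 moves total.)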
`bevy_ecs` and other libraries address this problem with bundles. Bundles let us insert/remove sets of components at a time, removing all intermediate steps. However, I dislike bundles for a few reasons, one being how they would interact with `evenio`'s events.
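For reference, this is roughly what the bundle approach looks like in `bevy_ecs` (a sketch; exact derives and method names vary between versions):

```rust
use bevy_ecs::prelude::*;

#[derive(Component)]
struct Position { x: f32, y: f32 }

#[derive(Component)]
struct Velocity { x: f32, y: f32 }

// A bundle groups components so they can be inserted together.
#[derive(Bundle)]
struct MovementBundle {
    position: Position,
    velocity: Velocity,
}

fn main() {
    let mut world = World::new();

    // One spawn with a bundle: the entity lands in its final archetype
    // directly, instead of moving once per component.
    world.spawn(MovementBundle {
        position: Position { x: 0.0, y: 0.0 },
        velocity: Velocity { x: 1.0, y: 0.0 },
    });
}
```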
Rather than adding ad-hoc features to the ECS, what if we optimize the features we already have? This is where batching comes in. Whenever an `Insert` or `Remove` event finishes broadcasting, we add it to a buffer instead of applying it immediately. Once a handler that could potentially observe the changes is about to run, we flush the buffer. This lets us turn `O(N^2)` entity initialization into a roughly `O(N)` operation.
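A rough sketch of the buffering idea described above (types and names are placeholders for illustration, not evenio's actual internals):

```rust
// Placeholder stand-ins for the real internals.
type ComponentIdx = u32;
type EntityId = u64;
struct World;

enum PendingOp {
    Insert(ComponentIdx /* plus the component value in the real thing */),
    Remove(ComponentIdx),
}

struct InsertRemoveBuffer {
    entity: EntityId,
    ops: Vec<PendingOp>,
}

impl InsertRemoveBuffer {
    // Called when an Insert/Remove event finishes broadcasting:
    // record the change instead of moving the entity right away.
    fn push(&mut self, op: PendingOp) {
        self.ops.push(op);
    }

    // Called just before a handler that could observe the changes runs:
    // apply all buffered changes with a single archetype move.
    fn flush(&mut self, _world: &mut World) {
        // Sort by component ID, deduplicate (keeping the latest op per
        // component), then move the entity to its destination archetype.
        self.ops.clear();
    }
}
```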
To implement this, every `SystemList` contains the union of all components accessed by the systems in the list as a `BitSet<ComponentIdx>`. We also have another `BitSet<ComponentIdx>` associated with the buffered events to track which components have been changed. Before we run through a `SystemList`, we check if the system list bitset overlaps with the buffer bitset. If it does, then the buffer needs to be flushed. Flushing the buffer involves sorting by component ID, a deduplication pass, and finally moving the entity to the destination archetype.
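And the flush trigger, sketched with a standard-library set standing in for `BitSet<ComponentIdx>` (again, illustrative names only):

```rust
use std::collections::BTreeSet;

// Stand-in for BitSet<ComponentIdx>.
type ComponentSet = BTreeSet<u32>;

struct SystemList {
    // Union of all components accessed by the handlers in this list.
    accessed: ComponentSet,
    // ... the handlers themselves ...
}

struct Batcher {
    // Components touched by the buffered Insert/Remove events.
    dirty: ComponentSet,
    // ... buffered ops ...
}

impl Batcher {
    // Called before running a SystemList: flush only if some handler in
    // the list could observe one of the buffered changes.
    fn maybe_flush(&mut self, list: &SystemList) {
        if !self.dirty.is_disjoint(&list.accessed) {
            self.flush();
        }
    }

    fn flush(&mut self) {
        // Sort buffered ops by component ID, deduplicate, then perform the
        // single move to the destination archetype.
        self.dirty.clear();
    }
}
```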
If an entity has a component added or removed, the `SystemList` associated with it may change. The batching process will have to traverse the archetype graph, tracking where the entity would be if there was no batching involved. For this reason, it only seems feasible to batch one entity at a time.