Open rj00a opened 9 months ago
Can you reiterate your thoughts on batching for non-insert/remove events? I had to add an event called `RecvDataBulk` so that I could iterate in parallel. I am thinking event batching and being able to iterate over all events might fix this issue to some extent. Alternatively, do you have any other opinions?
@andrewgazelka If you're asking for the ability to execute a set of handlers in parallel given a list of events, then that's not really possible for a few reasons:

- Handlers aren't `Sync`.
- Handlers need `&mut self` to run.
- Invoking a handler has some dynamic dispatch overhead + params like `Single` will need to do an entity lookup on every invocation.

So I think the best approach is to store your events in a collection and process them in parallel as you are currently doing.
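A minimal sketch of that suggestion, assuming the batched data is processed in parallel with `rayon` inside a single serial handler (`RecvDataBulk`, `ParsedPacket`, and `process_packet` are placeholders here, not part of `evenio`'s API):

```rust
use rayon::prelude::*;

// Placeholder: one event carrying a whole batch of received packets,
// instead of one event per packet.
struct RecvDataBulk {
    packets: Vec<Vec<u8>>,
}

struct ParsedPacket;

// Placeholder per-packet work that needs no mutable world access.
fn process_packet(bytes: &[u8]) -> Option<ParsedPacket> {
    if bytes.is_empty() { None } else { Some(ParsedPacket) }
}

// Inside one ordinary (serial) handler: fan out over the batch with rayon,
// collect the results, then apply them to the world serially afterwards.
fn handle_bulk(event: &RecvDataBulk) {
    let parsed: Vec<ParsedPacket> = event
        .packets
        .par_iter()
        .filter_map(|bytes| process_packet(bytes))
        .collect();

    for _packet in parsed {
        // Serial section: mutate world state / send follow-up events here.
    }
}
```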
Hmm, this is really disappointing to hear. I feel like this will make consuming the ECS I am laying out in Hyperion a lot more complicated.

> Invoking a handler has some dynamic dispatch overhead + params like `Single` will need to do an entity lookup on every invocation.

Makes sense.
Another more glaring issue is that you wouldn't be allowed to access any data mutably through those handlers because a handler could be running in parallel with itself. The accessed world data would need atomics and/or locks.
My thought was rather that you could have some type of `ParSender` that uses thread-local storage (or similar), and then right when the system is about to return it synchronously sends all the events. Perhaps this doesn't make sense to do though? I might just use `map` and `collect` in `rayon`.
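For what it's worth, the `map`/`collect` variant could look something like this sketch (all names are illustrative; only the rayon calls are real):

```rust
use rayon::prelude::*;

// Illustrative event type produced by the parallel phase.
struct Moved {
    entity_id: u64,
    dx: f32,
}

struct Input;

// Illustrative pure computation: no world access, so it is safe to run in parallel.
fn compute_moves(_item: &Input) -> Vec<Moved> {
    Vec::new()
}

fn run_parallel_then_send(items: &[Input]) {
    // Parallel phase: each input may yield zero or more events.
    let events: Vec<Moved> = items
        .par_iter()
        .flat_map_iter(|item| compute_moves(item))
        .collect();

    // Serial phase: right before the system returns, send everything
    // through the normal single-threaded event-sending API.
    for _ev in events {
        // sender.send(_ev);
    }
}
```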
Maybe that could work to some extent but then what happens after? Would serial handlers be able to handle events originating from parallel handlers? This also introduces nondeterminism into the control flow. You would also need some notion of sync points to do structural world changes since that can't be done in parallel.
Hmmm, I am not sure exactly how it would work. I think it is important to think about how events should work, though, because I am at the point where I am considering making every event a bundled event. Perhaps this is the best strategy for now.
Wait, would it be possible to have a `ReceiverMany` and `ReceiverManyMut` or something similar, which would kinda be like `Fetcher` but for events (allow multiple)? Would this speed things up? I might be misunderstanding what you said before.
A multi receiver might be a nice convenience, but it wouldn't help performance. This issue is only concerned with eliminating the `O(N^2)` behavior of repeated `Insert` and `Remove`, so that should probably be discussed elsewhere.
Whenever a component is added or removed from an entity, all of its components must move to a different archetype. This quickly becomes a performance issue when large numbers of components are added/removed in sequence, such as during entity initialization. To create an entity with N components, we must do N * (N - 1) / 2 needless component moves to add everything. Yikes!
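(To spell out the count: adding the k-th component forces the k - 1 components already on the entity to move to the new archetype, and summing k - 1 for k = 1..N gives N * (N - 1) / 2 moves total.)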
`bevy_ecs` and other libraries address this problem with bundles. Bundles let us insert/remove sets of components at a time, removing all intermediate steps. However, I dislike bundles for a few reasons, one being how they would interact with `evenio`'s events.
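For reference, this is roughly what the bundle approach looks like in `bevy_ecs` (a sketch; exact derives and method names vary between versions):

```rust
use bevy_ecs::prelude::*;

#[derive(Component)]
struct Position { x: f32, y: f32 }

#[derive(Component)]
struct Velocity { x: f32, y: f32 }

// A bundle groups components so they can be inserted together.
#[derive(Bundle)]
struct MovementBundle {
    position: Position,
    velocity: Velocity,
}

fn main() {
    let mut world = World::new();

    // One spawn with a bundle: the entity lands in its final archetype
    // directly, instead of moving once per component.
    world.spawn(MovementBundle {
        position: Position { x: 0.0, y: 0.0 },
        velocity: Velocity { x: 1.0, y: 0.0 },
    });
}
```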
Rather than adding ad-hoc features to the ECS, what if we optimize the features we already have? This is where batching comes in. Whenever an `Insert` or `Remove` event finishes broadcasting, we add it to a buffer instead of applying it immediately. Once a handler that could potentially observe the changes is about to run, we flush the buffer. This lets us turn `O(N^2)` entity initialization into a roughly `O(N)` operation.
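A rough sketch of the buffering idea described above (types and names are placeholders for illustration, not evenio's actual internals):

```rust
// Placeholder stand-ins for the real internals.
type ComponentIdx = u32;
type EntityId = u64;
struct World;

enum PendingOp {
    Insert(ComponentIdx /* plus the component value in the real thing */),
    Remove(ComponentIdx),
}

struct InsertRemoveBuffer {
    entity: EntityId,
    ops: Vec<PendingOp>,
}

impl InsertRemoveBuffer {
    // Called when an Insert/Remove event finishes broadcasting:
    // record the change instead of moving the entity right away.
    fn push(&mut self, op: PendingOp) {
        self.ops.push(op);
    }

    // Called just before a handler that could observe the changes runs:
    // apply all buffered changes with a single archetype move.
    fn flush(&mut self, _world: &mut World) {
        // Sort by component ID, deduplicate (keeping the latest op per
        // component), then move the entity to its destination archetype.
        self.ops.clear();
    }
}
```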
To implement this, every `SystemList` contains the union of all components accessed by the systems in the list as a `BitSet<ComponentIdx>`. We also have another `BitSet<ComponentIdx>` associated with the buffered events to track which components have been changed. Before we run through a `SystemList`, we check if the system list bitset overlaps with the buffer bitset. If it does, then the buffer needs to be flushed. Flushing the buffer involves sorting by component ID, a deduplication pass, and finally moving the entity to the destination archetype.
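And the flush trigger, sketched with a standard-library set standing in for `BitSet<ComponentIdx>` (again, illustrative names only):

```rust
use std::collections::BTreeSet;

// Stand-in for BitSet<ComponentIdx>.
type ComponentSet = BTreeSet<u32>;

struct SystemList {
    // Union of all components accessed by the handlers in this list.
    accessed: ComponentSet,
    // ... the handlers themselves ...
}

struct Batcher {
    // Components touched by the buffered Insert/Remove events.
    dirty: ComponentSet,
    // ... buffered ops ...
}

impl Batcher {
    // Called before running a SystemList: flush only if some handler in
    // the list could observe one of the buffered changes.
    fn maybe_flush(&mut self, list: &SystemList) {
        if !self.dirty.is_disjoint(&list.accessed) {
            self.flush();
        }
    }

    fn flush(&mut self) {
        // Sort buffered ops by component ID, deduplicate, then perform the
        // single move to the destination archetype.
        self.dirty.clear();
    }
}
```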
If an entity has a component added or removed, the `SystemList` associated with it may change. The batching process will have to traverse the archetype graph, tracking where the entity would be if there was no batching involved. For this reason, it only seems feasible to batch one entity at a time.