kvark opened this issue 9 years ago
Just a general outline before we start chopping things down:
I can think of two different types of system ordering. One of them is a fixed order given by the user (e.g. `Vec<System>`), where the next system may need to wait if it conflicts with any systems that are already in flight.

@HeroesGrave, does your ECS support simultaneous processing? How do its rules fare against the model I described?
There are other ways, it seems. One that I particularly like is described on reddit:
The obvious benefit is having all systems run in parallel. Each component storage accumulates changes and then applies them (again in parallel with the other storages) after all the systems are done. One global change list also stores all the changes to the entity array itself.

The less obvious benefit is simplification of all the code. There is no need to specify at compile time which components are affected by which systems, and most of the implementation can live outside of macros. This makes it by far my best bet at accomplishing the subject task.
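A minimal sketch of that model as I understand it (all types here are hypothetical, nothing from an existing crate): systems only take `&World` and return the changes they want to make, and those changes are committed once everything has joined.

```rust
use std::thread;

// Hypothetical layout: plain vectors of components plus a change description.
struct World {
    positions: Vec<[f32; 3]>,
    velocities: Vec<[f32; 3]>,
}

enum Change {
    SetPosition(usize, [f32; 3]),
}

// A "system" only reads the world and records the writes it wants to make.
fn physics(world: &World) -> Vec<Change> {
    world
        .positions
        .iter()
        .zip(&world.velocities)
        .enumerate()
        .map(|(i, (p, v))| Change::SetPosition(i, [p[0] + v[0], p[1] + v[1], p[2] + v[2]]))
        .collect()
}

fn main() {
    let mut world = World {
        positions: vec![[0.0; 3]; 4],
        velocities: vec![[1.0, 0.0, 0.0]; 4],
    };

    // Phase 1: every system runs in parallel against `&World`.
    let changes: Vec<Vec<Change>> = thread::scope(|s| {
        let handles = vec![s.spawn(|| physics(&world))]; // more systems spawn here
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });

    // Phase 2: the accumulated changes are applied with exclusive access.
    for change in changes.into_iter().flatten() {
        match change {
            Change::SetPosition(i, p) => world.positions[i] = p,
        }
    }
}
```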
I found a similar issue for ecs-rs: https://github.com/HeroesGrave/ecs-rs/issues/10
@mtsr - do you have an example where a system would like to run on more than a single thread?
I'm slowly drifting towards the realization that the parallel story of ECS is just a sub-problem of generic work queues. In the simple case, a system is just a work unit, and its dependencies may be other systems. The ECS should just support change lists natively, so that processing them in parallel is possible.
A system is a work unit, but it might also have its own child work units that it depends on. That's mostly it, I think. That way a system can also split its work across threads; there are definitely use cases for that.
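For example (just a sketch with made-up names, using scoped threads), a physics system could fan its own update out over chunks of a component slice, and a scheduler could treat each chunk as a child work unit:

```rust
use std::thread;

// Hypothetical: integrate one component slice in parallel, chunk by chunk.
fn integrate(positions: &mut [f32], velocities: &[f32], dt: f32) {
    let workers = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    let chunk = (positions.len() / workers).max(1);
    thread::scope(|s| {
        for (ps, vs) in positions.chunks_mut(chunk).zip(velocities.chunks(chunk)) {
            s.spawn(move || {
                for (p, v) in ps.iter_mut().zip(vs) {
                    *p += v * dt; // each child unit owns a disjoint slice
                }
            });
        }
    });
}
```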
There are some other neat advantages to the changeset model. A good example is that changesets can be forked and fed into multiple components of the system. Collision detection and rendering both need to know when an object's position has changed, but each needs to represent it in a different way.
Rendering is actually a good example of where there is quite a bit to be gained from the changeset model. Updating a texture or a model from the changeset allows the renderer to touch only the data that was modified each frame. The alternative is the renderer running through the whole list of meshes and checking dirty flags; a changeset is, in effect, a list of dirty entities.
@csherratt your vision of changelists is quite an eye-opener. It does seem beneficial to have collision and rendering consume their own changelists. So is this how this may work?
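Something like this, perhaps (a toy sketch over plain channels; `PosChange` and the consumer bodies are made up):

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical change record; the consumers below are stand-ins.
#[derive(Clone, Copy)]
struct PosChange {
    entity: u32,
    pos: [f32; 3],
}

fn main() {
    let (to_collision, collision_rx) = mpsc::channel::<PosChange>();
    let (to_render, render_rx) = mpsc::channel::<PosChange>();

    // The producer forks its changelist: the same record goes to both consumers.
    let physics = thread::spawn(move || {
        for e in 0..4u32 {
            let change = PosChange { entity: e, pos: [e as f32, 0.0, 0.0] };
            to_collision.send(change).unwrap();
            to_render.send(change).unwrap();
        }
    });

    // Each consumer rebuilds the change into its own representation.
    let collision = thread::spawn(move || {
        for c in collision_rx {
            let _ = c; // e.g. move the entity's proxy in a broad-phase grid
        }
    });
    let render = thread::spawn(move || {
        for c in render_rx {
            let _ = c; // e.g. patch the entity's slot in an instance buffer
        }
    });

    physics.join().unwrap();
    collision.join().unwrap();
    render.join().unwrap();
}
```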
On the other hand, as @mtsr has mentioned on IRC, the changelist model seems to introduce latency. If we have a dependency path of N nodes (input -> physics -> rendering), then the output is only completed after N updates (with that three-stage chain, an input only reaches the screen three updates later), which is more than I can tolerate.
@kvark There are three ways I can see this architected. The first is that the pipeline is fed once per frame, and changes propagate one stage per frame update. This has the nice advantage of letting the pipeline run in parallel, but it sucks because every stage introduces a full frame of latency. This can be mitigated by increasing the framerate from 60 to 200 or so, since each stage is now running for 5ms instead of 16ms. The renderer could also throw out stale frames if it wanted to. I'm not a big fan of this architecture; it's what I first thought of in snowmew, and I have come to see the trade-offs as unacceptable.
The second idea is what I proposed in the snowstorm gist: basically, a pipeline where changes are propagated as they are created. Systems run when all their writers are done writing, and since a system is itself a writer, this propagates like a wave through the systems. This gets around the latency problem, since the updates all happen sequentially within one frame. The big issue with this type of system is that the worst case, where each system chains off the next, is going to be slower than if it were written as a single-threaded program. The other big trade-off is that this is expensive from a memory point of view: each system has its own copy of everything, plus everything exists in a channel somewhere.
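A toy model of that propagation (not the actual snowstorm code; plain channels stand in for its buffers): each system consumes its upstream channel and writes downstream as soon as it produces data, so an update ripples through within a single frame.

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    let (input_tx, input_rx) = mpsc::channel::<f32>();
    let (physics_tx, physics_rx) = mpsc::channel::<f32>();

    let physics = thread::spawn(move || {
        // Starts consuming as soon as the input system starts writing.
        for sample in input_rx {
            physics_tx.send(sample * 2.0).unwrap();
        }
        // Dropping physics_tx closes the downstream channel: the "writer is
        // done" signal that lets the next system finish its frame.
    });

    let render = thread::spawn(move || {
        for state in physics_rx {
            println!("render {state}");
        }
    });

    for i in 0..3 {
        input_tx.send(i as f32).unwrap();
    }
    drop(input_tx); // end of frame for the input system

    physics.join().unwrap();
    render.join().unwrap();
}
```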
Lastly, the one based on the reddit link you provided: the ECS keeps all state in a central location, and all updates to that blob are committed after all the workers complete. The amendment I had is that updates don't need to have a 1:1 mapping with the component they are updating. The position update is a good example of this, since it is relevant to a few systems. What I propose is that a changeset could have a 1:N mapping, where multiple data structures are updated inside the central blob. Changing a position would then not only update the position table, but also update a BVH used for render culling. I think this architecture could still be pipelined, so that each system can fan out its updates to the central ECS. It's not a free lunch, of course: changeset creation and consumption take time.
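A rough sketch of that 1:N commit (hypothetical types; `Bvh` is a stand-in for whatever culling structure the renderer keeps):

```rust
// Hypothetical central blob; committing one change fans out to every
// structure that mirrors the data.
struct Bvh;
impl Bvh {
    fn refit(&mut self, entity: u32, pos: [f32; 3]) {
        // move the entity's leaf and refit parent bounds (omitted)
        let _ = (entity, pos);
    }
}

struct World {
    positions: Vec<[f32; 3]>,
    culling_bvh: Bvh,
}

enum Change {
    Position { entity: u32, pos: [f32; 3] },
}

impl World {
    fn commit(&mut self, change: Change) {
        match change {
            Change::Position { entity, pos } => {
                self.positions[entity as usize] = pos; // 1: the component table
                self.culling_bvh.refit(entity, pos);   // 2: the render-culling BVH
            }
        }
    }
}
```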
I really like the snowstorm approach. Lots of parallelism, change sets reduce what needs to be processed (like the creation and change events in EntityX enable, for example), and they allow for system-specific data structures. Together with a good scheduler that knows about frame deadlines (or at least prioritises work units belonging to the next frame), it could be as open as anything.
Dependency chains aren't a problem, since we pass the data through the whole chain every frame, although there might still be valid use cases that need an upstream system to process first (which incurs a frame delay).
The key issue is interesting, though, but I think there might be solutions. Since every entity processed by a system either comes from upstream or is created there, we just need uniqueness per system, tagging each entity with the system that created it. We still don't get nicely contiguous entities, or at best they are contiguous per creating system.
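Concretely, the key could just be a pair (a sketch, names made up): each system owns its own counter, so `(creator, local)` can never collide across systems without any global coordination.

```rust
// An entity key that is unique per creating system.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct Entity {
    creator: u16, // id of the system that created the entity
    local: u32,   // index in that system's own sequence
}

struct SystemAllocator {
    creator: u16,
    next: u32,
}

impl SystemAllocator {
    fn create(&mut self) -> Entity {
        let e = Entity { creator: self.creator, local: self.next };
        self.next += 1;
        e
    }
}
```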
We might still look at something like Tharsis' design, which rewrites a packed array of everything every frame. Whether it should be grouped by system or by component or something else, I don't know.
@mtsr mentioned parallelism in #2. Current systems do not allow that.
I have no idea how this is supposed to work, and I need to do more research to see clearer. Perhaps someone could enlighten me? I do understand that component arrays can be moved between threads if needed, but I'm not sure how to resolve access conflicts between systems in this case. Also, the COW (copy-on-write) idiom seems to be related to the problem.

When there is a big challenge, there is also a great opportunity. I hope we can come up with a good architecture for parallel systems: they are crucial for ECS scalability and for proving Rust worthy in game-dev.