genaray opened 9 months ago
Just wanted to chime in: I've been toying with a scheduler for automatically building parallel systems with dependency management like this. It hasn't gotten anywhere yet beyond building the dependency graph, so I can't provide a lot of input. But there's some great existing reference material from Bevy and its stageless design.
https://hackmd.io/@alice-i-cecile/SJvmN1rAi https://github.com/bevyengine/rfcs/blob/main/rfcs/45-stageless.md
The basic premise is that you have SystemSets, which can be ordered and configured the same as Systems; any System can be put in a SystemSet, and it will inherit all the configuration of the SystemSets it belongs to.
This lets you easily group up systems and define rough dependencies without having to know the exact system ordering. For example, instead of having to say the MyRenderTextures system has to happen after MyMoveSprites, you can have a LogicSet and a RenderSet, put the systems in there, and say RenderSet happens after LogicSet.
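For illustration, a rough sketch of what set-based ordering could look like in C# (every name here is made up for the example; this is neither Bevy's nor Arch's actual API):

```csharp
// Hypothetical set-based ordering API (illustrative only).
var schedule = new Schedule();

// One coarse constraint between the sets...
var logicSet  = schedule.CreateSet("Logic");
var renderSet = schedule.CreateSet("Render").After(logicSet);

// ...and systems inherit their set's ordering, so no per-system
// dependencies between MyMoveSprites and MyRenderTextures are needed.
schedule.Add(new MyMoveSprites(),    logicSet);
schedule.Add(new MyRenderTextures(), renderSet);
```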
Now that ZeroAllocJobScheduler is getting to a pretty good place, I have some thoughts on the dependency API.
First off -- we need two separate options for parallel queries: one to schedule a query along with other queries, and another to actually run a single query in parallel with `IJobParallelFor`. Unity, for example, has `Run()` (for main-thread execution), `Schedule()` (for scheduled execution), and `ScheduleParallel()` (for scheduled execution + `IJobParallelFor`).
I'm not sure what that API would look like, exactly. Since we have 3 ways of creating queries, that would mean 9 different versions to implement, which isn't fun. It'd be nice to get something a bit more unifying, but I'm not sure how.
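One possible direction (purely a sketch -- nothing like this exists in Arch today, and the helper names `ScheduleJob`/`ScheduleParallelJobs` are invented): collapse the three execution modes into a mode parameter, so each of the three query-creation styles needs one method instead of three overloads:

```csharp
// Hypothetical unified entry point: 3 creation styles x 1 method,
// instead of 3 x 3 separate implementations.
public enum ExecutionMode { Run, Schedule, ScheduleParallel }

public JobHandle Execute<T>(in QueryDescription query, ref T forEach,
    ExecutionMode mode, JobHandle dependsOn = default)
    where T : struct, IForEach
{
    switch (mode)
    {
        case ExecutionMode.Run:
            InlineQuery(in query, ref forEach); // main thread, complete on return
            return default;
        case ExecutionMode.Schedule:
            return ScheduleJob(in query, ref forEach, dependsOn); // one job
        default:
            return ScheduleParallelJobs(in query, ref forEach, dependsOn); // chunk-parallel
    }
}
```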
But that aside: I think a good (eventual?) goal would be to implement automatic dependency tracking (I think this is similar to what @xentripetal is talking about, but I'm not sure since I've never used Bevy, so I don't really understand those pages). But Unity does have a fairly nice API for handling dependencies in a fully-automatic way.
The idea is to track components accessed, and whether they're accessed in read mode or write mode. If we track the dependency graph, we can then automatically parallelize any code.
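The conflict rule itself is small. A self-contained sketch of it (the types and names here are mine, not Arch's):

```csharp
using System;
using System.Collections.Generic;

// Two queries may run concurrently only if neither writes a component
// type that the other reads or writes; read/read overlap is fine.
sealed class ComponentAccess
{
    public readonly HashSet<Type> Reads  = new();
    public readonly HashSet<Type> Writes = new();

    public bool ConflictsWith(ComponentAccess other) =>
        Writes.Overlaps(other.Reads)  ||
        Writes.Overlaps(other.Writes) ||
        Reads.Overlaps(other.Writes);
}
```

A scheduler would then complete (or chain onto) every in-flight handle whose access set conflicts with the query about to run, while letting pure readers of the same components run side by side.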
A few caveats...

- Structural changes need a sync point; compare the `WithStructuralChanges()` call in Unity, which always provides a sync point.
- Before running, a query must `Complete()` any queries that write any of the components it reads, or read/write any of the components it writes.

This does bring up the question of our own API, though. I see a few options:

1. **Integrate automatic dependencies into Arch.Extended, but not Arch.**
   - Arch itself would expose `JobHandle` dependencies, but nothing else. We provide no guaranteed race-condition handling or anything like that.
   - The tracking would live in `BaseSystem`. Each `BaseSystem` would need to be aware of its preceding system(s?), and the order in which systems are run in general. This would probably manifest as some sort of `SystemSet` object that handles organization.
   - Read vs. write access could be inferred from parameters marked `in` or `ref`.
   - How would structural changes work? A manual `Complete()`? A `BeforeStructuralChange` event? Seems jank.
2. **Add automatic dependency tracking directly to Arch.**
   - Handles would be tracked in `World`. Structural changes would `Complete()` any handles before they can happen.
   - Main-thread queries would `Complete()` the conflicting handles before proceeding with the query.
   - Read/write access could be declared on `World.Query`. (Though this may have some allocation issues -- we'd need to workshop it. Another option is to wrap read-only properties in a custom `ReadOnly<T>` marking struct with a `ref readonly` getter. This would work to tag classes too, but it wouldn't enforce it for classes.)
   - This adds overhead to `World`: if the user uses scheduled queries, we'd have to check for dependencies before running each main-thread query. Whenever we do structural changes, we have to sync everything up. Etc. (It could still be optional, of course -- none of this matters if the user just never starts a scheduled or parallel query.)
3. **Don't do automatic scheduling at all; let the user worry about it!**
   - Each scheduled query would return a `JobHandle` that the user must organize on their own.
   - The user must `Complete()` any handles they scheduled before they make structural changes. This gets messy, fast, and can lead to some difficult-to-diagnose race conditions.

I definitely prefer 1 or 2, and am loosely in favor of 2. That said, I worry that it adds bloat to Arch and makes its central API less beautifully simple (unless we can get a beautifully simple solution up and running).
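For the `ReadOnly<T>` idea, a minimal marker-struct sketch might look like this (hypothetical, not part of Arch; note that C# forbids a struct member from ref-returning its own field, so this sketch routes the `ref readonly` access through a static helper taking an `in` parameter instead of a getter):

```csharp
// A query asking for ReadOnly<T> would be registered as a *reader* of T,
// letting it run in parallel with other readers.
public readonly struct ReadOnly<T>
{
    private readonly T _value;
    public ReadOnly(T value) => _value = value;

    // Copying access for small components.
    public T Value => _value;

    // Non-copying access; returns a reference the caller cannot write through.
    // For classes this only *tags* read-only intent -- the referenced object
    // itself could still be mutated.
    public static ref readonly T AsRef(in ReadOnly<T> wrapper) => ref wrapper._value;
}
```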
> But that aside: I think a good (eventual?) goal would be to implement automatic dependency tracking (I think this is similar to what @xentripetal is talking about, but I'm not sure since I've never used Bevy, so I don't really understand those pages). But Unity does have a fairly nice API for handling dependencies in a fully-automatic way.
Sorry, I explained myself poorly there. My input was on how explicit ordering could be defined from a user perspective. The docs I linked don't discuss how Bevy handles system dependencies so much as how it builds a user-defined schedule. It handles dependencies similarly to what you're discussing: it looks at reads/writes, determines which archetypes each system will hit, and prevents R/W and W/W conflicts. It also enforces that everything is scheduled and prevents any conflicts that lack an explicit user-defined order, so there are no random side effects. But that's likely outside the scope of this.
The key point is that I think there should be some way to refer to dependencies without directly referencing the exact implementation of another system. There should be some way to group multiple systems together and declare that another system depends on that group, without knowing all of the systems in it.
Additionally, I think it would be helpful to be able to define both after and before relationships, not just before; the Unity dependencies API only allows defining before-dependencies.
Though both of these would require building an intermediate graph and allowing configuration on top of it, so I understand if it seems out of scope. Someone could just build a third-party scheduler on top of the Arch solution that resolves all queries and their dependencies, then translates that into Arch's JobHandle model.
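To make the group idea concrete, here's a hedged sketch of such an intermediate graph layered on top of JobHandles (every name here is invented for illustration):

```csharp
// Hypothetical third-party scheduler sketch.
var graph = new ScheduleGraph();

// Group systems, then depend on the group without naming its members.
var logic  = graph.Group("Logic", moveSprites, applyPhysics);
var render = graph.Add(renderTextures).After(logic);

// `Before` is just the reverse edge of `After`, so both directions can be
// offered even if the underlying JobHandle model only expresses "after".
graph.Add(clearBuffers).Before(render);

// Resolve: expand group edges to member systems, topologically sort, and
// schedule each node against the combined handles of its dependencies.
foreach (var node in graph.TopologicalOrder())
    node.Handle = node.Schedule(dependsOn: node.CombinedIncomingHandles());
```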
> I definitely prefer 1 or 2, and am loosely in favor of 2. That said, I worry that it adds bloat to Arch and makes its central API less beautifully simple (unless we can get a beautifully simple solution up and running).
My vote would also be on 2. It would allow that complex scheduling graph described above to be built in Arch.Extended or some other third-party lib, while letting any standard Arch main-thread queries be safe from race conditions without having to wrap them in some external `Complete()` dependency manager.
> Basic premise is you have SystemSets which can be ordered and configured the same as Systems, then any System can be put in a SystemSet and it will inherit all the configuration of the SystemSets it belongs to.
That actually sounds pretty interesting! Arch.Extended/Arch.System features a group which could basically take care of this ^^
> I definitely prefer 1 or 2, and am loosely in favor of 2. That said, I worry that it adds bloat to Arch and makes its central API less beautifully simple (unless we can get a beautifully simple solution up and running).
Well, this is really a difficult topic. Arch's main selling point is simplicity and a bare-minimum design. Adding dependency parallelization to Arch itself would bloat the code, make the API more complex, and probably make it slower even if you never touch multithreading (extra checks, sync points, and so on). That would undermine Arch's "philosophy" and make it slower. Arch must not lose that; it must always be designed to be easily extended.
Instead, I think a combination of approaches 1 and 3 is best. We could simply rebuild or extend the `ParallelQuery` API to expose and return `JobHandle`s. This way the user always keeps full control and can theoretically build their own management system around it. Additionally, we could then simply lay out the tracking in Arch.Extended, either in System.SourceGenerator or a new extension.
So we could also easily generate "synchronization points" and at the same time add methods to enforce them or add independently created jobs to tracking. An example of this would be...
```csharp
[Parallel, Query]
[...]
public void CollisionChecking(...){
    ...
}

// From the base system, for manual control
public void Update(...){
    World.Synchronize(); // A method the user could use to force a sync point manually.
    World.Query(...);    // No problem here since we are synchronized.

    // Since this is a generated method, it gets tracked automatically; the
    // generated query returns a JobHandle, so we could also run or complete it manually.
    var generatedHandle = CollisionCheckingQuery();

    var manualHandle = World.ParallelQuery(...); // Tracking would be in there by default as well.
    World.Track(manualHandle, types?); // Make a manual parallel query track as well, since it's not auto-generated.
    ...
}
```
The optional override of Update already works in the normal source generator. If the user does not add a custom implementation, it is generated automatically, of course.
So with this I think the user has the best of both worlds and Arch itself would not be bloated.
Yeah, that makes sense! The API is a little more complex than I would like, but I get what you mean.
Just brainstorming... What if Arch.Extended extended `World` with a `ScheduledWorld` or something? Then we could run sync points before every structural change and track things just like solution 2. And for each regular Arch query we could make sure any dependencies are handled first.
It might be a pain to maintain... but it would give us a lot of power to make a very nice API. What do you think?
Edit: It wouldn't actually be a pain to maintain if we added the right hooks to Arch. For example, protected nullable delegates `BeforeStructuralChange`, `BeforeQuery`, and `ResolveQueryDependency` that just wouldn't do anything if an overriding World didn't set them.
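A sketch of those hooks (the three delegate names come from the comment above; everything else, including `World` being extendable like this, is assumed):

```csharp
public partial class World
{
    // Null by default, so plain Arch only pays a null check.
    protected Action? BeforeStructuralChange;
    protected Action? BeforeQuery;
    protected Func<QueryDescription, JobHandle>? ResolveQueryDependency;

    private void OnStructuralChange()
    {
        // A ScheduledWorld would set this hook to complete all in-flight
        // handles; a plain World skips straight to the change.
        BeforeStructuralChange?.Invoke();
        // ... perform the structural change ...
    }
}
```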
Edit edit: Kept thinking about this idea so I wrote out my thoughts and stuck them in a draft API document thing: https://gist.github.com/LilithSilver/fac8493d09ef7e0519ff3cae20b267d2 Hopefully that clarifies the advantages/disadvantages and pain points, at least.
Background
Sometimes, I want to run multiple queries or systems in parallel, not just a single query, and track dependencies between them.
A very mature version of this feature might look something like Unity's Job Dependencies. Arch.Extended could build on top of that with something like Unity's Dependency property which automatically tracks dependencies between parallel queries across systems and sets up the job dependencies accordingly.
But even without that magic dependency finding, any way of running multiple queries in parallel, even if handles would need to be tracked manually, would be appreciated! I.e., a simple example would be `var handle = world.ParallelQuery(...., DependencyJobHandle)` that returns a `JobHandle` itself for further scheduling. These chains could then be tracked across systems by the intrepid user (or by Arch.Extended, of course).

Is something along these lines planned? Am I missing something that's already possible? I know parallelization isn't supported for sourcegen yet, but anything for regular queries?
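In other words, something shaped roughly like this (the `dependsOn` parameter and job names are illustrative, not an existing API):

```csharp
// Each scheduled query hands back a JobHandle; later queries pass it in as
// an explicit dependency, forming a chain the user (or Arch.Extended) owns.
var moveHandle    = world.ParallelQuery(in moveQuery,    new MoveJob());
var collideHandle = world.ParallelQuery(in collideQuery, new CollideJob(),
                                        dependsOn: moveHandle);

collideHandle.Complete(); // sync before any structural changes
```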
Idea
The idea is quite simple and has already been described in the background: queries should also have a way to run in parallel and with dependencies. There could be different syntaxes for this.
Or for Arch.Extended
State
In the meantime, lilith has submitted some great PRs to bring the job scheduler up to speed -- among them, dependencies were added and several improvements made. https://github.com/genaray/ZeroAllocJobScheduler