Open ptheywood opened 4 years ago
Some current thoughts on how dependency analysis might work, in the context of a model implementation.
This only covers an example model with one agent type in one state, so will need expanding to cover multiple agent types, and/or multiple states, but the general idea will likely remain the same.
The example is a modified version of the circles model, splitting the move function into 2 parts to make the dependencies a bit more interesting, and adding a host function.:
circle
location
circle_output_location
circle_input_location
circle_move
In practice, the execution layers should be:
circle_output_location
circle_input_location
circle_move
host_function
This assumes host_function could run in parallel with circle_move
. In practice this is likely to require it's own layer.
The DAG would be
circle_output_location
|
v
circle_input_location
| |
v v
circle_move host_function
The implementation would aproximately be:
// Agent functions
FLAMEGPU_AGENT_FUNCTION(circle_output_location, FLAMEGPU::message::None, FLAMEGPU::message::BruteForce) { /* ... */ }
FLAMEGPU_AGENT_FUNCTION(circle_input_location, FLAMEGPU::message::BruteForce, FLAMEGPU::message::None) { /* ... */ }
FLAMEGPU_AGENT_FUNCTION(circle_move, FLAMEGPU::message::None, FLAMEGPU::message::None) { /* ... */ }
// host functions.
FLAMEGPU_HOST_FUNCTION(host_function) { /* ... */ }
int main(int argc, char*argv[]){
// New model description
FLAMEGPU::model::ModelDescription model("circles");
// Define the environment which is common to the model, with the default values.
EnvironmentDescription &environment = model.Environment();
environment.newVariable<int>("seed", 0);
environment.newVariable<int>("width", 1024);
// Declare agent types
FLAMEGPU::model::AgentDescription &circleAgent = model.newAgent("circle");
// Id is automatic.
circleAgent.newVariable<float>("x");
// ...
circleAgent.newState("default");
// ...
// Declare messages
FLAMEGPU::model::MessageDescription &locationMessage = model.newMessage("location");
locationMessage.newVariable<float>("id"); // id is not automatic?
locationMessage.newVariable<float>("x");
// ...
// Add output_location function to the circle agent.
FLAMEGPU::model::AgentFunctionDescription fn_circle_output = circleAgent.newFunction("output_location", circle_output_location);
// State which message it produces as output - an implicit dependency
fn_circle_output.setMessageOutput(locationMessage);
// input_location
FLAMEGPU::model::AgentFunctionDescription fn_circle_input = circleAgent.newFunction("input_location", circle_input_location);
// State which message it takes as input - an implicit dependency
fn_circle_input.setMessageInput(locationMessage);
// Add the circle movement function
FLAMEGPU::model::AgentFunctionDescription fn_circle_move = circleAgent.newFunction("move", circle_move);
// Host functions, which exist within layers
FLAMEGPU::model::HostFunctionDescription fn_host_function = model.newHostFunction(host_function);
// Dependencies can then be specified
circle_output_location.dependsOn(NULL);
circle_input_location.dependsOn(circle_output_location);
circle_move.dependsOn(circle_input_location);
host_function.dependsOn(circle_input_location);
// ...
}
The DAG would be validated (checked for loops) and flattened into layers (include some form of topological sorting algorithm, but with branching. Coffman Graham?) when the model if made concrete.
It would likely be worth adding an interface which is implemented by the various Description
classes, to provide the dependency support.
Several dependsOn()
prototypes could exist:
dependsOn(NULL)
taking an enum/NULL
to indicate there is no dependency (i.e layer 0); dependsOn(agentFunction)
taking a something which implements dependable
such as an agent functiondependsOn(std::vector<...>)
A vector of dependables
Internally to the ModelDescription
, adding message inputs/outputs / would internally add to the draft dependency graph. This is not going to be enough information in all cases. I.e. where multiple agent functions output to the same message list - what order do they go in?
The same would apply for states etc.
Example of dependencies for the PedestrianNavigation example of FLAME GPU 1.
<layers>
<layer>
<gpu:layerFunction>
<name>generate_pedestrians</name>
</gpu:layerFunction>
</layer>
<layer>
<gpu:layerFunction>
<name>output_pedestrian_location</name>
</gpu:layerFunction>
<gpu:layerFunction>
<name>output_navmap_cells</name>
</gpu:layerFunction>
</layer>
<layer>
<gpu:layerFunction>
<name>avoid_pedestrians</name>
</gpu:layerFunction>
</layer>
<layer>
<gpu:layerFunction>
<name>force_flow</name>
</gpu:layerFunction>
</layer>
<layer>
<gpu:layerFunction>
<name>move</name>
</gpu:layerFunction>
</layer>
</layers>
ModelDescription model = // ...
// Agents
AgentDescription a_pedestrian = // ...
AgentDescription a_navmap = // ...
// message lists
MessageDescription m_pedestrian_location = // ...
MessageDescription m_navmap_cell = // ...
// Agent functions without any dependencies specified
AgentFunctionDescription f_generate_pedestrians = a_navmap.newFunction(/* ... */);
AgentFunctionDescription f_output_navmap_cells = a_navmap.newFunction(/* ... */);
AgentFunctionDescription f_output_pedestrian_location = a_pedestrian.newFunction(/* ... */);
AgentFunctionDescription f_avoid_pedestrians = a_pedestrian.newFunction(/* ... */);
AgentFunctionDescription f_force_flow = a_pedestrian.newFunction(/* ... */);
AgentFunctionDescription f_move = a_pedestrian.newFunction(/* ... */);
// Specify message output/input (implicit dependencies)
f_output_navmap_cells.setMessageOutput(m_navmap_cell);
f_output_pedestrian_location.setMessageOutput(m_pedestrian_location);
f_avoid_pedestrians.setMessageInput(m_pedestrian_location);
f_force_flow.setMessageInput(m_navmap_cell);
// Specify agent output (implicit dependency / concurrency blocker)
f_generate_pedestrians.setAgentOutput(a_pedestrian);
// Set agent death (blocks concurrency)
f_force_flow.setAllowAgentDeath(true);
Following on from the implicit dependencies which are already defined, explicit dependencies are required to instanciate the model, as it is unclear on the order of execution otherwise.
The simplest approach is to require atleast one explicit dependence for each function - making the DAG->layers algorithm simple:
f_generate_pedestrians.dependsOn(); // Identify this is L0
f_output_pedestrian_location.dependsOn(f_generate_pedestrians);
f_output_navmap_cells.dependsOn(f_generate_pedestrians);
f_avoid_pedestrians.dependsOn(f_output_pedestrian_location);
f_force_flow.dependsOn(f_avoid_pedestrians)
f_move.dependsOn(f_force_flow);
Alternatively, a more complex layering algorithm should be able to deal with collisions within a wider DAG, such that it will generate the fewest layers possible where collisions occur.
I.e. output_navmap_cells
does not actually depend on generate_pedestrians
, rather it could be concurrent if it were not for the concurrency blocker of same agent same state.
This may be very challenging in practice, as what if the user provided output_navmap_cells
function mutates an agent variable used within generate_pedestrians
. This would require either extra specification of what the function will modify, to allow the correct order to be calculated.
f_generate_pedestrians.dependsOn(); // Identify this is L0
f_output_pedestrian_location.dependsOn(f_generate_pedestrians);
f_output_navmap_cells.dependsOn();
f_avoid_pedestrians.dependsOn(f_output_pedestrian_location);
f_force_flow.dependsOn(f_avoid_pedestrians)
f_move.dependsOn(f_force_flow);
It may also be possible to infer the order of functions based on message dependencies.
I.e. f_avoid_pedestrians
depends on the m_pedestrian_location
, so it could be inferred that this must be executed after f_output_pedestrian_location
.
And f_force_flow
depends on m_navmap_cell
, so it must be called after f_output_navmap_cells
, however these are both from teh same agent in the same state so they cannot execute in the same layer, but which order is correct depends on their implementations, hence the explicit f_force_flow.dependsOn(f_avoid_pedestrians)
It should be possible to remove the explicit f_avoid_pedestrians.dependsOn(f_output_pedestrian_location);
dependence and still construct layers in this case, however if multiple agent functions output to a message list implicit ordering would be much more ambiguous.
For this model, it should be possible to reconstruct the layers used with just 3 explicit function dependencies, alongside the implicit layer dependencies, but more complex models would likely need many more dependencies explicitly stating.
If this is looked at from a modelling perspective, rather than the layer-based perspective it might be possible to make this specification simpler.
In this model, the move
method is the final step of the iteration.
It relies on the pedestrian
agent being aware of the forces from other pedestrians, and from the environment: f_avoid_pedestrians
and f_force_flow
(assuming these are order-independent).
f_avoid_pedestrians
and f_force_flow
have implicit message input dependencies on m_pedestrian_location
and m_navmap_cell
respectively.
m_pedestrian_location
and m_navmap_cell
have implicit message output dependencies on f_output_pedestrian_location
and f_output_navmap_cells
respectively.
Lastly output_pedestrian_location
must have an explicit dependence on
generate_pedestrians
(potentially implicit, but order is unclear).
This would produce the dependency graph (from the bottom up):
n.generate_pedestrians()
|
v
p.output_pedestrian_location() n.output_navmap_cells()
| |
| v
<m_pedestrian_location> <m_navmap_cell>
| |
| v
p.avoid_pedestrians() p.force_flow()
| |
v v
p.f_move()
and only require 2 (3?) explicit dependencies.
f_move.dependsOn({f_avoid_pedestrians, f_force_flow});
f_output_pedestrian_location.dependsOn(f_generate_pedestrians);
Flattening this to layers would only require deciding which method of f_avoid_pedestrians and force_flow is executed first. If they were known to be independent (i.e neither mutates an agent member variable known by the other) they could even be executed concurrently or fused into a single kernel launch.
The contents of a message lists persists from one iteration of the model, to the first agent function which outputs to that message list in the subsequent iteration.
This reduces the usefulness of message list dependencies to determine order of execution (other than no output and input to the same message list in a single layer), as even in a simple case with just an input
and a output
functions, both output -> input
and input -> output
would be valid, just with differnt behaviour.
I am still catching up on reading but started thinking about Domain Specific Language (DSL) based on Model Driven Architecture (MDA) that would provide dependency analysis at the higher level model rather than doing it in CUDA code.
So in case its relevant here I will just link to the stuff I am still studying, without notes explaining.
https://github.com/thecapitalistcycle/CSXMS/blob/master/FGPU_DSL_links.md
Rather than users having to describe models using explicit layers, using dependencies instead would allow the FLAME GPU library to determine the layers of execution, potentially improving performance through improved concurrency and reduced synchronisation points.
A directed acyclic graph (DAG) of the dependency relations should would need to be constructed from the user-provided dependencies between functions (via a new api).
An algorithm would then have to be applied to the DAG, to flatten the DAG into execution layers and synchronisation points. Alternatively this might map nicely to cuda graphs, although currently multiple graphs would be needed and this might not actually improve performance.
An initial implementation would include a dependency graph which is flattend to a set of layers for execution. Longer term a dependency graph could be concretised into an execution graph, which is used within
CUDAAgentModel
rather than layers, although this would likely still need to form layers for concurrency.