Open Datseris opened 3 years ago
Sounds like a decent plan. If it works we can make a macro to do the step expansion, so worth it: probably. Depending on how much speedup we see. But it shouldn't take too much effort at least.
I benchmarked this, using the following setup: https://gist.github.com/AayushSabharwal/f5870d837635a4510c41e57087b31a64
julia> include("src\\predator-prey-check.jl")
julia> m = initialize_model()
AgentBasedModel with 550 agents of type Union{Grass, Sheep, Wolf}
space: GridSpace with size (20, 20), metric=chebyshev and periodic=false
scheduler: by_union
julia> sheep = [a for a in allagents(m) if typeof(a) == Sheep];
julia> wolves = [a for a in allagents(m) if typeof(a) == Wolf];
julia> grass = [a for a in allagents(m) if typeof(a) == Grass];
julia> function customstep(s, w, g, model)
for x in s; agent_step!(x, model); end
for x in w; agent_step!(x, model); end
for x in g; agent_step!(x, model); end
end
customstep (generic function with 1 method)
julia> @btime customstep($sheep, $wolves, $grass, $m)
87.799 μs (1351 allocations: 120.19 KiB)
julia> function ourstep(model)
for a in Agents.schedule(model)
agent_step!(model[a], model)
end
end
ourstep (generic function with 1 method)
julia> @btime ourstep($m)
195.000 μs (2553 allocations: 189.20 KiB)
ourstep
takes a little over double the time
Good to know how this scales with time as well, but need to be a bit careful with wolf-sheep, since it has the possibility to collapse a population a certain percentage of the time and therefore run much quicker—skewing the results.
Here's the same setup as above, but using this benchmark call.
using Test
m = initialize_model()
sheep = [a for a in allagents(m) if typeof(a) == Sheep];
wolves = [a for a in allagents(m) if typeof(a) == Wolf];
grass = [a for a in allagents(m) if typeof(a) == Grass];
function customstep(s, w, g, model)
for t in 1:50
for x in s; agent_step!(x, model); end
for x in w; agent_step!(x, model); end
for x in g; agent_step!(x, model); end
end
end
function ourstep(model)
for t in 1:50
for a in Agents.schedule(model)
agent_step!(model[a], model)
end
end
end
n = deepcopy(m)
@btime customstep($sheep, $wolves, $grass, $m) teardown = (@test count(i -> i isa Sheep, allagents(m)) > 0 &&
count(i -> i isa Wolf, allagents(m)) > 0)
# 4.812 ms (60435 allocations: 5.34 MiB)
@btime ourstep($n) teardown = (@test count(i -> i isa Sheep, allagents(n)) > 0 &&
count(i -> i isa Wolf, allagents(n)) > 0)
# 11.928 ms (120595 allocations: 8.71 MiB)
So yes, looks like it'll be good to write a macro for this.
Why a macro? We can do this via standard dispatch. The default scheduler changes if the type AgentType
is a Union, and step!
can also change accordingly to do what customstep
does.
Importantly, we should do this for the paper, as this model is one we are not much more performant than the competitors
Ah, so you suggest making by_type
the default if we're a mixed model, then use that order. I guess what I'm not following there is for the moment we do the ordering and vcat
the result. Calling the step function then just gives is this list of agents via the scheduler. Even if this list is ordered, using the standard step doesn't work. We need to use the customstep layout.
Is the suggestion here to not do the vcat
, and instead return a list of lists over each agent type, then splat that into our step function? My suggestion of the macro was just since we don't know for sure how many separate agent types we will have.
Paper: yeah, that already crossed my mind.
We can't be using by_type
as a vector of mixed agents is still type unstable. We must separate the list to ids per type. However, we can still use by_type
.
My suggestion is to use multiple dispatch to make step!
behave like customstep
. This is possible. Then, by_type
is not called explicitly, but instead agent ids are grouped to their type and used in customstep
. You can do multiple dispatch based on function types.
E.g. f(x, y) = y(x)
and f(x, ::typeof(rand)) = rand()
.
Last year I did some benchmarking to investigate the performance cost of mixed types. The benchmark varied the number of types up to 15 while holding constant the number of interactions. Something along those lines might be useful for understanding how performance scales with the number of agent types. Another factor that could be important is the time required for each interaction. My best guess is that the performance hit becomes less noticeable as the time required to compute an interaction increases.
Also, I came across TypeSortedCollections some time ago, which seems to automate the process of grouping mixed collections by type.
Looks interesting! Will certainly take a deeper look.
Seems like this new Catwalk.jl is relevant for us here: https://discourse.julialang.org/t/ann-catwalk-jl-with-dynamic-dispatch-to-the-moon-an-adaptive-optimizer-aka-jit-compiler/57917/1
That looks pretty awesome.
@Libbum would you please be so kind and tell me in your comment https://github.com/JuliaDynamics/Agents.jl/issues/445#issuecomment-796578504 what is initialize_model()
?
I don't know how to dispatch on union types: https://discourse.julialang.org/t/dispatch-on-precice-union-instance-in-place-of-subtyping-supertype-of-union/58117
what is initialize_model()?
It's the one from Aayush's gist: https://gist.github.com/AayushSabharwal/f5870d837635a4510c41e57087b31a64
don't know how to dispatch on union types
This is why I was thinking a macro may be a solution here, we can use our union_types
method to assist.
Our union_types
is not type stable:
Even though I've implemented my suggestion here, it doesn't improve performance because I can't get the individual agent types in a type-stable manner.
I am thinking how deep we should go here. I even though that we should have a tuple of dictionaries where each dictionary has only agents of one type... This goes too far though, as it would take tremendous amount of extra source code, simply because of the scheduling... Even the most basic schedulers like random activation would have to be separated into generating vectors of vectors, etc. The scheduler API would become much more complex for multi-agent modes, etc...
If you see #472 I simply suggest to not use multi-agent versions for performance critical code. I stand that it is probably to our benefit to keep the current status quo, if my branch multiagent
does not lead anywhere. We're already faster than anyone else, and also simplest than anyonelese. Sacrificing simplicity to make us even faster might not be the best idea.
I'm confused about this point. There's a working solution above, are you saying it's impossible to generalise?
I don't mind adding a discussion to the docs either way, but struggle to think why we can't get the custom_step
working.
There's a working solution above, are you saying it's impossible to generalise?
Yeah it's working because we you can do:
sheep = [a for a in allagents(m) if typeof(a) == Sheep];
wolves = [a for a in allagents(m) if typeof(a) == Wolf];
grass = [a for a in allagents(m) if typeof(a) == Grass];
and you write explicitly the types. Getting these types form the Union in a type-stable manner is something I have NOT been able to do yet.
What we can do is alter the code we have now and add a new entry to the model struct that is a tuple of the agent types. Then this type-unstable operation is done once at model creation and does not propagate further down the line.
My two discourse posts based on my work so far on this:
https://discourse.julialang.org/t/iterating-through-types-of-a-union-in-a-type-stable-manner/58285/2
Seems like our current approach of using Unions is not good in general. However it plays well with the schedulers and the fact that we map IDs to Agents.
Geez. Rough!
What we can do is alter the code we have now and add a new entry to the model struct that is a tuple of the agent types. Then this type-unstable operation is done once at model creation and does not propagate further down the line.
That sounds like a good way forward. I'm really not sure how we'd get around the Union approach right now.
I'm really not sure how we'd get around the Union approach right now.
I'm genuinely not sure how much effort we should be spending into it... In the sense that for really complex models, the user might go the only having model_step!
route. There, if they themselves do not make proper scheduling, then all of our effort will be wasted.
Perhaps the best thing here is to document how to do it properly without internal magic, and showcase this with the wolf-sheep model.
This is an extremely difficult issue. We have to spend a lot of effort and come up with a fundamentally different internal design to really allow performant multi-agent types the way we do it now.
I'll wrap up the documentation indications for this and Pull Request them. Then we should open a different issue discussing design. Quite large project, I don't feel I can do it at the moment.
BIG NEWS HERE: https://discourse.julialang.org/t/ann-virtual-jl-optimizing-dynamic-calls-away-through-virtual-multiple-dispatch/84413
I think this package is probably the way to go to solve our problem!!! We need to test it!
there is a another problem which is much more impacting in terms of performance related to all of this: with a union of types all operations which require to look at an agent inside the model will be type-unstable: consider for example that you do a nearby search for an agent and then you iterate over those agents: the loop will be type unstable since you do model[id]
which is type unstable. As another example also collect(nearby_agents(...))
will be problematic. This is why both https://github.com/JuliaDynamics/Agents.jl/blob/main/test/performance/branching_faster_than_dispatch.jl and https://github.com/JuliaDynamics/Agents.jl/blob/main/test/performance/variable_agent_types_simple_dynamics.jl are actually slow in the multitype version. The dynamic dispatch on the step! function has a negligible effect in comparison.
This is why we should be really try to implement #841. This won't solve all problems but with some attention from the user side it will be possible to have (almost) type stable operations.
EDIT: See https://github.com/JuliaDynamics/Agents.jl/issues/445#issuecomment-1187730099 for a potential solution!!!
In Julia when iterating over a non-concretely typed container (e.g. Dict{Int, Union}) the iterables are not type stable. If they are given to a function (e.g.
agent_step!
) Julia does dynamic dispatch instead of static, which is a big hit for performance.We should see if our simple Wolf/Sheep/Grass model does dynamic or static dispatch. Set up the model in a way that no agents will be created or killed, but only moved and change values. Be sure that no structs/arrays are created during model evolution.
Then, benchmark the following:
and compare it with:
and check the allocation and time differences between the two. Do some schedulers (e.g. the scheduling by Type) improve this?
Should we re-work the internals of
step!
for multi-agent models so that it explicitly separates agents into types first? Is it worth it? Otherwise, we can simply put a note on the "Performance tips" section of the docs of users writing their own custom model stepping function that collects each agent type first and then steps each one individually.(this Performance section is still to be written and will also have stuff like immutable types, and using a single agent type for maximal performance)