JuliaDynamics / Agents.jl

Agent-based modeling framework in Julia
https://juliadynamics.github.io/Agents.jl/stable/
MIT License
717 stars 115 forks source link

Design discussion for an Event-Based (Gillespie-like) agent based model in Agents.jl #902

Closed Datseris closed 7 months ago

Datseris commented 10 months ago

The new AgentBasedModel API of the upcoming v6 allows us to make new model types where the "evolution rule" can be fundamentally different. This is one of them. This issue discusses a possible solution to #759 .

Event-Queue Agent Based Model design

Introduction

Goal of this is to provide an alternative type of ABM simulation. It operates in continuous time, versus the currently available StandardABM which operates in discrete time. This is an "event queue" simulation, also known as "Gillespie" simulation or "discrete event simulation". There are Julia packages to perform such simulations (that are actually actively maintained), for example:

However, for these two the spatial aspect is either weak or non-existent, while for ABMs spatial dynamics are crucial.

I've spent considerable amount thinking whether it would possible to interface JumpProcesses.jl to make the current proposal work with it as a backend. But it didn't seem straight forward, or perhaps it would require writing a lot of code. But maybe I am wrong, see my internals section at the bottom and let me know...

How things will work in the new model type

Events are scheduled at some particular time, and once the model time reaches the event time, the event is executed. Events are arbitrary agent-stepping functions. They are very similar to the agent_step! function used currently in the standard ABM implementation. Hence, events have access to the full API of Agents.jl, including functions like nearby_agents, move_agent, add_agent!, replicate!, .... Whatever! This is incredibly powerful when it comes to performing spatial simulations with event based models.

Events that are occurring are chosen randomly during scheduling, based on the principle of "propensities". Each event has a possible propensity value and the probability for choosing a particular event is proportional to the propensity. This probability is calculated automatically by an internal scheduler.

User input

Besides a space instance, for each agent type a user provides:

Note that unlike StandardABM, we can't have here the user providing the same input irrespectively of agent type and using multiple dispatch. That is because different agent types may have different number of possible actions.

Scheduling

We first have the Event type, which is

struct Event
    id::Int # id of the agent to be activated
    event_index::Int # index of the tuple of possible actions to act on the agent
end

(note: this has the same type instability problems that StandardABM has)

this event is stored in a PriorityQueue{Event, Float64}().

During the model time evolution we go through the events in the Queue, pop! them from the queue and "activate" them. When an event is activated, two things occur:

First, the relevant action is operated in the agent. This is done by extracting the agent corresponding to the event ID, and the action corresponding to the combination of agent type and event index.

Then, the scheduler is called upon the agent to generate a new event. The scheduler utilizes the rates corresponding to the agent type, generates the event, and adds it to the queue.

Time evolution

time evolution is identical to StandardABM. step!(model, time) is used, with the only difference here that time is Float64 instead of integer and steps the model until the current model time is equal or greater to the initial time plus time.

Time evolution skips events mapping to agents that have been removed from the model as is done generically already.

All other functions like run!, ensemblerun!, etc. work out of the box.

Initial scheduling?

Okay we know how to add events in the que, but how does the que start in the first place? An empty que cannot generate events. So there needs to be some kind of way to initialize the queue?

Minimal Working Example

Here is a full minimal working example of a spatial rock-paper-scissors example mentioned in https://github.com/JuliaDynamics/Agents.jl/issues/760 and in https://github.com/JuliaDynamics/AgentsExampleZoo.jl/pull/17 . On purpose I am making rock, paper, and scissor, three different types, to show how things would work for different type. (in this MWE the actual possible events are the same, but they don't have to be)

Naturally, the example isn't actually "working", as I haven't written the source code proposed here. But this is how it would look like!

using Agents

# define the three agent types
@agent struct Rock(GridAgent{2})
end
@agent struct Paper(GridAgent{2})
end
@agent struct Scissors(GridAgent{2})
end

# Define the possible events
function selection!(agent, model)
    contender = random_nearby_agent(agent, model)
    if !isnothing(contender)
        pos_contender = contender.pos
        if agent isa Rock && contended isa Scissors
            kill_agent!(contender, model)
        elseif agent isa Scissors && contender isa Paper
            kill_agent!(contender, model)
        elseif agent isa Paper && contender isa Rock
            kill_agent!(contender, model)
        end
    end
    return
end

function reproduce!(agent, model)
    pos = random_nearby_position(agent, model, 1, pos -> isempty(pos, model))
    isnothing(pos) && return
    add_agent!(pos, typeof(agent), model)
    return
end

function swap!(agent, model)
    rand_pos = random_nearby_position(agent.pos, model)
    if isempty(rand_pos, model)
        move_agent!(agent, rand_pos, model)
    else
        near = model[id_in_position[rand_pos]]
        swap_agents!(agent, near, model)
    end
    return
end

# Put agents and events in the format required
agent_types = (Rock, Paper, Scissors) # internally this will be made `Union`
# events
rock_events = (selection!, reproduce!, swap!)
scissor_events = (selection!, reproduce!, swap!)
# paper dynamics have no movement
paper_events = (selection!, reproduce!)
# same layout as agent types
allevents = (rock_events, scissor_events, paper_events)

# rates:
# the length of each of these needs to be the same as the length of event tuples
rock_rates = (0.5, 0.5, 0.2) # relative proportionality matters only
paper_rates = (0.5, 0.2, 0.1)
scissor_rates = (0.5, model -> count(a -> isa Rock, allagents(model)))
# same layout as agent types
allrates = (rock_rates, paper_rates, scissor_rates)

# Create model
space = GridSpaceSingle((100, 100))

abm = EventQueueABM(space, agent_types, allevents, allrates)

step!(abm, 1.32)

Internals

The step! implementation is straight-forward:

function step!(model::EventQueueABM, t::Real)
    t0 = zero(t)
    while t0 < t
        event, dt = dequeue_pair!(model.event_queue)
        process_event!(event, model)
        t0 += dt
    end
end

function process_event!(event, model)
    (; id, event_idex) = event
    agent = model[id]
    agent_type = findfirst(isequal(typeof(agent)), model.alltypes)
    event_function! = model.allevents[agent_type][event_index]
    event_function!(event, model)
    # evaluate propensities:
    propensities = model.allrates[agent_type] .* nagents(model)
    total_propensity = sum(propensities)
    τ = rand(abmrng(model), Exp(total_propensity))
    new_event = select_event_based_on_propensities(propensities, model)
    enqueue!(model.event_queue, new_event => τ)
    return
end
Datseris commented 10 months ago

cc @Tortar @MA-Ramirez @jacobusmmsmit as well.

Datseris commented 10 months ago

I forgot to say that wit this implementation the entirety of Agents.jl API works out of the box with the model. This includes ensemblerun!, sample!, the ABM plotting stuff, etc.

Datseris commented 10 months ago

The only thing that does not work out of the box is the AgentsIO module, because it has been written in a rather "hardcoded"-y way instead of using API functions.

ChrisRackauckas commented 10 months ago

The integrator interface with JumpProcesses.jl I think handles most of this. Could it be reused here? I think you'll want to reuse it since a lot of things go beyond Gillespie.

A tuple of rates. If the rate is a real number, the propensity is the rate times nagents(model). If the rate is a function, the propensity is f(model) times nagents(model).

What about mass action kinetics where the reaction is a basic interaction between two agents? You'll want to have that case specialized for performance.

And note that when you have interactions, being able to get that interaction graph of what reactions involve what agents can be essential to scaling. So when you go to the function case, the ability for the user to specify that underlying information, be it manually or via a macro, is required in order to use the O(1) scaling methods.

Datseris commented 10 months ago

I am also hoping to use JumpProcesses.jl, but it appears that it needs to place additional restrictions on the simulation (which allows it to gain performance and to perform many different types of sumulation with the same integrator). Let's see...

basic interaction between two agents?

What is a basic interaction between agents? What defines "basic" in this context? And when you say agents here, do you mean agent species (ie., types), or really individual agents?

being able to get that interaction graph of what reactions involve what agents can be essential to scaling

EDIT: (originally I misunderstood the point). Yes, this sounds good and it shouldn't be to hard to provide a user with a way of providing this information.

There could be cases that the interaction graph is fully connexted though. E.g., some events could just look at nearby agents and do something irrespectively of the type of the neighboring agent. In these cases I would guess one needs the interaction graph to be fully connected.

Datseris commented 10 months ago

I've thought of a way to generalize scheduling. The Gillespie scheduler type schedules agents with this "propensitites" approach. The other approach would be to have another scheduler, that just takes in a function that given agent, model it returns the new event and the time it should occur at. It allows more flexibility in the loss of structure. Some users like the structure.

jacobusmmsmit commented 10 months ago

I like this concept, I'm sorry if this isn't super polished but I wanted to add something now that I have some time. If I have understood correctly, however, I believe that the performance of many simulations could be improved by scheduling many events far into the future.

The best[1] way to model an M/M/1 queue, which is an example of a discrete event + continuous time model, is to predetermine as much randomness as possible at once. In this case, that covers the whole model. This is possible to do in the MM1 case, and I believe in many cases such as the one you described here, because the rate/propensity model means the probability distribution of time between events is just $Exp(\lambda)$ where $\lambda$ is the rate.

Currently, your design says that agents add new events to the queue whenever they are "popped" themselves, this is fine but seems not so performant, why not just predefine all of their time-deltas between acting at the beginning of the simulation up to the time specified by the user in run! or step!?

If all agents, and possibly the environment/model, adds their events to the queue, then the simulation is pretty much predetermined at specification time. This might not be possible when rates change, or if actions are no longer valid due to movements in a spatial dimension, but it seems like for many cases this might be a better way to use the prioQ data structure.

Either way, good stuff. I can dig into this more on Tuesday. I'll look at the internals of other packages mentioned to see how they work. Please do ping me if you think I could help with anything.

[1]: my source is my experience, I've not done a deep dive.

Datseris commented 10 months ago

Thanks a lot for the feedback. It's important to collect views on this. At the moment, where I need the most help is understanding whether JumpProcesses.jl can be used under the hood.

Currently, your design says that agents add new events to the queue whenever they are "popped" themselves, this is fine but seems not so performant, why not just predefine all of their time-deltas between acting at the beginning of the simulation up to the time specified by the user in run! or step!?

The new events depend on the state of the model. And events change this state. Think of the agent_step! functions that are currently in the examples of Agents.jl. Most of them depend on model state: how many agents are nearby our agent, what is the accummulated model property, how many agents to remove, etc. That is why it is simply not possible to "schedule into the future" without performing the prior actions. New events can only be scheduled after the previous events are executed, for the simple reason that their nature depends fundamentally on what the previous events have done. Look at the simple example of the rock paper scissors above. The rates depend on the relative populations of rocks and scissors. How can I foresee the rates of the future, when the events themselves change the rates?

Maybe I do not understand your proposal, or I am missing the necessary background in these kind of simulations. Perhaps you can elaborate using e.g., the concrete example of rock paper scissors.

But most importantly: how does it matter at all for performance if you schedule the queue at the start and go through it, rather than scheduling it step by step? I don't see this. Adding events to the queue, and popping them out, takes a miniscule amount of time when compared to actually performing the action of the agent stepping function (i.e., the action of the event). just searching for nearby agents, a small component of an event, will take much more time than adding or removing to a queue. So what's the pre-scheduling going to bring to the table in terms of performance?

If all agents, and possibly the environment/model, adds their events to the queue, then the simulation is pretty much predetermined at specification time.

It is "predetermined" only in the sense that the rules yield the actions. But you can't foresee the actions. You are defining the rules of the simulation, but in complex systems, you almost never know what the simulation will actually do once it runs. That is the fundamental underlying principle of emergent dynamics: you can only see what they do once you simulate them. Think about the Lorenz butterfly. You can't deduce it from the Lorenz equations. But you can only specify the Lorenz equations at the start.

jacobusmmsmit commented 10 months ago

Sorry for making you write out such a long reply to my fundamental misunderstanding. Design seems good. Will look into JumpProcesses.jl.

Krastanov commented 9 months ago

I will attempt to provide a bit more context on ConcurrentSim.jl (formerly SimJulia.jl). It indeed takes care of continuous-time discrete event simulations and is quite performant. It fundamentally is based on a priority queue of events, where events themselves can trigger callbacks. I find it to be much more natural as a tool to write discrete event simulations compared to JumProcesses, but that might be due to my previous exposure to Netsquid and SimPy.

One technical question that would be important to consider early on: the entirety of the library builds upon ResumableFunctions.jl, a "co-routines in julia" library that is small, simple, well tested, and has very minimal dependencies (great!), but is also very opinionated and contains some unholy macros and dependencies on julia internals (cool, but scary). Is this dependency on co-routines capabilities something you would be happy with?

On the logistics side: The Belgium Military Academy seems to have a lot of projects that depend on ConcurrentSim so the project will probably remain supported by a couple of volunteers in the future. The NSF ERC Center for Quantum Networks also relies on that library (which is where my involvement comes in).

Happy to discuss the more technical details if that would be of use, just let me know where to start.

Datseris commented 9 months ago

... and contains some unholy macros and dependencies on julia internals (cool, but scary). Is this dependency on co-routines capabilities something you would be happy with?

Great, thanks for pointing this out. I would say that is a blocker for Agents.jl, which favours simplicity above everything else. Given that the typical Agents.jl user is not expected to be expert in programming concepts, I would like to avoid non-basic Julia syntax as much as possible in Agents.jl. But thanks a lot for this great summary, and good to know of upcoming development investment!

jacobusmmsmit commented 9 months ago

Perhaps we could use ConcurrentSim without exposing anything scary to the end users of Agents?

Krastanov commented 9 months ago

Perhaps we could use ConcurrentSim without exposing anything scary to the end users of Agents?

I think that is pretty reasonable, doable, and potentially even easy. I will post a few example snippets of code here shortly.

Datseris commented 9 months ago

It will be easier if you can tell us where there could be usage in the example code I've posted above.

00krishna commented 9 months ago

I think this would be a really great inclusion for Agents.jl. Sounds like the development team has been doing some good planning. I think that the starting point is pretty simple, like allowing users to use continuous or discrete time, to allow random and biased-random walks, to allow different noise types. So starting with something like that sounds good.

Would users be able to use something like Catalyst.jl through this interface? I have thought about that in the past. Like technically I could already incorporate Catalyst.jl into a simulation to model chemical reactions, but it might be pretty slow because the Catalyst simulation code is not integrated/optimized for Agents.jl. But the use of Catalyst.jl might be too far ahead of the development team's plans.

Datseris commented 9 months ago

Catalyst.jl is specific in scope to be used as a backend here. In fact, Catalyst.jl is not a backend in general. It is a front end for chemical simulations. JumpProcesses.jl would be the backend.

Given that the majority of simulation time taken will clearly be in executing the events, I am not sure what a backend will bring to the table, since the event scheduling is not a performance bottleneck. The internals of the step! function that I've outlined in my first post is 10 lines of code but I am not sure how I would modify this function to use either ConcurrentSim.jl or JumpProcesses.jl as a backend.

Soon I will have a draft implementation for this as a PR with the Rock Paper Scissors as a documentation example (without any backend; hopefully in the PR we can test different backends).

Tortar commented 9 months ago

I looked a bit into JumpProcesses.jl, it seems to be that we should pass to the integratorall the model and so I tried a little experiment modifying the first example given in the docs:

using JumpProcesses

struct model
    u::Base.RefValue{Int}
end

rate(u, p, t) = p.λ
affect!(integrator) = (integrator.u[1][] += 1)
crj = ConstantRateJump(rate, affect!)

u₀ = [model(Ref(0)),]
p = (λ = 2.0,)
tspan = (0.0, 10.0)

dprob = DiscreteProblem(u₀, tspan, p)
jprob = JumpProblem(dprob, Direct(), crj)

sol = solve(jprob, SSAStepper())

but in the last line it throws a long stacktrace which culminates in ERROR: MethodError: no method matching getindex(::model).

edit: stupid me, I should have used integrator.u[1].u[]. So first try went well :D

Datseris commented 9 months ago

There need to be many different affect! functions for the many different "events" that may occur at agents. Additionally, there could be different affect functions for different agent types. How does one handle this ?

Can you simply do the Rock Paper Scissors example above? I consider this example the minimal example one would use this functionality for. That is, it is the least possible agent-based gillespie m,odel that one would actually use in practice.

Tortar commented 9 months ago

There need to be many different affect! functions for the many different "events" that may occur at agents. Additionally, there could be different affect functions for different agent types. How does one handle this ?

Indeed this seems difficult to achieve but even more fundamentally I find it impossible to replicate what you did with the priority queue in JumpProcesses since what you schedule in the queue is a pair of an agent + an event. I don't find a way to tie the two together with the interface in JumpProcesses, even if the affect! function can update the state of the model it doesn't have any information to apply an event to any specific agent.

Datseris commented 9 months ago

I have thought of a design of how to tackle this.

This new Event-based ABM can also have "schedulers". These are the types that indicate how the events are scheduled in the que and then trigerred. The "Gillespie" scheduler I describe above seems to be the more flexible one and should be the go-to. Having a type for this allows us to in the future add more types, e.g., on based on Jump Processes, that may yield more performance in exchange for more restricted design (ie only one type of agent). This way we don't need to block this feature, which is the main holder of v6. We can just implement the code I outlined above in my MWE and "call it a day" so to speak. As in the future, one could add more specialized, and hence more performant, schedulers.

Tortar commented 9 months ago

I was about to propose this too, I think this is the correct way to proceed 👍

gregid commented 9 months ago

Is there any branch/fork that this is actively worked on?

Datseris commented 9 months ago

Not yet, @Tortar wanted to start a branch but hasn't happened yet. Do you want to do it instead?

gregid commented 9 months ago

Thanks @Datseris but (just yet) I know too little of Agents.jl architecture to not mess it up - in fact I've just learned there is an ongoing v6 development. But I will gladly join the effort once set up.

Krastanov commented 8 months ago

I have some time today to implement an example using ConcurrentSim. From my reading of the mock code above, it should be trival (maybe 10 lines of code), but there are Agents-dependent pieces missing from the example code. There are a bunch of methods that are supposed to work on the mock EventQueueABM that are missing, e.g. nagents, getindex. Where can I find these?

Tortar commented 8 months ago

@Krastanov you can probably look at my draft PR https://github.com/JuliaDynamics/Agents.jl/pull/917 for the missing methods. There, the rock paper scissors example already works. Also, some methods like nagents are given for free if you conform to the ABM interface (for this reason, they already work on that PR)