JuliaDynamics / Agents.jl

Agent-based modeling framework in Julia
https://juliadynamics.github.io/Agents.jl/stable/
MIT License

Multiple, successively executed `agent_step!` functions during each step #331

Closed fbanning closed 3 years ago

fbanning commented 3 years ago

How could one add multiple agent_step! functions for each step, so that the order of execution would be agent_step_1! -> agent_step_2! -> model_step!? The idea behind this is simply to have all scheduled agents act once, then have all scheduled agents act a second time, then let the model step happen once.

I can of course think of workarounds for this, e.g. wrapping agent_step_1! and agent_step_2! within the model_step! function and call them from there (providing some custom built list of agents). However, I'm wondering whether there's an option for this built into the API of Agents.jl?

fbanning commented 3 years ago

Quick addition: Of course this inquiry could also be extended to include other scenarios like model_step_1! -> agent_step! -> model_step_2! and so on.

Datseris commented 3 years ago

I can of course think of workarounds for this, e.g. wrapping agent_step_1! and agent_step_2! within the model_step! function and call them from there (providing some custom built list of agents). However, I'm wondering whether there's an option for this built into the API of Agents.jl?

Yes, that is the solution.

One of my main tenets for designing easy to use, yet featureful software: Never expand an existing API to do something that is already possible, and straightforward, within the existing API.

However, here we can indeed discuss whether the case you describe is a common case. It is simple to extend the existing step! low level implementation so that you can pass in a vector of functions as agent_step! and then what you say is what happens.

But here is the counter argument: shouldn't you have a vector of schedulers as well? And, is the complexity worth it?

Now the ball is in your court: you have to convince me :P A convincing enough argument would make it happen.

Datseris commented 3 years ago

Quick addition: Of course this inquiry could also be extended to include other scenarios like model_step_1! -> agent_step! -> model_step_2! and so on.

This is definitely not possible unless we come up with a clear and simple to understand way of how to do it.

I forgot to say: vectors of functions are type unstable and thus will lead to lower performance.
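
To illustrate the type-stability remark, here is a minimal sketch in plain Julia (no Agents.jl involved). The names first_step, second_step and apply_all are made up for this example. It only shows that a vector of different functions has the abstract element type Function, while a tuple keeps each function's concrete type; whether this matters in practice depends on how hot the loop is.

```julia
# Two toy stepping functions (hypothetical names, for illustration only).
first_step(x) = x + 1
second_step(x) = 2x

steps_vector = [first_step, second_step]   # element type is the abstract type `Function`
steps_tuple  = (first_step, second_step)   # each element keeps its own concrete type

# Apply every function in order, threading the value through.
apply_all(steps, x) = foldl((acc, f) -> f(acc), steps; init = x)

println(eltype(steps_vector))                  # Function
println(isconcretetype(eltype(steps_vector)))  # false
println(isconcretetype(typeof(steps_tuple)))   # true
println(apply_all(steps_tuple, 1))             # 4
```

Both containers give the same result; the difference is only in what the compiler can infer about the elements.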

Libbum commented 3 years ago

You could also do:

step!(model, agent_step!, dummystep, 1)
step!(model, agent_step!, model_step!, 1)

fbanning commented 3 years ago

vectors of functions are type unstable and thus will lead to lower performance

Boooo, nobody likes lower performance. ;-)

shouldn't you have a vector of schedulers as well?

I mean it would also work with just one scheduler for multiple agent_step functions but what you proposed would indeed make sense in terms of customisability.

And, is the complexity worth it?

In my opinion complexity under the hood is worth it if the user has some gains in terms of simplification from it, e.g. a unified approach to define and schedule agent stepping functions.

you have to convince me

Not sure I'm in a position to do that.

fbanning commented 3 years ago

You could also do:

step!(model, agent_step!, dummystep, 1)
step!(model, agent_step!, model_step!, 1)

How would that work with the run! function then?

Libbum commented 3 years ago

You wouldn't use run! at all. We have a high level API using run! that collects all data for the most straightforward applications, then the lower level API based on step! and things like collect_agent_data! for use cases like this.

Take a look at this testset for example. Basically, you can just write your own custom run!.

fbanning commented 3 years ago

As far as I understood it, if I can't make use of run!, then I also can't make use of paramscan which propagates its keywords to run!. This would take away a lot of the features built into Agents.jl.

Libbum commented 3 years ago

A fair point, but if that's the only blocker then the solution may be quite simple.

@Datseris: we already pass agent_step! and model_step! function overrides into paramscan, we could also add run! to that list such that user-defined run! functions are accepted (the only requirements are that they need to return two dataframes and accept the standard model, astep!, mstep!). Thoughts?

fbanning commented 3 years ago

The default values for each keyword are propagated into run! by paramscan. Custom run! functions would then not only need to have the standard form of the current run! function, but also take all the kwargs that the _run! function takes, otherwise errors would be thrown, I think.

Libbum commented 3 years ago

Yep, but splatting kwargs... solves those kinds of complications.
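
A rough sketch of the kwargs... idea in plain Julia. custom_run is a hypothetical user-defined function, not the Agents.jl run!, and the keywords replicates and verbose are placeholders standing in for whatever a caller such as paramscan might forward. The function names only the keywords it uses and lets kwargs... absorb the rest, so unknown keywords do not throw.

```julia
# Hypothetical user-defined run function. It names the keywords it cares about
# and absorbs everything else in `kwargs...`, so extra keywords do not error.
function custom_run(model, agent_step, model_step, n; when = 1:n, kwargs...)
    recorded = Int[]
    for s in 1:n
        agent_step(model)
        model_step(model)
        s in when && push!(recorded, s)   # "collect data" only at the requested steps
    end
    return recorded, collect(keys(kwargs))
end

noop(model) = nothing

# `replicates` and `verbose` are placeholder keywords that custom_run never reads.
steps, ignored = custom_run(nothing, noop, noop, 4; when = 2:3, replicates = 0, verbose = true)
println(steps)     # [2, 3]
println(ignored)   # [:replicates, :verbose]
```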

Datseris commented 3 years ago

And, is the complexity worth it?

In my opinion complexity under the hood is worth it if the user has some gains in terms of simplification from it, e.g. a unified approach to define and schedule agent stepping functions.

This is not entirely correct. Allowing a vector of functions as well as a single function for stepping increases the API. It makes exactly one more version of step! that takes the vector input. And you have to explain this extra version in the documentation string. All of this is very much on top of the hood.

you have to convince me

Not sure I'm in a position to do that.

But you're the one that wants this feature :P Who else should try to convince the developers that it's worth it :D

@Datseris: we already pass agent_step! and model_step! function overrides into paramscan, we could also add run! to that list such that user-defined run! functions are accepted (the only requirements are that they need to return two dataframes and accept the standard model, astep!, mstep!). Thoughts?

Hehe, you see now how such a tiny suggestion like "allow vector of agent steps" spiralled into increasing the complexity of several Agents.jl functions...? ;)


I just want to point out that not only is it simple to include a bunch of agent-stepping functions in your model_step!, but the entire API of Agents.jl also plays well with it. Why make things more complicated when it is already so easy to do what you want to do...? run! works with it, paramscan works with it, honestly, what's the problem...?

Perhaps we should approach this issue a different way. @fbanning would you mind telling us why the existing way to do it is bad? Perhaps first identifying the problem and then coming up with a solution will lead to better design. At the moment we have a discussion about possible implementations (solutions) but the problem is not yet clear to me.


(p.s.: I wanted to define the function schedule(abm::ABM) = abm.scheduler(abm) and make it part of the public API, so that users can feel safer writing custom agent-stepping loops. I simply forgot to define this function, but I'll do it now)

Libbum commented 3 years ago

So just to be explicit here, 43c5b4b8a added the schedule function. For the specific use case of astep -> astep -> mstep, you can write

function model_step!(model)
    for id in schedule(model)
        agent_step!(model[id], model)
    end
    ...
    # Model specific step properties
end

run!(model, agent_step!, model_step!, 10)

Edit: @fbanning perhaps we're not getting your full use case here. Can you tell me what you want to 'collect' out of this scenario? The above example using run! and paramscan would see this operation as one step. If you want to collect an agent property or aggregate property, then the result would be from the first instance and not the second. If the methods were extended to allow astep -> astep -> mstep for each 'step', how do you envision we would save step data here, for example? You would overwrite data in the agents collection unless you had some secondary property, or made step a Float64 (step 1.1, 1.2, 1.3 etc.). IMO, that would drastically complicate things when doing your analysis on the DataFrame.

fbanning commented 3 years ago

would you mind telling us why the existing way to do it is bad?

I never said that. It's simply unclear to me how these things would be done, as I've described in the very first post of this issue. I wanted to know whether there's a builtin way to do what I want to do or if I need to apply my workaround. Apparently my workaround is exactly what you had in mind when you wrote the framework, so that's totally fine for me.

In addition, the public schedule function that you added just yesterday makes it clearer for users how to approach building a custom order of agent_step! and model_step! functions depending on the preset scheduler of the model. I think this is a nice addition and will help users in the future.

Besides all that, I can maybe elaborate where I'm coming from as a user, namely the world of NetLogo, so that you can take a step back and try to understand why I want to approach things the way I do. In NetLogo there's normally a go function which allows you to put any arbitrary number of agent_step! and model_step! function equivalents in any arbitrary order. This is not how it is with Agents.jl where by default you have to separate agent and model steps into two blocks which get executed sequentially. While the out-of-the-box arrangement with separate agent and model steps is indeed easy to approach in the beginning, it doesn't seem to hold up very well from a user perspective once we want to introduce some kind of "sub-steps" as described above (e.g. sequentially execute two agent_step! functions per step before executing the model_step! function). For this to work (and this is the workaround I came up with), the user needs to wrap any additional agent step functions within the model step function.

Now I might be thinking about this wrong here, but to me this kind of logically defeats the purpose of separating agent_step! and model_step! in the first place. Once a model gets a bit more complicated and requires more incremental substeps in an arbitrary order, the user has to explicitly put this order of agent actions and model functionality into model_step! anyways and agent_step! can then only be used for the very first (or last) agent actions that should be executed each step. "Automatic" scheduling of agents (as defined during model creation) only works for the agent_step! function passed to run! (or step! or whichever) and all other agent stepping functions need to explicitly call their scheduler of choice to iterate over.

Again, please don't get my comments wrong because I seem to understand now how you intended it to be used. Functionally the current implementation is perfectly fine and does indeed work as intended, so there's not much to discuss here. However, from my user perspective this predefined order of agent and model step functions feels inconsistent once a more complicated order is needed.

So while this wasn't my initial intent for opening this issue, if you actually want to draw anything out of it, it might be the above described situation. Hopefully I could describe my use case a bit better now and you could better understand where I'm coming from. :)

Libbum commented 3 years ago

We've been having a lot of design discussions recently about other frameworks, and whether or not their implementations are well constructed, have better implementations than we do, etc. Most of the time we have ended up settling on a different approach to the alternate frameworks on nearly every aspect—including moving away from being mostly a mirror of Mesa.

In the end, the current API is a have-your-cake-and-eat-it-too scenario. You can do it the Agents.jl way, but in addition, if you come from MASON, you can build up a model the way you like it. Now that I know you're coming from Netlogo I can give you a bit more of a rundown.

Netlogo's go does not align with our approach since, as you say, it is the entire model run command. Consider the go function of Wolf-Sheep, where you iterate over all sheep, do the work, then all wolves, do the work, and finally update the model (grow some grass).

Our philosophy is to extract those sheep and wolf loops out into their own functions (agent_steps!). In Netlogo you would have to explicitly write your loops based on a scheduler inside your go function—Agents.jl takes care of that for you.

Now, there's no reason why you can't mimic a go function however. If that's what you'd like.

function model_step!(model)
    for id in schedule(model)
        agent_step!(model[id], model)
    end
    for id in schedule(model)
        agent_step_2!(model[id], model)
    end
    ...
    # Model specific step properties
end

run!(model, dummystep, model_step!, 10)

That way satisfies everything I can think of in terms of how you'd translate Netlogo models almost directly into Agents.jl code.

Datseris commented 3 years ago

I think we should make this discussion concise and put it in the tutorial page. Specifically the example Tim just wrote, but replacing the agent_step! with more specific things like wolf_step!, sheep_step!, etc. I would also go as far as collecting all ids of wolves, and all ids of sheep separately.
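
A possible shape for such an example, written as a self-contained toy in plain Julia rather than with the real Agents.jl types. Animal, PastureModel, ids_of and the energy rules are all invented for illustration; the point is only the structure of a model_step! that collects sheep ids and wolf ids separately and steps each group in turn.

```julia
# Toy stand-ins, not the Agents.jl types.
mutable struct Animal
    id::Int
    kind::Symbol      # :sheep or :wolf
    energy::Float64
end

mutable struct PastureModel
    agents::Dict{Int,Animal}
    grass::Int
end

# Collect the ids of one kind of animal, in ascending id order.
ids_of(model, kind) = sort!([a.id for a in values(model.agents) if a.kind == kind])

sheep_step!(sheep, model) = (sheep.energy += 1.0)   # made-up rule: sheep gain energy
wolf_step!(wolf, model) = (wolf.energy -= 0.5)      # made-up rule: wolves lose energy

function model_step!(model)
    for id in ids_of(model, :sheep)     # first all sheep act
        sheep_step!(model.agents[id], model)
    end
    for id in ids_of(model, :wolf)      # then all wolves act
        wolf_step!(model.agents[id], model)
    end
    model.grass += 1                    # finally the model-level update
    return model
end

pasture = PastureModel(Dict(1 => Animal(1, :sheep, 0.0),
                            2 => Animal(2, :wolf, 5.0),
                            3 => Animal(3, :sheep, 0.0)), 0)
for _ in 1:2
    model_step!(pasture)
end
println(pasture.agents[1].energy)   # 2.0
println(pasture.agents[2].energy)   # 4.0
println(pasture.grass)              # 2
```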

fbanning commented 3 years ago

Yes, I understood how to build a stepping function that mimics NetLogo's go function. Still, thanks for writing it out again as I think this might be helpful for others reading it at a later time.

Our philosophy is to extract those sheep and wolf loops out into their own functions (agent_steps!).

If it's actually so easy to just explicitly write one's own loops over agents depending on any scheduler (which is basically just collecting a list of ids which in turn is similar to building an agentset in NetLogo), then why introduce this split of agent_step! and model_step! functions in the first place? I currently don't see any gain in extracting agent stepping functions (except for saving the user two lines of code for the necessary for-loop if it would be in a model_step! function).

(On a sidenote the Wolf-Sheep model is actually an extension on what I was asking about in this thread because it also implements two different schedulers (schedule agents by property, namely their breed) in addition to implementing two different agent stepping functions.)

Datseris commented 3 years ago

Thank you very much @fbanning for this helpful comment. I will take some time and read in detail and think about it a lot before answering. Since the original Agents.jl v1.0 release, most design decisions have actually been re-written, some of them even completely from scratch. It is becoming more and more clear that having a clone of Mesa in Julia was not necessarily a great choice. One of the things that we never even considered changing was the stepping API.

So I'd like to reflect on this a bit more. Without having to think about it, however, I can say for sure that we have to add a section about this discussion in the tutorial, to tell users that "yes, you can do whatever the hell you want inside model_step!, even performing successive nested steppings!".

fbanning commented 3 years ago

So I'd like to reflect on this a bit more.

It's not as if we're in a hurry here. It's all working perfectly fine and this discussion turned out to be a bit more fundamental than I first anticipated it to be (which I'm certainly not mad about because I think it's a thing worth discussing).

"yes, you can do whatever the hell you want inside model_step!, even performing successive nested steppings!"

😄

Libbum commented 3 years ago

How complex is your working model at the moment @fbanning? We were looking to release 4.0 in a day or so, but the discussion here may make us hold off on that, until we sanity check this aspect of the framework and possibly change it. Without getting into the technicalities, perhaps you could share with us an outline of the steps of your algorithm, how you would do it in Netlogo vs Agents.jl and your thoughts on the 'best' way to implement your problem in a perfect world.

fbanning commented 3 years ago

Our working model is actually not very complex. We need three agent stepping blocks/functions to be executed sequentially with the same scheduler (random in our case). They need to be executed sequentially because subsequent agent stepping blocks refer to agent variables of a subset of other agents that have been referred to in previous agent stepping blocks. If these agent_step!s would be implemented as usual, we would face an asymmetric propagation of certain agent variables throughout the model.

Take the following very abstract and simplified example:

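using Agents
using Statistics: mean

mutable struct MyAgent <: AbstractAgent
    id::Int64
    a::Float64
    b::Float64
end

# define model with some agents and scheduler = fastest

function agent_step!(agent, model)
    # `a` is meant to be the mean of every agent's past `b`
    agent.a = mean(other.b for other in allagents(model))
    # updating `b` right away changes the mean seen by agents activated later
    agent.b += agent.a
end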

# and so on

Running it like this, the last activated agent will calculate a significantly different a than the first activated agent. However, a should be the same for all agents as it is intended to depict the mean past value of the variable b of all other agents. This can easily be fixed by calling all agents to first calculate a, then call all agents again to calculate their b.

mutable struct MyAgent <: AbstractAgent
    id::Int64
    a::Float64
    b::Float64
end

# define model with some agents and scheduler = fastest

function model_step!(model)
    # first run this block for all agents
    for id in model.scheduler(model)
        agent = model[id]
        agent.a = mean(other.b for other in allagents(model))
    end

    # then run this block for all agents
    for id in model.scheduler(model)
        agent = model[id]
        agent.b += agent.a
    end
end

# and so on

(I'm very well aware that this very simple example could be easily written otherwise with the help of a model_step! function but that is certainly not the point I'm trying to make here.)
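
To make the asymmetry concrete, here is a small stand-alone sketch in plain Julia that mirrors the two variants above without using Agents.jl. ToyAgent, single_pass!, two_pass! and the starting values are invented for the illustration. In the single-pass version each agent sees the already updated b of the agents activated before it, so the computed a drifts from agent to agent; in the two-pass version every agent gets the same a.

```julia
using Statistics: mean

mutable struct ToyAgent
    a::Float64
    b::Float64
end

# Single pass: each agent updates `a` and `b` before the next agent is activated.
function single_pass!(agents)
    for agent in agents
        agent.a = mean(other.b for other in agents)
        agent.b += agent.a
    end
    return agents
end

# Two passes: every agent computes `a` first, then every agent updates `b`.
function two_pass!(agents)
    for agent in agents
        agent.a = mean(other.b for other in agents)
    end
    for agent in agents
        agent.b += agent.a
    end
    return agents
end

make_agents() = [ToyAgent(0.0, b) for b in (1.0, 2.0, 3.0)]

a_single = [agent.a for agent in single_pass!(make_agents())]
a_two    = [agent.a for agent in two_pass!(make_agents())]

println(a_single)   # values grow from one agent to the next
println(a_two)      # [2.0, 2.0, 2.0]
```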

Does this description and example help at all?

your thoughts on the 'best' way to implement your problem in a perfect world

That's a tough one. I think I personally would drop the separation of agent_step! and model_step! as well as the definition of a standard scheduler in the model struct completely. Instead I would create something similar to the go function of NetLogo where users can explicitly define which agent or model functionality to call and also which scheduler should be used in the case of agent actions (defining order and subset of activated agents).

It could look something like this:

function schedule(model) # name is not important, could also be called "procedure", "go", "stepping" or whatever
    for agent in [model[id] for id in fastest(model)]
        # do agent stepping stuff, possibly just call another function which could be called agent_step_1!
    end

    for agent in [model[id] for id in random_activation(model)]
        # do agent stepping stuff, possibly just call another function which could be called agent_step_2!
    end

    # maybe also do some model stepping stuff here
end

run!(model, schedule, 10) # only need to provide schedule instead of agent_step! and model_step! explicitly

(Sidenote: Those for-loops within the schedule function are a bit messy/unwieldy, maybe the scheduler functions could already return a list of agents instead of ids or one could do it as I did in the examples above and define agent = model[id] within the loop. Both work but no idea if one is preferable over the other.)

Libbum commented 3 years ago

This is helpful, thanks. Have already discussed this with George offline a bit. Essentially what you've laid out here.

Let us play around with some options and see what is the best course of action. I think in principle we agree that this gives us more flexibility and power, but we also want to limit such explicitness for new users.

The first-steps tutorial should still be able to define some quick agent behaviour and have it just work, without the need of understanding the scheduling system.

fbanning commented 3 years ago

Nice to hear that.

without the need of understanding the scheduling system.

Given that on the one hand schedulers are basically just creating lists of agent ids and on the other hand there are some of the most usual scheduler functions readily available to the user (keep in mind they already need to define scheduler = fastest in the model definition), I don't think it adds any complexity. In fact I think it could make users understand the scheduling system more easily if they explicitly see that schedulers are fundamentally just creating lists of agents to be activated.
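
The "schedulers are just lists of ids" point can be sketched in a few lines of plain Julia. The functions in_id_order, in_random_order and only_odd_ids are toy stand-ins written for this example, not the schedulers shipped with Agents.jl, and the "model" is just a dictionary from id to some state.

```julia
using Random

# Toy model: a dictionary from id to some agent state (not the Agents.jl type).
toy_agents = Dict(1 => "a", 2 => "b", 3 => "c", 4 => "d")

# Each toy scheduler takes the agents and returns ids in activation order.
in_id_order(agents) = sort!(collect(keys(agents)))
in_random_order(agents; rng = Random.default_rng()) = shuffle!(rng, collect(keys(agents)))
only_odd_ids(agents) = filter(isodd, in_id_order(agents))   # activates a subset only

println(in_id_order(toy_agents))    # [1, 2, 3, 4]
println(only_odd_ids(toy_agents))   # [1, 3]

# A random scheduler returns the same ids, just in a different order.
shuffled = in_random_order(toy_agents; rng = MersenneTwister(42))
println(sort(shuffled) == in_id_order(toy_agents))   # true
```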

Libbum commented 3 years ago

Indeed. I don't disagree with any of that statement, but it'd be nice to not be absolutely required for the first model a user sees. Second one: sure.

fbanning commented 3 years ago

Out of curiosity, is there any news on this topic?

Datseris commented 3 years ago

Not from me. I'll need at least a couple more weeks before saying something here.

Libbum commented 3 years ago

October has become complicated for all of us individually with other projects and commitments. I think it'll make sense to make a change, but exactly how is something we'll need to discuss.

fbanning commented 3 years ago

Thank you very much to both of you for the quick answers. I didn't want to rush you but just wanted to carefully ask to stay up to date on this topic. If there's anything I can help with, please feel free to get in touch. Otherwise I'll just wait and see how this topic evolves. :)

Datseris commented 3 years ago

If there's anything I can help with, please feel free to get in touch.

Yeap, review either #333 with respect to #332 , or discuss a solution for https://github.com/JuliaDynamics/Agents.jl/issues/320 . I have to do both of these before I can do this issue, so removing one from the list will make it faster.

Libbum commented 3 years ago

To start up discussion again here: I can see that we should write some documentation about how to transition from other ABM packages. Most of what would be needed for a Netlogo transition would be to rely heavily on model_step! and selective schedulers. Apart from a good documentation recipe, we already have that at present, with one caveat:

Since step! and run! use agent_step! -> model_step!, we can dispatch the function and drop model_step! when it's not needed. The other way is not possible though due to the n=1 parameter complication we've spoken about before.

I'd suggest the change we make to accommodate things would be to keep everything as-is, other than the agent/model step function order, so that step!(model, model_step!, 5) is possible. In this way, we can keep the Mesa-like under the hood agent_step! scheduling going on, but also benefit users who need the fine grained control of a model_step! and multiple schedulers solution.

Datseris commented 3 years ago

I agree, it makes more sense to demand model_step! always and optionally allow agent_step!. However the run! option with n::Function will always explicitly require both agent_ and model_step!.
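
One way the "model_step! required, agent_step! optional" idea could look, sketched with toy code in plain Julia. This is not the Agents.jl implementation: toy_step!, TallyModel and noop_agent_step! are made up, and the argument order (model step first, agent step second) is only an assumption about what the proposal would mean.

```julia
mutable struct TallyModel
    ids::Vector{Int}
    agent_calls::Int
    model_calls::Int
end

noop_agent_step!(id, model) = nothing

# Full form: run the agent loop, then the model step, `n` times.
function toy_step!(model, model_step!, agent_step!, n::Int = 1)
    for _ in 1:n
        for id in model.ids
            agent_step!(id, model)
        end
        model_step!(model)
    end
    return model
end

# Short form: only a model step is supplied, the agent step falls back to a no-op.
toy_step!(model, model_step!, n::Int = 1) = toy_step!(model, model_step!, noop_agent_step!, n)

count_agent!(id, model) = (model.agent_calls += 1)
count_model!(model) = (model.model_calls += 1)

tally = TallyModel([1, 2, 3], 0, 0)
toy_step!(tally, count_model!, 5)                 # model step only
println((tally.agent_calls, tally.model_calls))   # (0, 5)
toy_step!(tally, count_model!, count_agent!, 2)   # agent loop plus model step
println((tally.agent_calls, tally.model_calls))   # (6, 7)
```

In this sketch the short form is told apart from the full form only because its third argument is an Int.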

Libbum commented 3 years ago

Yes, that's exactly what I meant if it wasn't clear enough.

If we've got a consensus here I can work on those changes.

fbanning commented 3 years ago

I agree, this will make the process a bit more straightforward as the currently necessary dummystep for unneeded agent_step! can be dropped.

In this way, we can keep the Mesa-like under the hood agent_step! scheduling going on, but also benefit users who need the fine grained control of a model_step! and multiple schedulers solution.

So to reiterate on this statement: agent_step! will not be kept for the functionality it provides because the same can easily be recreated with two LOC within the more generally usable model_step! function. Instead it's kept for ease-of-use (i.e. approachability of the framework) and to continually provide some similarity to the Mesa framework. Did I understand this correctly or did I misinterpret something?

Datseris commented 3 years ago

Yes, I really think we should highlight in the docs that the agent_step! is provided exclusively for convenience. In fact, in the Tutorial we should explicitly write out the "core loop"

for i in 1:steps
    for id in schedule(model)
        agent_step!(model[id], model)
    end
    model_step!(model)
end

fbanning commented 3 years ago

I think this would already be an improvement over the current design but it seems fair to ask the question then: why keep agent_step! at all?

I'm wondering whether this won't be more confusing to new users. They read about the core loop and how to iterate over scheduled agents (explaining the core construct of how to build a custom order of stepping functions and implicitly also how schedulers work) but then we go on to point out the agent_step! function as an optional parameter for the step! function which is however always needed in the run! function. As a new user, this would seem a bit convoluted to me (and even for users coming from other frameworks like Mesa, this might not make sense).

Instead, couldn't it be more helpful to thoroughly explain how schedulers (basically iterations over a list) and custom orders of stepping subfunctions (basically a sequence of actions happening in the model, with or without the involvement of agents) work? Like that, users could quickly and more flexibly build more complex models once they need it as the model creation process will always be the same. This basically boils down to the little example I've written down a bit earlier on this thread.

Furthermore, no extra written lines would be necessary to completely avoid having agent_step! anymore. Instead of defining an agent_step! function

function agent_step!(a, model)
    # do things
end

we would have

for a in schedule(model)
    # do things
end

inside the model_step! function (or however we might call it) then.

Of course, integration of the agent_step! loop over the model's schedule in the "core loop" would not be needed anymore. The "core loop" would then fundamentally be reduced to:

for i in 1:steps
    model_step!(model)
end

If this seems nitpicky, then I'm sorry. I'm trying to find the easiest and most modular solution to building ABMs with Agents.jl and I think this approach would fit the bill.

Datseris commented 3 years ago

I think for situations where it is indeed useful, having such an agent_step! function would lead to much clearer code. You could say however, that this is a responsibility for the user, to write clear code and the software doesn't have to point that out.

To be perfectly frank however, you're right. It is not clear to me what's the purpose of agent_step!, exactly. Perhaps @kavir1698 who initially wrote the first version of step! and established the separation into model_step! and agent_step! can chime in and tell us what was the original idea.

kavir1698 commented 3 years ago

I originally saw the agent_step! as an easy way to write simple simulations. You define what an agent does and a scheduler, and the package takes care of the rest. model_step! was then added to be able to do the more complicated scenarios. Without the agent_step!, the user has to write an extra for loop. The disadvantage of having an agent_step! is some extra lines in the documentation.

Datseris commented 3 years ago

I agree, so let's just finish the documentation around this and wrap it up.

fbanning commented 3 years ago

Without the agent_step!, the user has to write an extra for loop.

Right now, the user has to define agent_step! instead of writing a for-loop inside the model_step! function. As I've pointed out above, the number of LOC stays the same.

The disadvantage of having an agent_step! is some extra lines in the documentation.

From my perspective, the actual disadvantage is that this leaves a problem in terms of user experience. In a simple example model users are told to use agent_step! but once they want to do anything even a little more complicated, they need to resort to using model_step! anyways. Dropping agent_step! completely, and adapting the examples/tutorials accordingly, would make the most sense to me.

However, as I said above, I'm totally fine with the proposed solution of making model_step! required and agent_step! optional.

Libbum commented 3 years ago

In a simple example model users are told to use agent_step! but once they want to do anything even a little more complicated, they need to resort to using model_step! anyways.

I'm not sure this is entirely the case. We've managed to write all the current examples & integrations so far with the current methodology, as well as many more complicated models that users have produced for their own work.

The docs would need to reflect that agent_step! is there as a helper, not a rule—this would abate your problem with it I think. For sure, we don't want users to learn something they grow out of, but keeping it in helps us in two cases: first we help onboard users familiar with the separation (coming from Mesa). Second, we don't alienate our own users coming from 3.x: dropping it entirely means rewriting ALL models when upgrading, keeping it means swapping the order of a function call on one line. If we find it to be mostly unused in 6 months or so, then we can start deprecating it.

fbanning commented 3 years ago

Fair enough

fbanning commented 3 years ago

as well as many more complicated models that users have produced for their own work.

Somewhat off-topic: Do any of you keep a collection of more sophisticated ABMs written with Agents.jl, preferably published in scientific articles of various disciplines? I've never done an in-depth search on this but maybe you have? The citations for the current Agents.jl paper seem minimal. I've also been searching on CoMSES and could only find one Julia ABM but it doesn't use Agents.jl. Would be very interesting to me to see how many people use Agents.jl in "production" and how complex the models written in it are.

Datseris commented 3 years ago

I've come to the realization that if model_step! is made the mandatory function (which I agree with by the way) then it is anyways the case that most examples have to be re-written, since you can't have agent_step! without model_step! anymore, which was the default behavior previously.

fbanning commented 3 years ago

most examples have to be re-written

Doesn't seem like a big task at all? Just need to introduce a dummystep to the call to model initialisation and that should be solved?

Libbum commented 3 years ago

Just need to introduce a dummystep to the call to model initialisation and that should be solved?

Yeah, I'm pretty sure that's it. Are we missing something here George?

Datseris commented 3 years ago

Yeap, this is easy to update, but it is nevertheless very breaking: for a user who only used the single-function form, that function will now be interpreted as the model step, not the agent step.

Libbum commented 3 years ago

We can easily add in @deprecate values for this - exactly how #317 does it.

fbanning commented 3 years ago

I think instead of (or maybe rather in addition to) using deprecate warnings, it might be good to have a direct message pop up in the REPL once people update their Agents.jl version to 4.0. I know that Pluto.jl has a custom REPL message after install/update, so maybe we could use something similar?

Datseris commented 3 years ago

I think instead of (or maybe rather in addition to) using deprecate warnings, it might be good to have a direct message pop up in the REPL once people update their Agents.jl version to 4.0. I know that Pluto.jl has a custom REPL message after install/update, so maybe we could use something similar?

Yes I have been doing this for years in the rest of JuliaDynamics and I'll do this here too. However:

We can easily add in @deprecate values for this - exactly how #317 does it.

That is not possible, because the call signature remains identical. How do you deprecate step!(::ABM, ::Function, ::Int) in favor of step!(::ABM, ::Function, ::Int) ?

Libbum commented 3 years ago

Ah, that's a problem, yes. I thought we could do ::Function{A, B} where {A, B}, but that's not possible.
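
The dispatch problem can be seen in a tiny plain-Julia sketch. describe_call is a made-up stand-in for the method being discussed, not the real step!. Both stepping functions are subtypes of Function, so a method annotated with ::Function matches either of them and the two calls cannot be told apart by signature alone.

```julia
# Two stepping functions with different arities.
agent_step!(agent, model) = nothing
model_step!(model) = nothing

# Toy stand-in for a method with the signature under discussion.
describe_call(model, f::Function, n::Int) = "step!(::ABM, ::Function, ::Int)"

println(typeof(agent_step!) <: Function)   # true
println(typeof(model_step!) <: Function)   # true

# Both calls land on the very same method.
same = describe_call(nothing, agent_step!, 1) == describe_call(nothing, model_step!, 1)
println(same)                              # true
println(length(methods(describe_call)))    # 1
```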