Agents.jl isn't cheating; have a look at the declaration file: the only reason to include forest fire is exactly to test what optimizations frameworks can do in cellular automata. If you have a faster version for Mesa, go ahead and PR it; that would be totally fair.
Agents.jl has infrastructure that allows for such optimizations in cellular automata, such as the nearby_positions function, which loops over nearby positions in an optimized, radius-agnostic way.
Ah yes, sorry, I had missed the readme. Indeed, it explicitly states that agents are not needed. I still find it strange to compare an optimized model with a non-optimized one in the first place, but I am happy to provide an optimized version for Mesa via PR.
Sure, but we can't always know what an optimized Mesa model looks like, as we are Agents.jl users and not Mesa users, so contributions like yours are needed.
I think that it could indeed also be done in Mesa with a numpy array, but it is not clear to me how much faster it would be, given the overhead of the conversion. If you have a better implementation, please do a PR @Corvince!
Yes, with numpy one could build a more equivalent example, which might even be slightly faster, but I think a list of lists should also be fine. Submitted a PR.
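(For the record, a fully vectorized numpy version of the spread step could look like the sketch below. This is my own illustration of what a "more equivalent example" might mean, not the PR code; the state encoding, density, and von Neumann neighborhood are my assumptions.)

```python
import numpy as np

EMPTY, GREEN, BURNING = 0, 1, 2

rng = np.random.default_rng(0)
states = rng.choice([EMPTY, GREEN], size=(100, 100), p=[0.3, 0.7]).astype(np.int8)
states[0, states[0] == GREEN] = BURNING  # ignite the left edge

def step(states):
    """One fully vectorized spread step: green cells with a burning
    von Neumann neighbor catch fire; burning cells burn out."""
    burning = states == BURNING
    neigh = np.zeros_like(burning)
    neigh[1:, :] |= burning[:-1, :]   # fire spreading down the first axis
    neigh[:-1, :] |= burning[1:, :]   # and up
    neigh[:, 1:] |= burning[:, :-1]   # right along the second axis
    neigh[:, :-1] |= burning[:, 1:]   # and left
    states[(states == GREEN) & neigh] = BURNING
    states[burning] = EMPTY
```

Note that per-element access on a numpy array from a Python loop is actually slower than on a list of lists (each access boxes a numpy scalar), so numpy only pays off if the whole step is vectorized like this; that is why a list of lists is fine for the front-tracking approach.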
In any case, I think ForestFire should be reimplemented so that the fire starts at random positions on the grid. We already discussed that implementation, and an optimization similar to yours, in https://github.com/JuliaDynamics/ABM_Framework_Comparisons/pull/13 but decided not to implement it. Of course I will merge your new version, but I think the ForestFire model is a bit ill-suited right now, since its dynamics are a bit too simple.
But actually, rethinking it, the way it is now is probably good enough.
Personally, I am not particularly happy with PRs #65 and #64: I think we are losing track here, focusing too much on the tree and losing sight of the forest (pun intended). When we originally designed this repository, we had in mind the Julia benchmarks. They compare programming patterns when implemented in each language. They do not compare "what's the fastest way to solve problem X with a given language", and they make this clear in the benchmark declarations.
Yet, that's exactly what we have started doing here: "what's the fastest way to evolve an ABM with given rules in the used framework". We have shifted focus from comparing usage patterns in ABMs to micro-optimizations. I would argue that neither #64 nor #65 shows a "typical" programming pattern in ABMs, and in fact neither uses any public API besides the function that iterates over nearby positions. In PR #64 the only line that actually uses Mesa is line 45: for nx, ny in self.grid.get_neighborhood(...
and similarly for the Agents.jl implementation. We already compare this function in the Schelling model, so what comparison do we gain from the Forest Fire model anymore? The rest of the code measures how quickly Julia or Python iterates over a tracked list of 2-tuples.
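For reference, the pattern being benchmarked looks roughly like this (a paraphrase of the idea, not the exact PR #64 code; the names spread, front, and green_cells are mine, and Mesa 1.x's grid API is assumed):

```python
def spread(grid, front, green_cells):
    """One fire-spread step over a Mesa grid.

    grid.get_neighborhood is the only framework call in the hot loop;
    everything else is plain Python over a tracked list of 2-tuples.
    """
    new_front = []
    for pos in front:
        for neighbor in grid.get_neighborhood(pos, moore=True, include_center=False):
            if neighbor in green_cells:       # plain-Python bookkeeping, no framework
                green_cells.discard(neighbor)
                new_front.append(neighbor)
    return new_front
```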
@Tortar raised a good point: the forest fire model is too trivial to highlight any meaningful usage of an ABM framework. Perhaps we should consider discarding it. Or we should consider adding very strict technical implementation guidelines to the declaration, ensuring that more of each framework's API is used, so that we compare ABM frameworks and not the core programming languages.
@Corvince feel free to counter-argue or prove me wrong about the above points of course.
P.S. I am in favor of discarding it. We can add a graph-based model if we want to have four examples.
> @Corvince feel free to counter-argue or prove me wrong about the above points of course.
Quite the opposite! I am very delighted to hear your opinion. I think you are on point with everything you wrote. Before opening this issue I thought about how to make the case for abandoning the forest fire example. Once I realized that the Mesa example couldn't be sped up in a way that would reflect typical Mesa code, I was kind of annoyed to have this example here, because, as you said, it actually only benchmarks a single public API endpoint - the neighborhood 'search'. And yet it makes Mesa look extremely slow compared to Agents.jl - 500x slower - when it really only shows that neighborhood search is much faster in Agents.jl, while on more complex models like Wolf-Sheep the difference is an order of magnitude.
Only after that did I realize that Agents.jl wasn't actually using agents, didn't put any content on the actual grid, and wasn't using a scheduler. So I very much agree with your point that both models don't use regular patterns (after my PR) - I just think that was already true before #65. But now I think I should have made that point more explicit in the first place.
In any case, I support removing ForestFire as too simple. It shows that Julia is a fast programming language - but not much about framework code.
OK, I'll merge both #64 and #65 (mainly because we should keep @Corvince in the commit history), and then @Tortar you can make a PR that removes the forest fire model and adjusts the comparison table accordingly.
Then we all agree. I will remove the model soon, and I also agree on replacing it with a graph-based model. But which one? Do you have any proposals? It would surely be better if we could test a model that is already implemented in all frameworks.
Actually, I think most of the examples suffer from similar issues: the number of actual API calls is quite low in all of them.
In all honesty I don't see much benefit in these performance benchmarks as they are now. For small and simple models the difference is irrelevant - even the slowest implementation will be plenty fast. And for large and complex models most of their computational time will be spent outside of the framework code. I think it doesn't add much more than saying Julia is fast(er).
I think there are very few models where the performance difference between the frameworks tested here really matters. Again, for small models or classical ABMs it won't matter. And if someone is trying to simulate millions of agents for thousands of time steps, I am pretty sure that's above Agents.jl's limits with respect to the typical patterns used.
Just to be clear about my opinion of Agents.jl: if someone with no prior programming knowledge asked me which modeling framework I would recommend for ABMs, I would say Agents.jl - but not because of its performance advantage.
> Then we all agree. I will remove the model soon, and I also agree on replacing it with a graph-based model. But which one? Do you have any proposals? It would surely be better if we could test a model that is already implemented in all frameworks.
Mesa currently only has graph-based examples of a version of the Boltzmann wealth model and the virus-on-network model (link to NetLogo).
Adding a graph model should be done in a separate step from removing forest fire, because we need to think it through. I now realize that adding a graph model is not trivial.
When I say "graph model" I mean a model where the space is a graph. However, when most people say "graph model" they mean a model without any space, where instead the agents are vertices of a graph.
I do not believe there is reason to compare the latter kind of model, because it would just use the Graphs.jl package for all agent interactions, and Agents.jl would only offer basic looping over the agents. However, I also know that neither MASON nor NetLogo has a GraphSpace the way Agents.jl does, which would make the comparison difficult. I am not sure about Mesa.
What I would consider a graph model is this one, but with the addition of agents moving between cities: https://juliadynamics.github.io/Agents.jl/stable/examples/sir/
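To make the distinction concrete, here is a hypothetical sketch of a "space is a graph" model in plain Python with networkx (my own illustration, not tied to any existing framework example): cities are nodes, and agents occupy nodes and migrate along edges, as in the linked SIR example plus migration.

```python
import random
import networkx as nx

rng = random.Random(0)
cities = nx.erdos_renyi_graph(n=10, p=0.4, seed=0)      # the space is a graph
agents = {agent_id: rng.choice(list(cities.nodes))      # each agent sits on a node
          for agent_id in range(100)}

def step():
    """Each agent migrates to a random neighboring city with 10% probability."""
    for agent_id, city in agents.items():
        neighbors = list(cities.neighbors(city))
        if neighbors and rng.random() < 0.1:
            agents[agent_id] = rng.choice(neighbors)

for _ in range(20):
    step()
```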
I'm not sure it is true that the number of API calls is low in all examples: WolfSheep has several API calls in Agents.jl (and I'm implementing some more utilities to make this even more the case).
Even if it is true that these models are simple, that is what makes the comparison possible in the first place; with real-world models it would be impossible, since they would take too much time to implement. Also consider that in all the other elementary models tested here (I profiled them to verify this), almost all the time is spent inside the framework itself, not in any method outside the ABM framework's control (at least in Agents.jl). So the claim that they merely show that Julia is fast in general is not really right, for this reason.
And since Julia is fast, Agents.jl should be fast as well; otherwise using the framework would add too much overhead. So the benchmarks also show this: the framework is fast enough that the comparison is generally in line with the performance difference between the languages, but it wouldn't be if the library hadn't been written with care for performance (the first versions of the library weren't that fast; the first version was probably even slower than Mesa).
@Corvince
> For small and simple models the difference is irrelevant - even the slowest implementation will be plenty fast. And for large and complex models most of their computational time will be spent outside of the framework code. I think it doesn't add much more than saying Julia is fast(er).
Unfortunately, I couldn't possibly disagree more. There can be very big differences in performance between frameworks when it comes to agent-related operations. A notable one is finding nearby agents, or moving agents. I am personally especially proud of the nearby-agent search implementation in Agents.jl, which has been improved from v1 (where it was the same as Mesa's) all the way to v5. We have practically re-written this method from scratch three times over.
For many real-world ABMs, the majority of the time will be spent interacting with nearby agents, so this matters.
Besides,
> For small and simple models the difference is irrelevant - even the slowest implementation will be plenty fast.
This is missing the point of the comparison. The simpler the model, the larger the percentage of time spent on ABM library API calls, which is exactly what we want to compare. We want small, simple models. The larger / more complex the model, the less we are comparing ABM software and the more we are comparing the languages. Hence, while it's true that "small models are fast", absolute speed is simply not what this repository cares about.
> I think it doesn't add much more than saying Julia is fast(er).
This is getting a bit off-topic w.r.t. the original issue at hand, but I've now heard twice from Mesa developers that "Julia is faster than Python so this doesn't say much...", so I need to point out the flaw in this logic.
The Mesa team could have made Mesa.jl, or re-written Mesa with a Julia backend. The programming language used is just another tool. Just like using knowledge from predicting spatiotemporal chaos is a tool that accelerates neighbor searches, and just like using distributed computing to parallelize model runs when scanning over parameters. Everything is a tool, and we as developers choose the tools we use.
I find it odd that the fact that Agents.jl is written in Julia is dismissed, even though it was rather obviously a very central and very conscious decision to develop the framework in this language. So with this in mind, I just want to make my view on the matter clear: the reason I would prefer this comparison repo to have as many API calls as possible within the ABM framework is to highlight how simple the implementations are within each framework, without the user having to hand-code everything - not to avoid comparing the programming languages. The programming languages used should be compared, as they are the most critical components of the libraries.
To reiterate, the purpose of this repo is twofold: it does not only compare performance in terms of computation time and memory allocation; it also compares simplicity. Both of these purposes benefit in clarity from implementations that show typical ABM programming patterns and utilize as much of the public API as possible.
So I spent quite some time figuring out how to improve the performance of Mesa's ForestFire implementation. I couldn't believe such a large discrepancy for such a simple model. Most of the time is spent on attribute access and function calls. That is not a nice place to optimize.
Then I looked more closely at the Agents.jl code, and I think Agents.jl is almost cheating ;) It's a clever implementation, but it is very different from the other frameworks. Agents.jl uses dummy agents: that is, instead of placing actual Agents on an actual Grid, it uses a matrix that holds the tree states. So instead of looking up tree attributes and calling a step function for each tree, the logic is implemented directly on this matrix.
Replicating the same approach in Python, I see a ~10x performance increase for Mesa.
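Concretely, the matrix-plus-front approach might look like the following in plain Python (a minimal sketch of the idea, not the actual PR; all names, the state encoding, and the von Neumann neighborhood are my own illustrative choices):

```python
import random

EMPTY, GREEN, BURNING = 0, 1, 2

def forest_fire(width=100, height=100, density=0.7, steps=100, seed=0):
    """Matrix-based forest fire: no agent objects, no grid lookups, no scheduler.

    Tree states live in a plain list-of-lists matrix; only the burning
    front (a list of (x, y) 2-tuples) is iterated each step.
    """
    rng = random.Random(seed)
    trees = [[GREEN if rng.random() < density else EMPTY for _ in range(height)]
             for _ in range(width)]
    # Ignite the left edge, as in the classic model.
    front = []
    for y in range(height):
        if trees[0][y] == GREEN:
            trees[0][y] = BURNING
            front.append((0, y))

    for _ in range(steps):
        if not front:
            break
        new_front = []
        for x, y in front:
            # Spread to the four von Neumann neighbors holding a green tree.
            for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if 0 <= nx < width and 0 <= ny < height and trees[nx][ny] == GREEN:
                    trees[nx][ny] = BURNING
                    new_front.append((nx, ny))
            trees[x][y] = EMPTY  # burnt out
        front = new_front
    return trees

if __name__ == "__main__":
    forest_fire()
```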
Now I don't know which side should be changed: make the Agents.jl implementation more standard and explicit, but slower, or allow the other frameworks to use the same tricks to make them faster. For Mesa, I could send in a PR.