JuliaDynamics / Agents.jl

Agent-based modeling framework in Julia
https://juliadynamics.github.io/Agents.jl/stable/
MIT License
733 stars 119 forks source link

Potentially more efficient `ContinuousSpace`/`GridSpace`? #597

Open AayushSabharwal opened 2 years ago

AayushSabharwal commented 2 years ago

I came across this recently: https://stackoverflow.com/a/48384354/6744592

It seems like a really interesting implementation of a grid-based data structure for storing moving agents and performing circle/bounding box checks, which is of course exactly what we need. Could it potentially be applied to GridSpace? Some experimentation would be necessary for scaling this to arbitrary dimensions.

We likely lose the neighbourhood caching that is done in the underlying GridSpace, but only implementation and benchmarking will tell if it's worth.

This is all of course up for discussion. I just shared it because it looks really cool.

Datseris commented 2 years ago

It looks promising, but unfortunately without benchmarking comparing it to current implementation, we can say pretty much nothing. A question I would try and ask is "why would I expect this to be faster" (or slower), and based on that start seeing reasons why to implement a rough draft for benchmark comparisons.

E.g.: What computations done with current setup would be completely skipped with the new proposal? Are there some new computations that would be added? What about memory requirements? (our current approach has practically 0 memory footprint, only storing this trivial Hood struct)

AayushSabharwal commented 2 years ago

but unfortunately without benchmarking comparing it to current implementation, we can say pretty much nothing

Yeah

What computations done with current setup would be completely skipped with the new proposal?

I read the entire thing once. They mentioned it performed better than their fixed-grid approach. However, their application seems to be handling efficient collision and rendering of multiple moving bodies of various sizes. We of course don't have the concept of sizes (yet?). Considering that searching in a radius twice is practically free with our approach, I'm actually not so sure now that it will be beneficial.

What does seem interesting now that I've read a lot more, is making the data layout more cache friendly. Our current implementation has nested arrays, which of course involves looking up a pointer to get another pointer which gives us our data. Additionally, all the arrays are likely fairly small.

I wonder if we can take their approach, where there is one array in a row, and each array element is basically a linked list node containing the index of the next element in the list, along with the data value (agent ID). Relocating elements in a row thus requires simply changing the next index of an element. Deleting elements from an array adds that node to a list of free elements, and any insertion into the array just replaces a free element. O(1) deletion and insertion, contiguous and fairly static memory allocation once the number of agents starts levelling out

Maybe we could have just one such array for the entire grid. The downside here is to access all agents in a grid cell we'd probably end up hopping across larger chunks of the array, affecting cache coherency.

Datseris commented 2 years ago

Especially after #643 I find it hard to believe that this exceptionally complicated data structure would be faster. But I guess we leave this open in case someone wants to have some summer-time-programming-practice and implements it.