m3g / CellListMap.jl

Flexible implementation of cell lists to map the calculations of particle-pair dependent functions, such as forces, energies, neighbor lists, etc.
https://m3g.github.io/CellListMap.jl/
MIT License
87 stars 4 forks source link

Update previous cells instead of re-initializing all cells #89

Open efaulhaber opened 1 year ago

efaulhaber commented 1 year ago

Currently, UpdateCellList! clears all cells and adds particles again. For use cases like SPH simulations where only small changes in position happen between updates, it will be much faster to only update cells that changed, using the existing cell lists from the previous step.

lmiq commented 1 year ago

That is not trivial, because the construction of the lists must run over particles. One has to check the cell for each particle, and then adding it to a list is not more expensive than doing any kind of cell update. (I do not clear the particle lists memory - wise).

In MD what we do is to not update the lists at every simulation step. We compute the lists up to a cutoff greater than the actual simulation cutoff and update the lists only at every few steps. Possibly by following the displacement of the particles in an auxiliary array.

(Most times packages store the explicit neighbor lists between steps, but this may be prohibited for very large systems).

Maybe something or the sort can be done in SPH simulations as well.

lmiq commented 1 year ago

By the way, from what I remember of the presentation of Trixi.jl, the scalability you are getting with it is of another order of what CellListMap can deliver. So I'm unsure if you will be able to use it for the kind of simulations your group is used. Nevertheless, I would be very grateful if you provide any kind of feedback on what could be missing, or what could be improved about the usability here, which may guide me towards future improvements. Also, if there is any kind of system for which CellListMap is useful to you, I would love to know. Thanks for the feedback and contributions.

efaulhaber commented 1 year ago

Thanks! I am currently working on an SPH multiphysics code, which is separate from Trixi.jl, but we're trying to get a similar multithreading scalability as with Trixi.jl. So far, I've been using my own cell list implementation for the neighborhood search, but now I implemented one based on CellListMap.jl, and I am looking to compare the two.

lmiq commented 1 year ago

Do you need periodic boundaries? Most of the complications here are to support triclinic PBCs.

efaulhaber commented 1 year ago

We want to have periodic boundaries soon, that's also a reason why I implemented the NHS based on CellListMap.jl. Our implementation doesn't support that.

lmiq commented 1 year ago

Maybe helpful:

Screenshot_20230607_073153_Chrome.jpg

The issue with the scaling here is on the construction of the cell lists. The mapping scales well for systems large enough.

However, with many cores, the cell list construction end up being the limiting step sooner or later.

In my tests, simply reducing lists that are computed in parallel is enough to break the scaling. Probably to improve on that we need to compute redundant lists that are not reduced.

efaulhaber commented 1 year ago

I found similar results in my implementation. Reducing was actually worse than the serial implementation for me. I tried one list per thread, which made the update scale perfectly, but ended up ruining the map speed because I had to iterate over all lists.

I ended up looping (multithreaded) over all cells, marking all cells containing particles that are supposed to be in another cell. Then, in a serial loop, I update the marked cells. This doesn't scale for full updates (all particles changed cells), but it scales well when only a few particles changed cells.

lmiq commented 1 year ago

Ah, I see, running over all cells to check which cells have to be updated can be multi-threaded, and then that can be useful. I think something like that can be implemented here as well. I'll take a shot at it when I have some time.

efaulhaber commented 2 months ago

I recently implemented a cell list neighborhood search that works with GPUs. For that, I had to write a fully parallel update with atomic operations. This might be interesting to you.

https://github.com/trixi-framework/PointNeighbors.jl/pull/42

I now get much better scaling of the incremental update step. I haven't implemented the initialization yet, but I'm sure it will scale similarly. grafik "semi parallel" refers to my previous strategy that I explained earlier.

I ended up looping (multithreaded) over all cells, marking all cells containing particles that are supposed to be in another cell. Then, in a serial loop, I update the marked cells.

lmiq commented 2 months ago

Do have already some instructions on how to use it? I'm certainly interested.

(does it support already PBCs?)

efaulhaber commented 2 months ago

We extracted the neighborhood search from TrixiParticles.jl to a new package. The main functions are initialize!, update! (self-explanatory) and foreach_point_neighbor, which is similar to your map_pairwise!. Check out the docs for these functions here: https://trixi-framework.github.io/PointNeighbors.jl/dev/reference/

We also have a PR for a CellListMap.jl backend, which just wraps a CellListMap in our API, so that we can use CellListMap.jl as a backend for TrixiParticles.jl and to easily run benchmarks against your implementation. https://github.com/trixi-framework/PointNeighbors.jl/pull/8

Our implementations only support coordinate-aligned rectangular boxes for PBC, but after the PR above is merged, I'm planning on using your complex PBCs. Then users can use our GPU-compatible NHS for simulations on GPUs and the CellListMap.jl backend for complex PBCs.