johnBuffer / VerletSFML-Multithread

Multithreaded deterministic minimalist Verlet solver
MIT License
422 stars 54 forks source link

Idea for a speedup - are there any issues with this? #3

Open Jack-Rickwood opened 1 year ago

Jack-Rickwood commented 1 year ago

So, this lags pretty badly on my computer when you make the world size much larger than 300x300, no matter how many particles there are. I think this is just because of the sheer number of cells it needs to loop through.

So i'm wondering, is there any reason you loop through every single cell? Seems super unnecessary, why not just loop through all the cells with particles in them? The way I would implement this is by having a vector in the PhysicsSolver class, to store all the cells with particles in them. This vector would get cleared at the beginning of each loop, and then in the method to add the particles to the grid, you could just push the index of the current particle in the grid to the back of this vector.

I guess there may be some issues with threading, but my approach is to 'pad' the vector of active cells until it's a multiple of the thread count.

So, is there any issue with doing this?

Theuntextured commented 1 year ago

I feel like it might even slow it down. Every loop you still need to check which cells are full or empty so you're not saving on any computation.

Jack-Rickwood commented 1 year ago

I mean storing references to the active cells to keep track of them, you don't need to loop through the whole grid then

johnBuffer commented 1 year ago

I tried something like this in the past and it speeds thing quite significantly when the grid occupancy rate is quite low but it quite quickly slows down as the occupancy increases. I currently prefer to optimize for the worst case by just keeping the spatial partitioning quite naive.

Orbital-Web commented 1 year ago

(I'm sorry if you already implemented this!) How about storing the top-left and bottom-right cell whose region includes every object (updating these values whenever a new object is added) and iterating only within that region, rather than the entire simulation space.

That way if all the objects are clustered at the bottom half of the screen, it will only iterate through the cells at the bottom half of the screen.

It would probably be something like storing 4 variables row0, col0, row1, col1, and every time a new object is added to (row, col), you'd update the variables as row0 = min(row, row0), row1 = max(row, row1), etc. You'd then probably clamp these values to exclude the borders and finally loop through like

for (int i=row0; i<=row1; i++)       // has to be <= in case row0 == row1, e,g. with 1 particle
    for (int j=col0; j<=col1; j++)

I'm not sure how much of a difference this makes but I'd assume it would help speed things with low occupancy and would not create much of an overhead even with a high occupancy.