Reallocate particle arrays

gandalfcode / gandalf

GANDALF (Graphical Astrophysics code for N-body Dynamics And Lagrangian Fluids)

GNU General Public License v2.0

44 stars 12 forks source link

Reallocate particle arrays #61

Closed giovanni-rosotti closed 7 years ago

giovanni-rosotti commented 8 years ago

Currently we don't have any facility for reallocating the particle/cells arrays when needed; the solution we use is simply crashing. This is particularly relevant for MPI, where you might end up importing a lot of particles/cells and then the simulation crashes. To be fair, we already have the AllocateMemory function; but it can't be used to reallocate as it doesn't preserve the existing data. The obvious solution would be just to copy the old values to the new array. However, it might be useful to use this possibility to think whether there is anything we want to change in the way we store the particles in memory (e.g. c array vs vector, the orrible pointer casting, ...). Btw, is iorder still used? As far as I can see the only place it is used is in ReorderParticles, but I can't find any call to that function

rbooth200 commented 8 years ago

I had a bit of a look into memory management. My opinion is that vectors are probably the best solution for memory management for large arrays.

Vectors are also the best solution for the neighbour lists since we need fast access and occasional resizing. They also have the advantage of carrying around their size with them. I'm pretty convinced we should get a net speed up in the neighbour search by switching to them.

Small arrays (< 1 Kb) might be best allocated on the stack though.

dhubber commented 8 years ago

I think iorder is still used whenever particles are removed (e.g. sink accretion), although it is not the most efficient use of it. The plan was to use iorder when reordering the particles during tree construction (to obtain a more cache efficient particle ordering). However, this is not done currently because I think most of the benefit is obtained in creating the local neighbour lists.

rbooth200 commented 8 years ago

I agree that most of the benefit is in creating the local neighbours, but since the tree walks are now the most expensive part of the code any optimization in these routines is likely worth it. The other argument is that domain decomposition breaks the strict iorder-ing. Since we need to support random particle ordering anyway I think it's worth leveraging any benefit we can get out of the sorting. Reducing the tree node / particle size will also.

Sorting the particles will make the tree construction more expensive, but once the tree is partially sorted the cost is likely to be negligible on future rebuilds.

I will look into the cache efficiency aspects carefully over the next couple of days.

dhubber commented 8 years ago

I wanted to talk about arrays and memory management when I was in Cambridge but we did not have time. I was also pondering about using vectors (or other STL containers) but also I was experimenting with my own Array class (one for heap allocation and one for stack memory). The reason for 'reinventing the wheel' was because both C++ and STL vectors do not have any bounds checking when accessing elements with the [] operator, at least not in the standard (and don't seem to have any way of activating this for debugging purposes). Since this is a probable source of errors, it would be good to change to something that can also help with this problem. If however, more recent versions of the STL in C++11/C++14 have bounds checking, I will be very happy to use them everywhere!!

rbooth200 commented 8 years ago

I would not recommend building your own container for the sake of it. Especially when it comes to vector since the one of the big advantages of vector is the fact that it can grow as/when necessary and this reallocation is not easy to do in an exception safe way. If bounds checking is what you want, then there is a compile time flag in gcc that does that for you, see here [1]. Beware that it is very slow and should not be used in production runs.

The best solution for bounds checking is a combination of debug mode (at high debug levels) and using bounds-safe c++ idioms, such as using vector.size(), iterators and range-based for (c++11/14).

[1] https://gcc.gnu.org/onlinedocs/libstdc++/manual/debug_mode_using.html#debug_mode.using.mode

giovanni-rosotti commented 7 years ago

See #78. Richard and I have considered moving to vectors, but this would most likely break the code of people that have done modifications in their local versions to GANDALF. Therefore it would need to be coordinated with them.

giovanni-rosotti commented 7 years ago

Closing as this is now solved in #78