Closed lgarrison closed 3 years ago
Agreed - we already have the necessary bits for a counting sort. I found this medium article to be very helpful with understanding counting sort.
We already have the count-array -- the lattice structure itself. We would have to create a temporary copy to perform the sort, but then we can free that after the sorting into cell-order is complete.
The counting sort, while fast, requires a (temporary) copy of the entire memory area being sorted. May be we should stick with a different sort.
Superseded by #240.
The new option to set
copy_particles=False
in v2.3 is slow in gridlink! There's a single-threaded sort of particles into cell order. For a 1 billion particle dataset, this takes 760 seconds instead of 50 seconds ifcopy_particles=True
.I think we can do better than the sort: we've already constructed the cell indices of the particles; we just need to place them there. The out-of-place algorithm is straightforward (it's what we're already doing when
copy_particles=True
); the in-place algorithm needs some more thought but should be possible.