Open mts42 opened 6 years ago
Hey, thanks for pointing this out! Certainly something that I'd like to fix (see also #8).
I've been writing a little benchmark suite to aid in understanding these kind of performance issues, but it's not quite done yet, but I agree that your initial findings seem to indicate incorrect scaling.
To be honest, I haven't even looked at the temporal performance section of the paper, as I was more interested in getting the result out first, but this will be a decent performance optimisation to tackle now that things are functioning.
So I took a closer look and the culprit is likely to be: https://github.com/Libbum/spherical-cow/blob/b10190dc285bab95abd5edd4bb9891a0a695dce1/src/lib.rs#L260-L268
Since `spheres' grows with each iteration and there are number of spheres n iterations, we get O(n^2) complexity. To get the O(1) time complexity for getting the nearest neighbours you need an O(1) access to the nearest neighbours, e.g. by overlaying a grid of domains in which you compare against a fixed number domains which contain spheres. However, that approach can be rather problematic for arbitrary shapes (e.g. a non-spherical cow) since most grid space would be empty. A k-d-tree or ball tree would be the general solution, but then on average you get O(log n) access complexity, resulting O(n*log n) total complexity. Still a pretty good step up, so https://github.com/Libbum/spherical-cow/issues/12 is pretty relevant.
It should be possible to put the nearest-neighbours-logic in Container, enabling Cuboid packers like me to pack in O(n) time. Once I get around to further coding, you can expect a PR for the Cuboid case.
Agreed on all counts, and I can confirm O(n^2) on my side - b10190dc adds a benchmark (that can be run via cargo bench
) using a spherical container which expands over [2,4] radii that produces the dependence:
Since we use a regular grid to find spheres’s neighbors
Is listed in the temporal performance section of the paper, but they never discuss how they implement this - I'll dig a bit more in that direction.
Great, sounds good!
Ah. So I was skimming some of the author's other papers. Seems that they're using the D-cell method outlined in Han et al., which I was investigating in the first place (Libbum/space-habitats#1). I have bits of an implementation lying around, so I can take a look at how that works out.
It would be a general solution as far as I can tell, so may not need to do anything else. Happy to take a look at your PR of course, but also if you want to hold off on that for a few days until I've got this done, that works too.
(Sorry for the delay. I'm moving cities & don't have my dev machine with me yet. Will finish this up next week)
Hi,
I measured the time taken for some domain sizes and thereby different number of spheres. But what I'm seeing is a O(n^2) behaviour instead of the O(n) behaviour the paper shows. The approach in the paper looks fine from a rough overview, so I guess that there's some kind of compare-against-all-spheres in the main loop which would give O(n^2) behaviour. I'll take a closer look if you can confirm the O(n^2) behaviour on your end.
Adjusted example + measurements when running cargo run --release --example show_in_cuboid https://gist.github.com/mts42/0cfcd31036ae77f154575c566fcfae6a