Open nmandery opened 5 years ago
Looks like a 2-3x speedup. Very cool. :)
@joegilley it looks like you have a target. ;)
@nmandery we have known that the current polyfill algorithm is suboptimal -- it was just the easiest to prove correct under all circumstances.
There was a much better polyfill algorithm implemented in a prototype a few years ago (by @joegilley :) ), and with @nrabinowitz having implemented a line algorithm a month or so ago, it should be possible to implement the better polyfill algorithm for most situations on top of it.
I will leave it up to @isaacbrodsky if H3 will have an optional dependency on OpenMP. Likely the current algorithm will remain a fallback so a speedup could be nice to keep around, but this breaks the current assumption that the library will only burn a single core, which can affect capacity planning.
Also, since polyfill was designed to make sure contiguous polygons do not share any H3 indexes, it is possible for the performance-minded to "carve up" the polygon into sub-polygons and execute polyfill on each of them in their threading library of choice and see a similar speedup, though that is not nearly as turnkey as this.
There was a much better polyfill algorithm implemented in a prototype a few years ago (by @joegilley :) ), and with @nrabinowitz having implemented a line algorithm a month or so ago, it should be possible to implement the better polyfill algorithm for most situations on top of it.
It's worth noting though:
So this is probably still valuable, and I really like how pluggable it is. But I definitely think this needs to be a build-time option (not just dependent on environment, but on an explicit flag), and I'm fairly certain it won't work at all with the Emscripten-compiled code.
There was a much better polyfill algorithm implemented in a prototype a few years ago (by @joegilley :) ), and with @nrabinowitz having implemented a line algorithm a month or so ago, it should be possible to implement the better polyfill algorithm for most situations on top of it.
It's worth noting though:
* It's not yet clear whether the new algo will work effectively outside a single base cell * It definitely won't work across > 2 base cells * And we still need the point-in-poly check regardless, because the old algo was only convex hull.
So this is probably still valuable, and I really like how pluggable it is. But I definitely think this needs to be a build-time option (not just dependent on environment, but on an explicit flag), and I'm fairly certain it won't work at all with the Emscripten-compiled code.
...Likely the current algorithm will remain a fallback...
I know @nrabinowitz :) I also think your proposal that it's a compile-time flag to include it or not is probably the right way to go, but it does present issues for the Java, Go, and Python bindings:
n - 1
of their cores?When we go from 1 core basic C functions to a multicore approach, lots of these thorny questions pop up that very much depend on the use-case and resource constraints of the user.
@dfellis OpenMP allows setting the number of threads C functions like omp_set_num_threads or even using the environment variable OPENMP_NUM_THREADS
. There are also a few more variables to control its behavior. This whole thing is a bit special and should definitely be a compile time option.
I also do not know if it really makes sense to include into h3 itself, it is just a patch I am using which helps me for my niche usecase where I need polyfill a lot.
@nmandery your use-case is totally valid and we should try to support it! I'm just trying to bring up as many issues and alternatives as possible so the final result covers as many use-cases as possible while ideally also causing us the least maintenance headaches. :)
Hi @nmandery, thanks for posting this! Great to see the speed up.
I don't think we want to merge this into uber/h3 right now, under the assumption we want to keep the C sources as dependency-less, (although this is pretty self contained within the library.) and because of the binding issue (such as with Emscripten) mentioned above.
@isaacbrodsky That is perfectly understandable and makes sense to keep number of dependencies the library as low as possible. My intention with this issue was just sharing my findings.
For my part I am fine with compiling H3 with this patch when I need it.
@isaacbrodsky does that mean the focus will be on implementing the improved polyfill algo to get a similar speedup in a single core?
I would caution that parallelization may be unwanted in some applications, and can interfere with the performance of these applications. If parallelization is introduced to some of these functions, it really must be optional.
I do not know if this an issue there is any interest in, but I noticed the library can potentially benefit from parallelizing parts of computing-intensive functions.
I experimented a bit and used OpenMP to simply parallelize the check if a hexagon is contained in a polygon in the
polyfill
function: https://github.com/nmandery/h3/commit/65afcb16579e833083feba713a89d7423611212aResults without openmp:
Results with openmp:
These are the specs of the machine I used:
I guess the usage of OpenMP should be optional, as I do not know how this would affect the emscripten based Javascript port of h3.