tilezen / mapbox-vector-tile

Python package for encoding & decoding Mapbox Vector Tiles
MIT License

Enable Shapely speedups when available #91

Closed jalessio closed 7 years ago

jalessio commented 7 years ago

I stumbled across the fact that Shapely has the option to enable "speedups"!

http://toblerity.org/shapely/manual.html#performance

The shapely.speedups module contains performance enhancements written in C. They are automatically installed when Python has access to a compiler and GEOS development headers during installation.

I added a check to run speedups.enable() when it's available. It's worth noting that Shapely appears to enable this by default in versions > 1.6. I don't see any reason this change is incompatible with the new default setting, though, and it will help everyone who can't/won't/hasn't upgraded Shapely in a while.
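A minimal sketch of that guard (illustrative, not the PR's actual code; the try/except is an added assumption to also tolerate newer Shapely releases where the module is deprecated or absent because the speedups are always on):

```python
# Illustrative sketch: enable Shapely's C speedups when the module
# and flag are present; fall back silently otherwise.
try:
    from shapely import speedups

    if speedups.available:
        speedups.enable()
    enabled = speedups.available
except (ImportError, AttributeError):
    # Very old Shapely without the module, or a newer release where
    # the C implementations are always active and the shim is gone.
    enabled = False
```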

In my benchmarking I consistently see about 3 seconds shaved off the benchmark script in this repo (bench/bench_encode.py). I observe the same improvement with both the Python protobuf implementation and the C++ protobuf implementation. On my hardware that means an 18 second test becomes a 15 second test (Python protobufs) and a 12 second test becomes a 9 second test (C++ protobufs). So, that's roughly a 17% and 25% relative improvement, respectively.
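The relative figures follow directly from the timings; a quick worked check (helper name is mine, not from the repo):

```python
# Fractional time saved relative to the slower run.
def rel_improvement(before_s, after_s):
    return (before_s - after_s) / before_s

python_pb = rel_improvement(18, 15)  # Python protobufs: 18 s -> 15 s
cpp_pb = rel_improvement(12, 9)      # C++ protobufs:    12 s ->  9 s
```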

I created a gist for testing this using Docker and included a few results: https://gist.github.com/jalessio/7f7a0d73347e51effb11deda3a034f90

coveralls commented 7 years ago

Coverage Status

Coverage increased (+0.01%) to 97.17% when pulling b3391187cb87ae33d4b8dd6e55f5edfdb695ea53 on CalthorpeAnalytics:enable-shapely-speedups into ee6e64c4917c1c1c476f69deb28206b1c44192a7 on tilezen:master.

nvkelso commented 7 years ago

Nice! Thanks for including the GIST for testing.

jalessio commented 7 years ago

I’m noticing now that in my benchmarks, the faster branch runs fewer "function calls" (961,132 fewer!). Now that’s either:

  1. the actual result of the optimizations (i.e. where the performance improvements are coming from)
  2. a benchmarking artifact

I'm thinking/hoping it's the first option. I'm open to ideas on how to prove/disprove those theories.

master: 14925438 function calls (14876891 primitive calls) in 16.969 seconds
PR branch: 13964306 function calls (13915759 primitive calls) in 14.102 seconds

14,925,438 - 13,964,306 = 961,132
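Those call counts come from Python's built-in cProfile, which is also one way to dig into where the difference goes. A self-contained sketch of the pattern (the `work` function is a stand-in, not the benchmark script):

```python
import cProfile
import io
import pstats

def work():
    # Stand-in workload; in the real comparison this would be
    # bench/bench_encode.py's encode loop.
    return sum(i * i for i in range(100_000))

pr = cProfile.Profile()
pr.enable()
work()
pr.disable()

s = io.StringIO()
pstats.Stats(pr, stream=s).sort_stats("cumulative").print_stats(5)
report = s.getvalue()
# The report header is the "N function calls (M primitive calls)
# in T seconds" line being compared above; the per-function rows
# below it show which calls disappear between the two branches.
```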

iandees commented 7 years ago

I'm willing to bet that the drop comes from fewer Python-level function calls. Those calls are probably made in C code now 😄 . Either way, that's a good thing.

iandees commented 7 years ago

Thanks @jalessio!