tilezen / mapbox-vector-tile

Python package for encoding & decoding Mapbox Vector Tiles
MIT License
245 stars 47 forks source link

Improve performance: Python and C++ #72

Open lexman opened 8 years ago

lexman commented 8 years ago

Hello,

in order to increase encoding performance, we'll to convert some part of the code to C/C++ in submodule.

We've already forked the project for working freely, but our purpose is to submit a pull request when we're done. That's why we want to share the design to make things right the first time.

According to our tests (in python 2), 40% of encoding occurs in the _geo_encode() function, and that's the first one we'll translate. We'll try to produce a drop-in replacement for this function. This will allow to keep the python implementation for plateforms where C++ extensions can't be installed. We also want be compatible with both Python 2 and Python 3.

What do you think ? Do you have any suggestion or requirement ?

Alexandre

flippmoke commented 8 years ago

@lexman I am putting the final touches on https://github.com/mapbox/wagyu right now - which will be responsible for making all our geometry valid and supported for v1/v2. This is exciting because it is a header only library -- rather then having reliance on other large libraries. Which is really exciting because it will unlock me to start working on adding an encoder to https://github.com/mapbox/vector-tile.

I would highly recommend the python implementation to use these C++ libraries once they are available. This will help resolve #42 for this library as well.

rmarianski commented 8 years ago

@lexman thanks for offering to do this work, and for letting us know in advance! We would appreciate a pr here, and here are my immediate thoughts:

@flippmoke that looks interesting, and longer term I agree that it would be much better to converge on a shared implementation for this logic. @lexman for an initial pr, I'd rather that the scope be as limited as possible, and down the road we can consider replacing the validation logic that we currently have in place with wagyu.

lexman commented 8 years ago

Hello,

@rmarianski I perfectly understand your concerns about maintaining native code, and that's why I've made contact in the first place. If we go down that path, be sure we'll be willing to maintain the code for the 13 milion users of our french website http://mappy.com .

But for the moment I'm submiting a PR only for improving the python version of the code : simplier code will be easyer to translate. We'll see how it goes from there.

@flippmoke thanks for the tip ! We'll sure have a deep look on this promissing library...

jalessio commented 7 years ago

@lexman what's the status of your performance improvements for this project? I looked around Mappy/mapbox-vector-tile but couldn't quite work out where things landed. Were you able to speed up any of the slower Python parts with your C++ code? If so, which branch should we be looking at? Any instructions you can provide? Thanks!

jalessio commented 7 years ago

@nvkelso I was asking around at the Mapzen booth at SOTMUS and it was suggested that I ping you about this issue. I'm trying to sort out if anyone has made concrete progress on C-based performance improvements to this library.

I'm having trouble figuring out if anyone is really using the changes proposed by @lexman and the Mappy team. Do you have any info on that or other performance improvements that look promising? Many thanks!

nvkelso commented 7 years ago

@jalessio Hello again :) @rmarianski can speak to this when he's back in the office Friday.

rmarianski commented 7 years ago

I can confirm that we are using the changes from https://github.com/tilezen/mapbox-vector-tile/pull/74 in production, which while had some improvements in performance, IIRC was more focused on preparing for a C based implementation. If there's been further improvements, we'd be happy to accept a pr for it. But at this time, we are using the v1.2.0 tag in production.

flippmoke commented 7 years ago

@rmarianski, at Mapbox are slowly working towards making some more generic pieces for making vector tiles. So that we can have individuals use smaller pieces that do exactly what is needed for encoding/decoding/creating vector tiles. If you would want to re-use any of our code or contribute both would be appreciated! Most of our code is being focused around using a specific internal C++ geometry structure (geometry.hpp). However, we are attempting to make more generic encoding/decoding libraries available as well. https://github.com/mapbox/vtzero

rmarianski commented 7 years ago

@flippmoke thanks for the heads up. Are you planning on making a python wrapper? It would be interesting to look into what it would take to use vtzero as the implementation.

flippmoke commented 7 years ago

@rmarianski no we are not planning on making a python wrapper for it, but it should be really fast if it was wrapped in a python.

lexman commented 6 years ago

Hello @jalessio, sorry for this late answer but I've left the tiling team for a year now. At the time, we didn't go to the C++ implementation as advertised because the 25% performance we gained was enough at the time. This performance improvement is included version 1.1 and 1.2 that have been used in production at https://mappy.com/ for a year now and it works well.

If you need more speed and load improvements and if you are using a postgis database, I would strongly suggest to take a look at new Postgis function ST_AsMVT ( http://postgis.net/docs//ST_AsMVT.html ) that have been available since september (version 2.4). It directly encodes the vector tile in the database with a C library. Very efficent.

So it's clear now that we won't make the C++ implementation. Should I close the issue ?

jalessio commented 6 years ago

@lexman thanks for the update!

I would like to use ST_AsMVT but we're using Postgres/Postgis on AWS RDS and as of now they don't have protobuf support on their instances :(