koopjs / koop-output-vector-tiles

Apache License 2.0
17 stars 9 forks source link

update geojson-vt, vt-pbf, fix null tiles, and add zlib #4

Closed reyemtm closed 5 years ago

reyemtm commented 5 years ago

creating a tiles array speeds up the returning the tiles, not sure if this is the best method but it works

reyemtm commented 5 years ago

I like all these ideas. Not sure how to implement some of them. I think I'll break this up into manageable requests.

In terms of the tile cache, timeout, caching with nginx etc., the time it takes to index and return tiles on the first run is so slow I would never want to run this without the cache built in. I was even thinking if there was a way to create a tile index on all the files at startup. If we can hook the tile cache up to the built in koop cache I guess that's the best approach, but would need help on that one.

On Thu, May 23, 2019, 10:17 AM Rich Gwozdz notifications@github.com wrote:

@rgwozdz requested changes on this pull request.

I like the idea of a tile cache, but there are somethings to work out:

  1. We should be able to switch caching on/off with a config value. The reasoning here is that some people will do tile caching another way (like with Nginx) and won't want this imposed.
  2. If the underlying data changes, you'll still be getting old tiles from the cache until Koop is restarted. You need an expiration on the cached tile sets.
  3. Is there a way we can leverage the Koop-cache for this? If we could use the Koop-Cache API then you would be able to leverage it's expiration handling, and also have the option of doing this in-memory or with Redis (helpful if you have load balancing going on).

In index.js https://github.com/koopjs/koop-output-vector-tiles/pull/4#discussion_r286692309 :

const log = new Logger()

+const tiles = new Array();

Any special reason why you use an Array here? Why not a plain object or a Map since you are accessing the cached tile sets with ids?

In index.js https://github.com/koopjs/koop-output-vector-tiles/pull/4#discussion_r286693226 :

 const start = Date.now()
  • const tile = geojsonVT(data).getTile(z, x, y)
  • const pbf = vtpbf.fromGeojsonVt({layer: tile})
  • if (!tiles[id]) {
  • const indexStart = Date.now();
  • log.debug(${id} waiting for geojsonvt to index tiles)
  • tiles[id] = geojsonVT(data, {
  • maxZoom: 17,

Store these options in the koop config, so that the can vary across instances. Some people may want different maxZoom, etc.

In index.js https://github.com/koopjs/koop-output-vector-tiles/pull/4#discussion_r286693480 :

 const start = Date.now()
  • const tile = geojsonVT(data).getTile(z, x, y)
  • const pbf = vtpbf.fromGeojsonVt({layer: tile})
  • if (!tiles[id]) {
  • const indexStart = Date.now();
  • log.debug(${id} waiting for geojsonvt to index tiles)
  • tiles[id] = geojsonVT(data, {
  • maxZoom: 17,
  • indexMaxZoom: 5,
  • buffer: 2048,
  • tolerance: 2
  • })
  • const indexEnd = (Date.now() - indexStart) / 1000;
  • log.debug(${id} tile indexing complete in ${indexEnd} milliseconds)

Nice debugging logs.

In index.js https://github.com/koopjs/koop-output-vector-tiles/pull/4#discussion_r286695182 :

 const duration = (Date.now() - start) / 1000
  • log.debug(output=vector-tiles tile=${z},${x},${y} features=${data.features.length} duration=${duration})
  • res.set({'Content-Type': 'application/x-protobuf'})
  • res.send(pbf)
  • log.debug(${id},output=vector-tiles tile=${z},${x},${y} features=${data.features.length} duration=${duration})
  • res.set({

Does changing the content type result in any compatibility issue with typical clients? JS API 4, leatlet, etc?

In index.js https://github.com/koopjs/koop-output-vector-tiles/pull/4#discussion_r286697017 :

const log = new Logger()

+const tiles = new Array();

Can you change the const name from tiles to tileSetCache? Sorry to nitpick, but I think it would increase readability.

In index.js https://github.com/koopjs/koop-output-vector-tiles/pull/4#discussion_r286951839 :

 const duration = (Date.now() - start) / 1000
  • log.debug(output=vector-tiles tile=${z},${x},${y} features=${data.features.length} duration=${duration})
  • res.set({'Content-Type': 'application/x-protobuf'})
  • res.send(pbf)
  • log.debug(${id},output=vector-tiles tile=${z},${x},${y} features=${data.features.length} duration=${duration})
  • res.set({
  • 'Cache-Control': 'public, max-age:3600',
  • 'Content-Encoding': 'gzip'
  • })
  • //zlib.gzip enables a 4x decrease in data size sent with no noticeable performance decrease
  • zlib.gzip(pbf, function(_, result) {

I can see why this is necessary, but I think we should go about it a bit differently since deployments vary in how compression is applied. For example, some prefer to keep compression out of Node.js and do it with Nginx. As an alternative, I would add compression to Express by default in koop-core (with a config options to disable it). If we do that, this won't be necessary, right?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/koopjs/koop-output-vector-tiles/pull/4?email_source=notifications&email_token=ABQ2HUJW6K27LX6BMKJDO6DPW2RRBA5CNFSM4HOXXPVKYY3PNVWWK3TUL52HS4DFWFIHK3DMKJSXC5LFON2FEZLWNFSXPKTDN5WW2ZLOORPWSZGOBZN2CUI#pullrequestreview-240886097, or mute the thread https://github.com/notifications/unsubscribe-auth/ABQ2HUOQVRC6C3FGCCATCVLPW2RRBANCNFSM4HOXXPVA .

rgwozdz commented 5 years ago

Can you describe your use case? If your underlying dataset is large and doesn't change often, I wonder if it make more sense to pre-cook your tiles with something like tippecanoe and serve them statically?

reyemtm commented 5 years ago

Serving statically is one idea, and am already doing that to done degree, but I want to get away from multiple silos of data and having to keep track of updates, where one dataset is stored, etc. I already created a vt server which is where this code is from, but in that case I created the index at startup. Regardless I'm hoping for one solution that can meet most of my needs and this server is very close. I work for a small city and push parcel and utility data mostly.

On Thu, May 23, 2019, 8:16 PM Rich Gwozdz notifications@github.com wrote:

Can you describe your use case? If your underlying dataset is large and doesn't change often, I wonder if it make more sense to pre-cook your tiles with something like tippecanoe https://github.com/mapbox/tippecanoe and serve them statically?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/koopjs/koop-output-vector-tiles/pull/4?email_source=notifications&email_token=ABQ2HUMEME3MIOQTVJTHTQLPW4XXXA5CNFSM4HOXXPVKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODWD2HLA#issuecomment-495428524, or mute the thread https://github.com/notifications/unsubscribe-auth/ABQ2HUM3TV45O6PFDN475RTPW4XXXANCNFSM4HOXXPVA .

rgwozdz commented 5 years ago

Ok, thanks for that. I guess you could build something into your server.js that reads all the data and creates all the tilesets at startup. I'd be a little concerned with how this will scale - will you run into memory limitations? Perhaps it could be avoided if you use Redis to cache your tilesets or tiles.

One reason why this is likely so slow is that the provider is handing the whole dataset off the the vt-plugin, which then creates the whole tileset. But this is all in response to the request for a single tile (x,y,z). So we're creating a lot of tiles that my not be immediately needed. An OSM server, for example, will only cut a single tile for a particular x,y,z request (and then cache it); moreover it only pulls data needed for the area covered by the tile coordinates. It would be kind of nice if we could replicate this pattern. It would mean extending whatever providers you are using to fetch and pass only the data needed for a given tile request.

Is your use of the file-geojson provider just for development/testing? I.e., are you using other providers in production?

reyemtm commented 5 years ago

Ok, I think I fixed most of the issues here. There are other fixes that are needed but this will get the plugin working. Other additions I plan to add are the following -

reyemtm commented 5 years ago

Glad to contribute I think I will work on the metadata tiles json now.

On Wed, Jun 5, 2019, 4:01 PM Rich Gwozdz notifications@github.com wrote:

@rgwozdz approved this pull request.

Thanks for you patience and the contribution.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/koopjs/koop-output-vector-tiles/pull/4?email_source=notifications&email_token=ABQ2HUPLIY3GPOVIGE5Z3XDPZALSHA5CNFSM4HOXXPVKYY3PNVWWK3TUL52HS4DFWFIHK3DMKJSXC5LFON2FEZLWNFSXPKTDN5WW2ZLOORPWSZGOB2WR5YI#pullrequestreview-246226657, or mute the thread https://github.com/notifications/unsubscribe-auth/ABQ2HUN2HTPWI7NGF6YG65LPZALSHANCNFSM4HOXXPVA .

rgwozdz commented 5 years ago

@reyemtm - FYI, finding no difference in size when forcing gzip with the application/json content-type. I guess it may depend on the contents of the dataset itself. I'm going to default to 'application/x-protobuf' but make it configurable.

reyemtm commented 5 years ago

Sounds good. I just checked again and definitely seeing over 50% reduction in tile size, but the data is fairly complex to begin with, also I think the compression doesn't kick in until you reach a certain size limit, not sure what that limit is.

On Fri, Jun 14, 2019 at 2:50 PM Rich Gwozdz notifications@github.com wrote:

@reyemtm https://github.com/reyemtm - FYI, finding no difference in size when forcing gzip with the application/json content-type. I guess it may depend on the contents of the dataset itself. I'm going to default to 'application/x-protobuf' but make it configurable.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/koopjs/koop-output-vector-tiles/pull/4?email_source=notifications&email_token=ABQ2HUJLMIZDJAITWAUXOVTP2PR6NA5CNFSM4HOXXPVKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODXXVDNQ#issuecomment-502223286, or mute the thread https://github.com/notifications/unsubscribe-auth/ABQ2HUPCEIONDPDRCAQDHEDP2PR6NANCNFSM4HOXXPVA .

-- Malcolm T Meyer