Efficient encoding of points

mourner commented 9 years ago

Encoding a huge number of points in VT is currently pretty inefficient in terms of output size. This is especially a big problem on the highest zoom levels where POIs and address points are encoded. Let's discuss what we could do about that in a future major spec revision to make encoding optimal.

This is how points are currently encoded in a layer:

{
  "id": 3359,
  "tags": [0, 561, 1, 123, 2, 456],
  "type": 1,
  "geometry": [9, 1050, 8692]
},
{
  "id": 3360,
  "tags": [0, 548, 1, 234, 2, 457],
  "type": 1,
  "geometry": [9, 1071, 8694]
},
{
  "id": 3361,
  "tags": [0, 549, 2, 458],
  "type": 1,
  "geometry": [9, 1072, 8693]
},
...

Problems with the current approach:

each point is a separate message, so each needs at least two bytes for a tag + message length
tag for each field (id, tags, type, geometry) is repeated over and over
additionally, tags and geometry are packed fields so need 2 bytes for lengths
the same type is repeated everywhere
each geometry has a superfluous moveTo command
the same key indexes are typically repeated over and over
geometries are not delta-encoded

Some of the repetition gets compressed away by gzip, but there's still a lot of data we could eliminate with a smarter encoding approach. Here a rough idea how the sample above could be encoded optimally:

{
  "id": [3359, 3360, 3361, ...],
  "tags": [561, 123, 456, 548, 234, 457, 549, 0, 458, ...],
  "geometry": [3336, 8692, 21, 2, 1, 1, ...]
}

a set of points is one big message with a few packed fields
ids are packed together
tags are packed together and only have value indexes; each layer key has a corresponding value encoded, so if a particular key is not present in a feature, we encode 0 (this won't occur too often so the gain from dropping keys should far outweigh the need to include such zeroes)
point geometries are packed together and delta-encoded (which should be a big size win); for efficiency of delta-encoding, we sort points before encoding (e.g. in a z-curve order, by a morton hash)

To understand how much of a win this will be in terms of compression, I'll make a quick proof of concept encoding on sample z15 point data and post back here.

cc @yhahn @springmeyer @flippmoke @kkaefer @tmcw @sgillies @ericfischer

mourner commented 9 years ago

I made a proof of concept encoding to compare the size wins. A sample z15 tile containing only poi_label and housenumber_label layers weights 44.36K gzipped with the current approach, and 31.88K gzipped with the proposed one. This is a 28% cut, which is pretty significant considering it's gzipped. Non-gzipped sizes are 104.87K and 59.94K respectively.

mourner commented 9 years ago

I got the previous result with z-curve ordering. If I sort by id instead, and delta-encode ids, this brings the sample down to 26.04K. This is a 41% cut.

mourner commented 9 years ago

Note that the proposal is totally bluesky, just to start a discussion, it's not something we necessarily have to do in the next spec version.

Here's the code for proof of concept conversion: https://gist.github.com/mourner/a7c6ea76f450bb92e316. I'll have to repeat the tests after fixing big OSM ids which contribute too much to the size — this will produce more fair results.

javisantana commented 7 years ago

This approach has also some advantages (apart from the size reduction). When working with WebGL for example you work with array buffers so if the data is stored in the right way it'd be possible to just memory-map the XMLHttpRequest.request response to a typed array in javascript (the data should be padded to sizeof(typedarray type))

That would require changing the varlen encoding for a fixed byte encoding but I think is much better to allow the compressor do its jobs rather than being smart moving bits here and there and would be better than having to decode byte per byte. Not sure if protocol buffers support that kind of thing.

I understood from a conversation with @springmeyer that the way MVT was though was because you wanted to have compressed features and decompress them on demand.

andrewharvey commented 5 years ago

I thought I'd share a benchmark I while comparing against a point could encoded in draco vs mvt.

The mvt was 171K uncompressed, 68K with gzip and 45K with brotli. The draco encoded data was 14K. I didn't test out any of the ideas in this thread, but my test showed draco to be a 79% cut over gz mvt.

Results would vary a lot based on the type of point cloud data being encoded, but still, it would be interesting to explore what techniques draco uses and if they could be applied to further improve encoding points in mvts.

UPDATE: I think draco might be doing more lossy compression, although I can't tell by looking at the outputs by eye, I'll need to do further benchmarking to ensure I'm comparing for equivalent detail.

mapbox / vector-tile-spec

Efficient encoding of points #35