mapbox / mapnik-omnivore

Node module that returns metadata about spatial files.

Deeper max zoom for small data sources #151

Open rclark opened 7 years ago

rclark commented 7 years ago

In https://github.com/mapbox/mapnik-omnivore/pull/141 we enforced that a tileset always gets built down to at least z6. Still, we have problems with tilesets being generated from very small data sources.

A data source composed of a single point, for example, will get tiled down to z6 and simplification at that zoom level may lead to it appearing in completely the wrong place when viewed at z14.

As a solution, what if we allowed the dataset to continue being tiled at higher and higher max zoom levels as long as the total number of tiles at any zoom level does not exceed some threshold? This would solve the single-point problem, as well as scenarios where a few points are well-clustered. Small datasets spread out across the globe would still be problematic, since we'd probably have to calculate the number of tiles from the BBOX of the data source.
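A minimal sketch of the threshold idea above, assuming standard Web Mercator (XYZ) tile indexing. The function names, the `max_tiles` threshold of 4, and the z22 ceiling are illustrative assumptions, not code from mapnik-omnivore; the z6 floor is the one mentioned above.

```python
import math

def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) index of the Web Mercator tile containing a point."""
    n = 2 ** zoom
    lat = max(min(lat, 85.0511), -85.0511)  # clamp to the Web Mercator limit
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so that lon = 180 or the latitude limits stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)

def tiles_in_bbox(bbox, zoom):
    """Count the tiles that cover bbox = (west, south, east, north) at a zoom."""
    west, south, east, north = bbox
    x_min, y_min = lonlat_to_tile(west, north, zoom)  # top-left tile
    x_max, y_max = lonlat_to_tile(east, south, zoom)  # bottom-right tile
    return (x_max - x_min + 1) * (y_max - y_min + 1)

def max_zoom_for_bbox(bbox, max_tiles=4, floor=6, ceiling=22):
    """Go deeper while the next zoom still needs at most max_tiles tiles.

    max_tiles and ceiling are hypothetical values chosen for illustration.
    """
    zoom = floor
    while zoom < ceiling and tiles_in_bbox(bbox, zoom + 1) <= max_tiles:
        zoom += 1
    return zoom

# A single point: its bbox always fits in one tile.
single_point = (-122.4188, 37.7372, -122.4188, 37.7372)
# Two points far apart (the San Francisco / New Zealand pair used in the demo later in this thread).
far_apart = (-122.41882503032684, -46.931000694550974,
             168.1474095582962, 37.737165553897675)

print(max_zoom_for_bbox(single_point))
print(max_zoom_for_bbox(far_apart))
```

With these assumptions the single point reaches the ceiling, while the two widely separated points stay at the z6 floor. That matches the caveat above: a BBOX-based tile count does not help small datasets that are spread across the globe.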

cc @mapsam @GretaCB @colleenmcginnis

mapsam commented 7 years ago

Per discussion with @rclark, some things to explore here:

I'll plan on exploring different ways to maximize our zoom level for these persnickety little files. Some thoughts:

  • Aim for generally tiny, point-based datasets (if we can detect these with mapnik's data density logic, can we just tile them down further?)
  • How can we detect small files? Byte size is a first step, but even few features could have very large attributes & metadata, thus increasing their file size.
  • Explore a way to determine some sort of data density / extent ratio that will allow us to expose these files that have few features covering a large area.
  • This all basically happens in this function: https://github.com/mapbox/mapnik-omnivore/blob/master/lib/utils.js#L6-L48
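One possible reading of the "data density / extent ratio" idea in the list above, as a rough sketch: divide the feature count by the area of the bounding box and flag sources that have few features over a large extent. The function names, the square-degree unit, and both thresholds are assumptions made for illustration; none of this comes from mapnik-omnivore or from mapnik's data density logic.

```python
def features_per_square_degree(feature_count, bbox, min_side=1e-6):
    """Rough density: features divided by bbox area in square degrees.

    min_side (hypothetical) keeps a zero-area bbox, such as a single
    point, from causing a division by zero.
    """
    west, south, east, north = bbox
    width = max(east - west, min_side)
    height = max(north - south, min_side)
    return float(feature_count) / (width * height)

def is_sparse(feature_count, bbox, max_features=100, max_density=1.0):
    """True for sources with few features spread over a large extent.

    Both thresholds are placeholder values, not tuned numbers.
    """
    density = features_per_square_degree(feature_count, bbox)
    return feature_count <= max_features and density < max_density

# Two points on opposite sides of the Pacific: few features, huge extent.
far_apart = (-122.41882503032684, -46.931000694550974,
             168.1474095582962, 37.737165553897675)
print(is_sparse(2, far_apart))
```

A ratio like this depends on feature count and extent rather than byte size, so large attributes or metadata would not affect it. Square degrees are a crude area measure that distorts toward the poles, so a real heuristic would probably want a projected area or a tile count instead.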

springmeyer commented 7 years ago

> A data source composed of a single point, for example, will get tiled down to z6

I think a single point actually gets tiled down to z22. Two points, a good distance apart, is when the problem hits.

Anyway, huge 👍 to a better solution or heuristic.


Purely as a :trollface: (or a dirty workaround until a fix lands), before signing off here I'll mention one thing: because the current heuristic works on file size and the mapnik JSON parser ignores whitespace, the problem can be mitigated slightly by padding the file with whitespace.

Demo:

echo '{"type": "FeatureCollection","features":[{"type":"Feature","properties":{}, "geometry": {"type":"Point","coordinates": [-122.41882503032684,37.737165553897675] }},{"type":"Feature","properties":{}, "geometry": {"type":"Point","coordinates": [168.1474095582962,-46.931000694550974] }}]}' > feature-collection2.json

^^ that only gets tiled to z6 😿

python -c "for i in range(10000000): print('')" >> feature-collection2.json

^^ padded to be 10MB, then it tiles down to z8, but that is still not at all good enough

python -c "for i in range(100000000): print('')" >> feature-collection2.json

^^ padded to be 100MB, then it tiles down to z10, better but still awful

mapsam commented 7 years ago

> I think a single point actually gets tiled down to z22. Two points, a good distance apart, is when the problem hits.

Truth.

Much hack.

mapsam commented 7 years ago

Noting that after https://github.com/mapbox/mapnik-omnivore/pull/165 we are now tiling small point datasets down to z10: https://github.com/mapbox/mapnik-omnivore/blob/master/lib/utils.js#L67-L86. This is far from complete, but the dataTypeMaxZoom method allows us to determine deeper max zooms for particular datasets.