systemed / tilemaker

Make OpenStreetMap vector tiles without the stack
https://tilemaker.org/

pmtiles root/leaf index size issue ? #653

Open cquest opened 7 months ago

cquest commented 7 months ago

I've compared a full planet pmtiles file generated by pmtiles and one generated directly by tilemaker.

The nginx log shows regular multi-megabyte range requests on the tilemaker version, and nothing like that on the pmtiles version.

I suspect the tile index is not hierarchical enough.

You can query both files here if you want to have a look :

systemed commented 7 months ago

I think this is probably because tilemaker doesn't generate clustered pmtiles archives:

Setting the clustered property of a PMTiles archive means that the ordering of the tile data on disk matches the directories

Because tilemaker's tile generation is multi-threaded, and some tiles may take much longer to generate than others (due to complex geometries), we can't guarantee that tiles will be output in any particular order. Therefore we don't get the efficiency gains that a clustered archive would give.
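To make the clustered property concrete, here is a minimal Python sketch of the check it implies. The `Entry` tuple and `is_clustered` function are hypothetical names for illustration, not part of tilemaker or any PMTiles library, and the sketch ignores tile deduplication, which real archives also have to account for.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    tile_id: int  # position of the tile in the directory ordering
    offset: int   # byte offset of the tile data on disk
    length: int   # byte length of the tile data


def is_clustered(entries):
    """True if data offsets never decrease when entries are read in tile-ID order.

    Simplification: deduplicated tiles (entries pointing back at earlier data)
    are not handled here.
    """
    ordered = sorted(entries, key=lambda e: e.tile_id)
    return all(a.offset <= b.offset for a, b in zip(ordered, ordered[1:]))


# Tiles written in directory order: data layout matches the directory.
in_order = [Entry(0, 0, 100), Entry(1, 100, 50), Entry(2, 150, 80)]

# Tiles written as threads happened to finish: tile 0 was slow and landed last.
as_finished = [Entry(0, 130, 100), Entry(1, 0, 50), Entry(2, 50, 80)]

print(is_clustered(in_order), is_clustered(as_finished))
```

The second list is the situation described above: every tile is present, but the slow tile was written last, so the on-disk order no longer matches the directory.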

For a clustered archive, you'd need to create an .mbtiles with tilemaker, then use go-pmtiles to turn that into a clustered .pmtiles.

There's some discussion of this in the original PMTiles PR, #620, in particular:

Threading means tilemaker's tile output order isn't sequential, so we can't set .clustered

The only consequences here should be that the directories take up more bytes, and that you can't use pmtiles extract on the output. For cloud storage there isn't a huge locality advantage in accessing nearby parts of the same file.
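A rough Python sketch of why the directories grow. It assumes, as I understand the PMTiles v3 directory encoding, that an entry's offset is stored as a varint `0` when the tile data starts exactly where the previous entry ended, and as `offset + 1` otherwise. The function names and the tile sizes are made up for illustration, and only the offset column is modelled, before compression.

```python
def varint_len(n):
    """Number of bytes needed to encode n as an unsigned LEB128 varint."""
    size = 1
    while n >= 0x80:
        n >>= 7
        size += 1
    return size


def offset_column_bytes(entries):
    """Bytes spent on offsets for (offset, length) pairs listed in tile-ID order.

    Assumed encoding: 0 if the entry is contiguous with the previous one,
    otherwise offset + 1.
    """
    total = 0
    prev_end = None
    for offset, length in entries:
        if prev_end is not None and offset == prev_end:
            total += varint_len(0)
        else:
            total += varint_len(offset + 1)
        prev_end = offset + length
    return total


TILE_SIZE = 20_000  # hypothetical uniform tile size in bytes
COUNT = 1_000       # hypothetical number of tiles in one directory

# Clustered: tile i sits right after tile i - 1.
clustered = [(i * TILE_SIZE, TILE_SIZE) for i in range(COUNT)]

# Unclustered: same tiles, but placed in a scrambled (deterministic) order.
scattered = [((i * 919 % COUNT) * TILE_SIZE, TILE_SIZE) for i in range(COUNT)]

clustered_bytes = offset_column_bytes(clustered)
scattered_bytes = offset_column_bytes(scattered)
print(clustered_bytes, scattered_bytes)
```

Under these assumptions the clustered layout costs one byte per entry, while the scattered one needs several bytes for almost every entry, and the scrambled values are also likely to compress less well.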

cquest commented 7 months ago

Thanks Richard, I'll go that way, mbtiles + pmtiles conversion.

One way to deal with unordered generation is to add an intermediate queue. Worker threads fill the queue, and another thread takes whatever is available in the correct order and writes it to the final pmtiles file. (easy to say, more work to implement it)
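A minimal Python sketch of that idea, just to show the shape of it. tilemaker itself is C++ and has nothing like this; `ReorderBuffer`, `render_tile` and the sequence numbers are all hypothetical. For brevity, whichever worker delivers the next expected tile drains the buffer itself, rather than handing off to a dedicated writer thread.

```python
import heapq
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class ReorderBuffer:
    """Accepts (sequence, payload) pairs in any order, emits payloads in sequence order."""

    def __init__(self, sink):
        self._sink = sink    # called with each payload, strictly in order
        self._heap = []      # finished tiles that are still waiting their turn
        self._next = 0       # sequence number the output is waiting for
        self._lock = threading.Lock()

    def put(self, seq, payload):
        with self._lock:
            heapq.heappush(self._heap, (seq, payload))
            # Flush every tile that is now contiguous with what was written.
            while self._heap and self._heap[0][0] == self._next:
                _, ready = heapq.heappop(self._heap)
                self._sink(ready)
                self._next += 1


def render_tile(seq):
    # Stand-in for real tile generation: earlier tiles take longer here,
    # so workers tend to finish out of order.
    time.sleep(0.001 * (8 - seq))
    return f"tile-{seq}".encode()


written = []
buffer = ReorderBuffer(written.append)

with ThreadPoolExecutor(max_workers=4) as pool:
    for seq in range(8):
        pool.submit(lambda s=seq: buffer.put(s, render_tile(s)))

print(written)
```

Whatever order the workers finish in, `written` ends up in sequence order. The cost is that finished tiles sit in memory until every earlier tile has been delivered.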

systemed commented 7 months ago

(easy to say, more work to implement it)

😁 Yes, you're right. I'm slightly anxious about a queue getting blocked on a tile with a really horrible multipolygon geometry (Saimaa or the US National Forests, that sort of thing) but there are possibilities for the future.
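The blocking concern can be illustrated with a small Python simulation. It is a toy model with hypothetical names: it only counts how many finished tiles would be held back while an earlier tile is still being generated.

```python
def peak_buffered(completion_order):
    """Largest number of finished tiles held back, given the order tiles complete in.

    Tiles are identified by sequence number 0, 1, 2, ... and can only be
    written once every lower-numbered tile has been written.
    """
    waiting = set()
    next_seq = 0
    peak = 0
    for seq in completion_order:
        waiting.add(seq)
        # Write out everything that is now contiguous.
        while next_seq in waiting:
            waiting.remove(next_seq)
            next_seq += 1
        peak = max(peak, len(waiting))
    return peak


# Slightly out of order: at most one tile ever waits.
mild = peak_buffered([1, 0, 3, 2, 5, 4])

# Tile 0 is the horrible multipolygon and finishes last: everything else waits.
stalled = peak_buffered(list(range(1, 1000)) + [0])

print(mild, stalled)
```

In the second case every other tile is held in memory until the slow one completes. Capping the buffer size would bound the memory, but then the worker threads would sit idle behind that one tile instead.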