cldellow closed this 7 months ago
That's really impressive - thank you!
I haven't had the chance to go through all the source yet but the results look very impressive - I ran my usual Europe extract through it (with shapefiles), and memory consumption was 8GB when reading the .pbf, going up to 9GB when generating tiles. Total time 2hr13. I'll have a go at the planet tomorrow.
Using the old (mid-2021) planet I've run previous tests with, and including shapefiles, memory consumption was 18.2GB - which is amazing. Total time 5hr39. (Before this PR it was 5hr12 and 40.2GB.)
Comparing with Europe, that suggests a very rough estimated RAM requirement of one-third the .osm.pbf size.
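As a back-of-the-envelope check, that rule of thumb can be written as a tiny helper (purely illustrative; the 60 GB figure below is an assumed planet-file size, not a number from this thread):

```python
def estimated_ram_gb(pbf_size_gb: float) -> float:
    """Very rough peak-RAM estimate: about one-third of the .osm.pbf size."""
    return pbf_size_gb / 3

# e.g. an assumed 60 GB planet .osm.pbf would suggest roughly 20 GB of RAM
print(estimated_ram_gb(60))  # → 20.0
```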
Played with this a bit more today and still impressed. Also thanks for the copious comments which help me to understand what's going on!
I think the only suggestion I'd make is that we now have a fairly broad array of performance options (`--no-compress-nodes`, `--no-compress-ways`, `--materialize-geometries`, `--shard-stores`, plus of course `--store` and `--compact` have performance implications). I suspect most users won't understand which to pick.
I guess there are three common scenarios:

These could perhaps be represented by the following run-time options:

- `--store /path/to/ssd --fast` (equivalent of `--materialize-geometries` on, `--shard-stores` off)
- `--store /path/to/ssd` (equivalent of `--materialize-geometries` off, `--shard-stores` on)

We can then simply tell people "if you have lots of memory and are working with a big extract, use the `--fast` option".

We can still retain the granular controls, but maybe put them in a separate "performance tuning" option group.
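As a sketch, the convenience flag could expand into the granular knobs roughly like this (illustrative Python, not tilemaker's actual option handling; the in-memory branch is my assumption, since sharding only matters for on-disk stores):

```python
from typing import Optional

def expand_flags(store: Optional[str], fast: bool) -> dict:
    """Hypothetical expansion of --fast into the underlying knobs."""
    if store is None:
        # in-memory run (assumed): sharding is irrelevant without --store
        return {"materialize_geometries": True, "shard_stores": False}
    if fast:
        # --store /path/to/ssd --fast
        return {"materialize_geometries": True, "shard_stores": False}
    # --store /path/to/ssd
    return {"materialize_geometries": False, "shard_stores": True}
```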
Yes, good call on the flags and de-emphasizing the individual knobs. I'll make that change.
Hopefully you ignored the noise of my commits during Christmas! :) Please don't feel any urgency to do anything with this or the other PRs I'll open this week -- this is just my version of tinkering with trains in the basement over the holidays.
Since my last comment:
I did some benchmarking [1] and observed that the logic should maybe be:

- `--lazy-geometries`, e.g. in the case where lazy geometries is enough to let you avoid needing `--store`
- if `--store` is passed, default to lazy geometries
- `--materialize-geometries` if they have really, really fast SSDs

The `--help` output after this commit:
```
tilemaker v2.4.0
Convert OpenStreetMap .pbf files into vector tiles
Available options:
  --help                       show help message
  --input arg                  source .osm.pbf file
  --output arg                 target directory or .mbtiles/.pmtiles file
  --bbox arg                   bounding box to use if input file does not have
                               a bbox header set, example:
                               minlon,minlat,maxlon,maxlat
  --merge                      merge with existing .mbtiles (overwrites
                               otherwise)
  --config arg (=config.json)  config JSON file
  --process arg (=process.lua) tag-processing Lua file
  --verbose                    verbose error output
  --skip-integrity             don't enforce way/node integrity
  --log-tile-timings           log how long each tile takes

Performance options:
  --store arg                  temporary storage for node/ways/relations data
  --fast                       prefer speed at the expense of memory
  --compact                    use faster data structure for node lookups
                               NOTE: This requires the input to be renumbered
                               (osmium renumber)
  --no-compress-nodes          store nodes uncompressed
  --no-compress-ways           store ways uncompressed
  --lazy-geometries            generate geometries from the OSM stores; uses
                               less memory
  --materialize-geometries     materialize geometries; uses more memory
  --shard-stores               use an alternate reading/writing strategy for
                               low-memory machines
  --threads arg (=0)           number of threads (automatically detected if 0)
```
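The default-selection logic described above can be sketched as follows (illustrative only, not tilemaker's actual implementation):

```python
def geometry_strategy(store_passed: bool,
                      lazy_flag: bool,
                      materialize_flag: bool) -> str:
    """Pick a geometry strategy from the flags, per the proposed defaults."""
    if materialize_flag:
        return "materialize"   # user opted in (really, really fast SSDs)
    if lazy_flag or store_passed:
        return "lazy"          # default whenever --store is passed
    return "materialize"       # in-memory default at this point in the thread
```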
[1]: Details in https://github.com/systemed/tilemaker/pull/618/commits/657da1ab92fcf65de3f5adafcceddc064ef5e73d - it wasn't quite this branch, it was this branch + protobuf + lua-interop
All working really well! Ready to merge, do you think?
Running this PR with Great Britain on my usual box:
```
/usr/bin/time -v tilemaker --input /media/data1/planet/great-britain-latest.osm.pbf --output ~/tm_debug/gb5.mbtiles
Elapsed (wall clock) time (h:mm:ss or m:ss): 4:59.99
Maximum resident set size (kbytes): 12275684

/usr/bin/time -v tilemaker --input /media/data1/planet/great-britain-latest.osm.pbf --output ~/tm_debug/gb4.mbtiles --lazy-geometries
Elapsed (wall clock) time (h:mm:ss or m:ss): 5:16.00
Maximum resident set size (kbytes): 9155756
```
It's a big memory saving (25%) for a small time penalty (5%) - so maybe we should default to `--lazy-geometries`, both for in-memory and `--store`. But I realise one could probably bikeshed this all day. :)
Yup, merge away.
I have no strong views on the defaults; let me know if you'd like them changed.
Merged. Thank you again - this is going to make a massive difference to users.
I'll do some experimenting with the defaults before we release 3.0 but it's not crazily urgent.
This PR lets Tilemaker build the planet on smaller machines.
On a Vultr 16-core, 32GB, 500GB SSD machine:
Runtime for non-memory constrained boxes isn't affected, e.g. on a Hetzner 48-core, 192 GB machine:
On a $ basis, if you're renting a machine to do the work, it's cheaper to use a bigger box. But for folks who need to use what they already have, this may be a useful PR.
The changes are a mix of using less memory, spilling more things to disk, and thrashing less when things are backed by disk.
Using less memory:

- Don't apply `--materialize-geometries` to points: points from `Layer(...)` can be looked up in the `NodeStore`. (`LayerAsCentroid(...)` still needs the point store.)
- Replace `AttributePair`'s `std::string` with `PooledString`.
- Store `OutputObject`s in an `AppendVector` rather than a vector of vectors.

Spill more things to disk:

- `OutputObject`s now spill to disk when `--store` is used.

Thrash less:

- When `--shard-stores` is set, split the `NodeStore` and `WayStore` into 7 stores that cover different parts of the globe.
- `ReadPhase::Ways` will run 7 times, populating a single `WayStore` on each pass. Only those ways whose first node is in the corresponding `NodeStore` get populated. Because nodes in ways are generally geographically near each other, we'll mostly be accessing a single `NodeStore` to process the way. That `NodeStore` fits into memory for the duration of the pass, avoiding disk I/O.
- `ReadPhase::Relations` behaves similarly, using the ID of the first way to decide whether to process the relation.

Potential future improvements:
These are mostly smaller issues that can be happily ignored forever; I just wanted to write them down so I can forget about them.
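The sharded reading strategy from this PR can be sketched in Python (illustrative only - the real code is C++, and `shard_of` here is a hypothetical stand-in for the real geographic partitioning):

```python
NUM_SHARDS = 7

def shard_of(node_id: int) -> int:
    # Stand-in: the real implementation shards by geography, not by ID.
    return node_id % NUM_SHARDS

def read_ways(ways, node_stores):
    """Run the ways pass once per shard, as ReadPhase::Ways does.

    On each pass, only ways whose *first* node falls in the shard
    currently held in memory are processed; because a way's nodes are
    usually geographically close, most lookups hit that one hot store.
    """
    processed = []
    for shard in range(NUM_SHARDS):
        hot_store = node_stores[shard]   # only this shard needs to be in RAM
        for way in ways:
            if shard_of(way["nodes"][0]) != shard:
                continue                 # handled on another pass
            # Simplification: look up only nodes present in the hot store.
            coords = [hot_store[n] for n in way["nodes"] if n in hot_store]
            processed.append((way["id"], coords))
    return processed
```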