Open sbma44 opened 4 years ago
I'm still working through the CloudFormation setup in this repo, but here are the two instance types provisioned in the Mixed Instance Policy:
| Instance | vCPU | Memory (GiB) | Instance Storage (GiB) | Cost (per hour) |
| --- | --- | --- | --- | --- |
| r3.8xlarge | 32 | 244 | 2 x 320 | $2.656 |
| r5d.4xlarge | 16 | 128 | 2 x 300 NVMe SSD | $1.152 |
We are going to run a cost analysis with this configuration as-is this week to get more detailed results, but any insight into improvements on that end would be very helpful.
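As a starting point for that cost analysis, here is a minimal sketch that normalizes the two hourly prices by vCPU and by GiB of memory. The numbers are copied from the table above; the dictionary layout and the `unit_costs` helper are just illustrative names, not anything from this repo.

```python
# Specs and hourly prices copied from the table above.
INSTANCES = {
    "r3.8xlarge": {"vcpu": 32, "mem_gib": 244, "usd_per_hour": 2.656},
    "r5d.4xlarge": {"vcpu": 16, "mem_gib": 128, "usd_per_hour": 1.152},
}


def unit_costs(spec):
    """Return (USD per vCPU-hour, USD per GiB-hour) for one instance type."""
    per_vcpu = spec["usd_per_hour"] / spec["vcpu"]
    per_gib = spec["usd_per_hour"] / spec["mem_gib"]
    return per_vcpu, per_gib


for name, spec in INSTANCES.items():
    per_vcpu, per_gib = unit_costs(spec)
    print(f"{name}: ${per_vcpu:.4f}/vCPU-hour, ${per_gib:.4f}/GiB-hour")
```

This only compares list prices per unit of resource. It says nothing about how long a run takes on each type, which is what the actual cost per export depends on.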
Some preliminary benchmarks. I need to run a few larger PBFs to get a sense of the curve. This is a 4-core machine with 16 GB RAM, disk-based osmium indexes and an OK SSD (SATA, not true NVMe speeds). That's enough to get this back to being CPU-bound; I'm tentatively encouraged.
Note that the red series is running osmium with `sparse_file_array` -- this is an on-disk index, but unlike `dense_file_array` it doesn't allocate a full 60-some GB file at the start of a run. The docs suggest that `dense` is a better option for full-planet exports, but I'll need some more benchmarks to know what the breakeven point is.
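For the disk-space side of that tradeoff, a rough sizing sketch. It assumes, as I understand the libosmium layout, that a dense index stores one 8-byte location per possible node ID and a sparse index stores a 16-byte (ID, location) entry per node actually present. The byte sizes and the maximum node ID used below are assumptions, not figures from the benchmarks.

```python
BYTES_PER_LOCATION = 8       # assumed: two 32-bit coordinates per node
BYTES_PER_SPARSE_ENTRY = 16  # assumed: 64-bit node ID plus the location


def dense_index_bytes(max_node_id):
    """Dense index: one slot per node ID from 0 up to the largest ID."""
    return (max_node_id + 1) * BYTES_PER_LOCATION


def sparse_index_bytes(node_count):
    """Sparse index: one entry per node that is actually in the extract."""
    return node_count * BYTES_PER_SPARSE_ENTRY


def sparse_is_smaller(node_count, max_node_id):
    """True if the sparse index would take less disk than the dense one."""
    return sparse_index_bytes(node_count) < dense_index_bytes(max_node_id)


# Assumed round figure for the largest node ID; substitute the real value.
ASSUMED_MAX_NODE_ID = 8_000_000_000
print(f"dense: {dense_index_bytes(ASSUMED_MAX_NODE_ID) / 1e9:.0f} GB")
```

Under these assumptions the sparse file is smaller whenever an extract holds fewer than about half as many nodes as the largest node ID. That is only the size breakeven; the runtime breakeven still needs the benchmarks.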
Quick braindump of where things are:
QA tiles is no longer a good fit for Mapbox's processing infrastructure. We'd like to open source the key components. HOT has already been given the key ingredients -- Dockerfile & worker script -- and had them working on a cloud machine (they might no longer be working?).
In its current form the script is optimized to run on a machine with lots of CPU and memory resources (16 vCPU, >100GB RAM) and relatively little disk space (maybe 300 GB for planet file, geojson and output mbtiles). It runs in about 9 hours. I've been exploring how to trade some disk space and speed for the ability to run on more modest hardware. This centers on osmium's index settings and, to a lesser extent, turning off some tippecanoe options that become sources of inefficiency when things are i/o-bound instead of CPU-bound.
I've been benchmarking on a 4-core Ryzen using a PBF of Greenland. My next steps are to benchmark some different-sized PBFs in an effort to get a realistic estimate of how long planet mbtiles generation will take when on a memory diet.
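One possible way to turn a handful of benchmark points into a planet-scale estimate is a power-law fit in log-log space. This is a sketch under the assumption that runtime scales roughly as `a * size**b`, which may well not hold once the index stops fitting in the page cache. The function names are hypothetical and the sample numbers are placeholders, not measurements.

```python
import math


def fit_power_law(sizes, runtimes):
    """Least-squares fit of runtime = a * size**b, done on the logs."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(t) for t in runtimes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope in log-log space
    a = math.exp(mean_y - b * mean_x)   # intercept, back on the linear scale
    return a, b


def predict(a, b, size):
    """Extrapolated runtime for an input of the given size."""
    return a * size ** b


# Placeholder inputs for illustration only, NOT benchmark results.
example_sizes_mb = [10, 100, 1000]
example_minutes = [1, 10, 100]
a, b = fit_power_law(example_sizes_mb, example_minutes)
print(f"exponent b = {b:.2f}")
```

An exponent near 1 would suggest runtime grows linearly with PBF size; a noticeably larger value would mean the planet run takes much longer than a linear extrapolation from Greenland implies.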
Questions I have: