hotosm / hot-qa-tiles

QA Tiles Generation

moving QA tiles off Mapbox infrastructure #9

Open sbma44 opened 4 years ago

sbma44 commented 4 years ago

Quick braindump of where things are:

QA Tiles is no longer a good fit for Mapbox's processing infrastructure, so we'd like to open-source the key components. HOT has already been given the key ingredients -- the Dockerfile and worker script -- and had them working on a cloud machine (though they may no longer be working).

In its current form the script is optimized to run on a machine with lots of CPU and memory (16 vCPU, >100 GB RAM) and relatively little disk space (maybe 300 GB for the planet file, GeoJSON, and output mbtiles), and it runs in about 9 hours. I've been exploring how to trade some disk space and speed for the ability to run on more modest hardware. This centers on osmium's index settings and, to a lesser extent, on turning off some tippecanoe options that become sources of inefficiency when the run is I/O-bound instead of CPU-bound.
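
To make that concrete, here is a minimal sketch of the two-step pipeline in question, assuming the GeoJSON intermediate comes from osmium export. This is not the actual worker script: the filenames, the index path, and the tippecanoe flags are illustrative.

```bash
#!/usr/bin/env bash
set -euo pipefail

PLANET=planet-latest.osm.pbf           # input PBF (assumed filename)
GEOJSON=planet.geojsonseq              # intermediate line-delimited GeoJSON
MBTILES=planet.mbtiles                 # final QA tiles output
NODE_IDX=/var/tmp/node-locations.idx   # on-disk node location index

# Step 1: PBF -> GeoJSON. A file-backed node index (sparse_file_array or
# dense_file_array) keeps RAM usage low, trading it for disk space and I/O.
osmium export "$PLANET" \
  --index-type=sparse_file_array,"$NODE_IDX" \
  -f geojsonseq \
  --overwrite \
  -o "$GEOJSON"

# Step 2: GeoJSON -> mbtiles at a single zoom (QA tiles are z12-only), with
# per-tile limits disabled so no features are dropped. Extra tippecanoe
# cleverness tends to hurt once the machine is I/O-bound rather than CPU-bound.
tippecanoe -o "$MBTILES" \
  -z12 -Z12 \
  --no-feature-limit --no-tile-size-limit \
  --force \
  "$GEOJSON"
```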

I've been benchmarking on a 4-core Ryzen using a PBF of Greenland. My next step is to benchmark some different-sized PBFs to get a realistic estimate of how long planet mbtiles generation will take on a memory diet.
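
Roughly the kind of harness I have in mind for that sweep (a sketch only; the `extracts/` directory, index paths, and log locations are placeholders, not the actual setup):

```bash
#!/usr/bin/env bash
set -euo pipefail

# Time the export step across extracts of increasing size, using the same
# on-disk index settings, to see how runtime scales with PBF size.
mkdir -p timings
for pbf in extracts/*.osm.pbf; do
  name=$(basename "$pbf" .osm.pbf)
  idx="/var/tmp/${name}.idx"
  out="/var/tmp/${name}.geojsonseq"
  rm -f "$idx" "$out"
  # GNU time's -v output records wall-clock time and peak RSS for each run.
  /usr/bin/time -v \
    osmium export "$pbf" \
      --index-type=sparse_file_array,"$idx" \
      -f geojsonseq -o "$out" \
    2> "timings/${name}.log"
  rm -f "$idx" "$out"
done
```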

Questions I have:

dakotabenjamin commented 4 years ago

I'm still working through the CloudFormation setup in this repo, but here are the two instance types provisioned in the Mixed Instances Policy:

| Instance | vCPU | Memory (GiB) | Instance Storage (GiB) | Cost (per hour) |
| --- | --- | --- | --- | --- |
| r3.8xlarge | 32 | 244 | 2 x 320 | $2.656 |
| r5d.4xlarge | 16 | 128 | 2 x 300 NVMe SSD | $1.152 |

We are going to run a cost analysis with this configuration as-is this week to get more detailed results, but any insight into improvements on that end would be very helpful.
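
For a rough sense of scale before the detailed analysis: if a planet run still takes on the order of the ~9 hours mentioned above, on-demand compute alone would be roughly 9 × $1.152 ≈ $10 per run on r5d.4xlarge and 9 × $2.656 ≈ $24 on r3.8xlarge, before storage and data transfer.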

sbma44 commented 4 years ago

Some preliminary benchmarks. I need to run a few larger PBFs to get a sense of the curve. This is a 4-core machine with 16 GB RAM, disk-based osmium indexes, and an OK SSD (SATA, not true NVMe speeds). That's enough to get this back to being CPU-bound; I'm tentatively encouraged.

Note that the red series is running osmium with sparse_file_array -- this is an on-disk index, but unlike dense_file_array it doesn't allocate a full 60-some GB file at the start of a run. The docs suggest that dense is the better option for full-planet exports, but I'll need some more benchmarks to know where the breakeven point is.

[two benchmark charts attached]
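
For anyone reproducing this, the two configurations differ only in the index type passed to osmium export; something like the following (filenames are placeholders, not the exact benchmark commands):

```bash
# Red series: sparse_file_array -- an on-disk index that only stores the node
# locations actually encountered, so the index file grows with the extract.
osmium export greenland-latest.osm.pbf \
  --index-type=sparse_file_array,/var/tmp/nodes-sparse.idx \
  -f geojsonseq --overwrite -o greenland.geojsonseq

# dense_file_array -- an on-disk index addressed directly by node ID, so it
# reserves room for the whole ID range up front (the 60-some GB file for a
# full planet), in exchange for cheaper lookups.
osmium export greenland-latest.osm.pbf \
  --index-type=dense_file_array,/var/tmp/nodes-dense.idx \
  -f geojsonseq --overwrite -o greenland.geojsonseq
```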