dhardestylewis / terrain_aggregator

Workflow to aggregate terrain imagery at scale to a single seamless image dataset

New computational gridding #71

Open dhardestylewis opened 1 year ago

dhardestylewis commented 1 year ago

tldr; new gridding scheme

ref; old gridding scheme

scratch sheet for reference to set up new griddings on different systems

general approach to estimate the maximum number of compute threads (i.e., CPU cores) available for mass parallelization

50 jobs max in Stampede2's normal queue (wait time 40 hours), minus 2 jobs left open for the long queue (wait time 7 seconds), equals 48 jobs on Stampede2's normal queue

64 maximum recommended cores per node on normal queue

69 maximum available nodes per job (this tops out at requesting 69*48=3312 nodes out of 3360 total normal nodes available; we want fewer than the total because a small number of nodes, ~10, will likely be down at any given time)

48 jobs * 64 cores per node * 69 nodes per job = 211968 cores available, one for each separate computational grid
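The arithmetic above can be re-run for a different system with a short script. This is only a sketch: the constants are the Stampede2 figures quoted in this comment, and the function and variable names are made up for illustration.

```python
# Figures quoted in the estimate above (Stampede2 normal queue).
MAX_JOBS_NORMAL_QUEUE = 50   # job limit in the normal queue
JOBS_HELD_BACK = 2           # jobs left open for the other queue
CORES_PER_NODE = 64          # maximum recommended cores per node
NODES_PER_JOB = 69           # nodes requested per job
TOTAL_NORMAL_NODES = 3360    # total nodes in the normal queue


def available_cores(max_jobs, held_back, cores_per_node, nodes_per_job):
    """Return (jobs, nodes requested, cores) for the parallel budget."""
    jobs = max_jobs - held_back
    nodes = jobs * nodes_per_job
    cores = nodes * cores_per_node
    return jobs, nodes, cores


jobs, nodes, cores = available_cores(
    MAX_JOBS_NORMAL_QUEUE, JOBS_HELD_BACK, CORES_PER_NODE, NODES_PER_JOB
)
# Nodes left unrequested, as slack for nodes that happen to be down.
headroom = TOTAL_NORMAL_NODES - nodes

print(f"{jobs} jobs, {nodes} nodes, {cores} cores, {headroom} nodes of headroom")
```

On another system, swap in that system's queue limits and node counts.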

Deriving computational grid from parallel threads available

sqrt(211968) ~= 460, so the gridding will be 460x460: it is simple, close enough, needs fewer subgrids (460*460=211600) than the total number of cores available, and won't add any significant amount of time to the overall compute
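A minimal sketch of that step, assuming the goal is the largest square grid whose subgrid count does not exceed the core count. The function name is hypothetical; `math.isqrt` is the standard-library integer square root.

```python
import math


def square_grid_side(cores):
    """Largest n such that an n x n grid needs no more than `cores` subgrids."""
    return math.isqrt(cores)


cores = 211968                 # core count from the estimate above
side = square_grid_side(cores)
subgrids = side * side         # one subgrid per core
idle_cores = cores - subgrids  # cores left without a subgrid

print(f"{side}x{side} grid = {subgrids} subgrids, {idle_cores} cores idle")
```

Rounding down keeps the number of subgrids within the core budget, at the cost of leaving a few hundred cores idle.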

Deriving tiles per compute grid

note: the rest of these estimates are derived from previous computational results and assume minimal overhead & linear scaling between computational runs

If the new total computational envelope of Texas is similar to the previous envelope, then after adjusting for the smaller width x height of each new tile, there will be an estimated 175x175 tiles within each subgrid. (The new envelope will be slightly larger because it will rely on HUC8s instead of HUC12s.)

Estimating compute time under new computational grid

The previous gridding took 120 hours to retile 71 billion pixels within each subgrid; the new gridding is estimated to take 4 hours to retile 2 billion pixels within each new subgrid
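Under the linear-scaling assumption stated earlier, the time estimate can be sketched as below. The function name is hypothetical, and rounding up to a whole hour is an assumption made here, not something stated in this thread.

```python
import math


def scaled_hours(prev_hours, prev_pixels, new_pixels):
    """Scale a previous runtime linearly with pixel count (no overhead)."""
    return prev_hours * new_pixels / prev_pixels


# Per-subgrid figures quoted above.
estimate = scaled_hours(prev_hours=120, prev_pixels=71e9, new_pixels=2e9)
# Assumption: round up to a whole hour to leave some margin.
rounded = math.ceil(estimate)

print(f"linear estimate: {estimate:.2f} h, rounded up: {rounded} h")
```

The linear estimate comes out a little under 3.4 hours per subgrid, which is consistent with the 4-hour figure above.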

dhardestylewis commented 1 year ago

memory should be less of an issue this round because we are running roughly half as many subgrids per node in parallel as in the previous run. The previous run lost 7/272 subgrids to memory errors

dhardestylewis commented 1 year ago

estimated to cost 14,000 SUs

dhardestylewis commented 1 year ago

these are just the data tiles, not viz tiles

dhardestylewis commented 1 year ago

align tiles with quarter quads

dhardestylewis commented 1 year ago

at 1m gridding by request from @TNRIS

following the previous estimate above and adjusting from a 0.25m basemap to a 1m basemap

1m basemap gridding scheme
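One way to read the resolution adjustment: for a fixed ground area, pixel count scales with the inverse square of the pixel size. The sketch below only shows that scaling factor; the function name is hypothetical, and how the factor feeds into the actual 1m gridding scheme is not stated in this thread.

```python
def pixel_count_ratio(old_res_m, new_res_m):
    """Ratio of new to old pixel counts over the same ground area."""
    return (old_res_m / new_res_m) ** 2


ratio = pixel_count_ratio(old_res_m=0.25, new_res_m=1.0)

print(f"a 1m basemap has {ratio:.4f}x the pixels of a 0.25m basemap "
      f"(about {1 / ratio:.0f}x fewer)")
```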