Closed by mojodna 9 years ago
Pick one of the datasets in https://github.com/openimagerynetwork/oin-register/blob/master/master.json
Here's a screenshot (nothing fancy) of the UAV imagery of Dar es Salaam in the HOT bucket:
It's using FUSE to read data and is incredibly slow (as we expected). I improved the startup time by preventing mapnik-omnivore from calculating statistics. I suspect the next major speedup could come from rewriting the images so that partial file reads are more efficient: they may not be internally tiled, and these are multi-gigabyte files to begin with.
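Whether a file is internally tiled can be checked from the block size that gdalinfo reports for each band. Here's a minimal sketch, assuming gdalinfo is on the PATH. The name is_tiled is a hypothetical helper, and comparing the block width to the raster width is only a heuristic (a striped TIFF normally has blocks as wide as the whole image):

```shell
# is_tiled (hypothetical helper): prints "tiled" or "striped" for a raster,
# based on the "Size is W, H" and "Block=WxH" lines of gdalinfo output.
# Exit status: 0 = tiled, 1 = striped, 2 = no Block= line found.
is_tiled() {
  gdalinfo "$1" | awk '
    # e.g. "Size is 20000, 15000" -> raster width is the third field
    /^Size is/ { width = $3; sub(/,/, "", width) }
    # e.g. "Band 1 Block=256x256 Type=Byte, ColorInterp=Red"
    /Block=/ && !found {
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^Block=/) {
          split(substr($i, 7), dims, "x")
          found = 1
          # Heuristic: blocks narrower than the raster imply tiles
          if (dims[1] + 0 < width + 0) { print "tiled"; exit 0 }
          print "striped"; exit 1
        }
      }
    }
    END { if (!found) exit 2 }
  '
}
```

Usage would be something like `is_tiled 2015-05-20_tandale_merged_transparent_mosaic_group1.tif`, run against a local copy rather than through the FUSE mount.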
This is the command I'm using to rewrite one of the images (copied from S3):
GDAL_CACHEMAX=512 gdalwarp \
-t_srs EPSG:3857 \
-wo NUM_THREADS=ALL_CPUS \
-multi \
-co tiled=yes \
-co compress=lzw \
-co predictor=2 \
-r bilinear \
2015-05-20_tandale_merged_transparent_mosaic_group1.tif \
2015-05-20_tandale_merged_transparent_mosaic_group1_tiled.tif
(GDAL_CACHEMAX speeds this up noticeably.)
Reprojection appears to have been the biggest win in that conversion. I wonder if yas3fs is introducing more overhead than expected (a quick glance at top made it look like Python was fully pegging a single CPU).
I tried various encodings locally. This one results in a slightly larger TIFF, but renders the fastest:
GDAL_CACHEMAX=512 gdalwarp \
-t_srs EPSG:3857 \
-wo NUM_THREADS=ALL_CPUS \
-multi \
-co tiled=yes \
-co sparse_ok=true \
-co interleave=band \
-co compress=lzw \
-co predictor=2 \
-r bilinear \
2015-05-20_tandale_merged_transparent_mosaic_group1.tif \
2015-05-20_tandale_merged_transparent_mosaic_group1_tiled_sparse_band.tif
JPEG compression produced the smallest file (about 25% of the size):
GDAL_CACHEMAX=512 gdalwarp \
-t_srs EPSG:3857 \
-wo NUM_THREADS=ALL_CPUS \
-multi \
-co tiled=yes \
-co sparse_ok=true \
-co interleave=band \
-co compress=jpeg \
-r bilinear \
2015-05-20_tandale_merged_transparent_mosaic_group1.tif \
2015-05-20_tandale_merged_transparent_mosaic_group1_tiled_sparse_band_jpeg.tif
(I'm not doing this formally because I think it's already been done before, probably even as part of a previous OAM effort.)
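For an informal size comparison of the outputs above, something like the following would do. It's a sketch only: size_pct is a hypothetical helper name, and it just compares byte counts on local copies of the files.

```shell
# size_pct (hypothetical helper): prints the size of file $2 as a
# whole-number percentage of the size of file $1 (integer division).
size_pct() {
  orig_bytes=$(wc -c < "$1")
  new_bytes=$(wc -c < "$2")
  echo $(( new_bytes * 100 / orig_bytes ))
}
```

For example, `size_pct 2015-05-20_tandale_merged_transparent_mosaic_group1_tiled.tif 2015-05-20_tandale_merged_transparent_mosaic_group1_tiled_sparse_band_jpeg.tif` would print the JPEG output's size relative to the LZW one.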
This tiles s3://hotosm-oam/2015-05-20_tandale_merged_transparent_mosaic_group1.tif fully dynamically:
http://ec2-52-2-204-156.compute-1.amazonaws.com/#20/-6.79263/39.24284
As expected, it's dog slow, even when the majority of the file has been cached (by yas3fs, not manually) on disk. When reading / rendering, yas3fs is responsible for the majority of CPU usage.
Proof of concept using tessera.