OSGeo / gdal

GDAL is an open source MIT licensed translator library for raster and vector geospatial data formats.
https://gdal.org

gdal2tiles - Unsure if we are hitting a floor, a workflow issue, or hardware issue. Weird benchmarks... #5746

Closed (alturic closed this issue 2 years ago)

alturic commented 2 years ago

I'm new to map tiling, and after fairly extensive benchmarking I've either hit a floor (unlikely, in my opinion) or I'm doing something badly wrong. Server specs are 3.7 GHz/8 threads, 16 GB RAM, SSD. I also tested on the same CPU with 32 GB RAM and an NVMe drive, thinking there might be a disk or memory bottleneck.

We're trying to generate zoom levels 5-10 only, and the gdal2tiles generation takes roughly 40 seconds. It seems counter-intuitive, but running gdal2tiles on 4 threads is only ~5-8 seconds slower than running it on 8 threads. So we're trying to find out where the bottleneck is, or whether we are somehow hitting a floor on how fast the script can process the image below. I have a hard time believing we're hitting a floor, as many tile providers do what we are doing up to zoom level 19. But, again being new to map tiling, I can't see where the workflow below can be made faster, other than through hardware.

I have a 14000x10800 PNG, and my current workflow is below. The first two steps complete very quickly; the tile generation time is the issue.

gdal_translate -co "TILED=YES" -of Gtiff -a_ullr -128 58 -65 20 -a_srs EPSG:4326 [in-png] [out-tiff]

and then

gdalwarp -wm 2048 -s_srs EPSG:4326 -t_srs EPSG:3857 -ts 14000 10800 [in-tiff] [out-reprojected-tiff]

and finally

/usr/bin/gdal2tiles.py -s EPSG:3857 --processes=8 -r near -p mercator -z 5-10 [in-reprojected-tiff] [out-tile-dir]

If I generate only zoom 5-9, the ~40 s drops to ~10 s, so I'm wondering whether some sort of interpolation is happening during tile generation (due to the image size) and, if that is the issue, whether there is a way to avoid it.
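For context on the 5-9 versus 5-10 timings, here is a rough sketch that counts the XYZ tiles covering the extent given to -a_ullr (-128 58 -65 20) at each zoom level. It uses the standard Web Mercator tile formulas rather than anything taken from gdal2tiles itself, and the helper names are made up for illustration, so treat the numbers as an estimate of the workload, not a trace of what the script does.

```python
import math

def lonlat_to_tile(lon, lat, zoom):
    """Return the XYZ tile column and row that contain a lon/lat point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    # Web Mercator: y is derived from asinh(tan(latitude))
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y

def tile_count(west, north, east, south, zoom):
    """Number of tiles needed to cover a bounding box at one zoom level."""
    x0, y0 = lonlat_to_tile(west, north, zoom)
    x1, y1 = lonlat_to_tile(east, south, zoom)
    return (x1 - x0 + 1) * (y1 - y0 + 1)

# Extent copied from the -a_ullr arguments of the gdal_translate step
WEST, NORTH, EAST, SOUTH = -128.0, 58.0, -65.0, 20.0

counts = {z: tile_count(WEST, NORTH, EAST, SOUTH, z) for z in range(5, 11)}
total = sum(counts.values())
for z, c in counts.items():
    print(f"zoom {z}: {c} tiles ({c / total:.0%} of the 5-10 total)")
```

Each zoom level needs about four times as many tiles as the one before it, so on this extent zoom 10 alone accounts for roughly three quarters of all tiles in the 5-10 range. That is in line with the run time dropping from ~40 s to ~10 s when zoom 10 is left out, without any extra interpolation cost being needed to explain it.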

I also noticed that if we change the tile size to 128 the process takes ~10 seconds, and with 64 ~5 seconds, which also seemed counter-intuitive.
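One possible explanation for the tile size timings, sketched below, assumes that gdal2tiles keeps the whole world in a single tile at zoom 0 whatever the tile size is, so that a smaller tile size means a coarser resolution at every zoom level. That assumption is mine and is not confirmed anywhere in this thread. The function names are hypothetical.

```python
import math

# Width of the Web Mercator world in metres (WGS84 equatorial radius)
EARTH_CIRCUMFERENCE_M = 2 * math.pi * 6378137.0

def resolution(zoom, tile_size):
    """Metres per pixel, ASSUMING the world is exactly one tile wide at zoom 0."""
    return EARTH_CIRCUMFERENCE_M / (tile_size * 2 ** zoom)

def relative_pixels(zoom, tile_size, ref_zoom=10, ref_tile_size=256):
    """Output pixels for a fixed extent, relative to zoom 10 with 256 px tiles."""
    return (resolution(ref_zoom, ref_tile_size) / resolution(zoom, tile_size)) ** 2

for size in (256, 128, 64):
    print(f"tile size {size}: {resolution(10, size):.1f} m/px at zoom 10, "
          f"{relative_pixels(10, size):.4f}x the output pixels of 256 px tiles")
```

Under that assumption, zoom 10 with 128 px tiles has the same resolution as zoom 9 with 256 px tiles, and writes a quarter of the pixels. That would put the 128 px run in the same ballpark as the zoom 5-9 run, which is what the timings above suggest.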

The real question (and I hate to pose it as a general question, but I'm pulling my hair out) is what, if anything, makes up the bulk of the work gdal2tiles is doing, and whether we're hitting a floor on how fast it can process.

jratike80 commented 2 years ago

Your question makes sense, but obviously you did not read our notice:

Questions should go to the gdal-dev mailing list at https://lists.osgeo.org/mailman/listinfo/gdal-dev or other support forums. GitHub issues are for bug reports and suggestions for new features.

jratike80 commented 2 years ago

When you send mail to gdal-dev, please also state your GDAL version. Regarding tile size and speed, did you notice that the command gdalwarp -wm 2048 -s_srs EPSG:4326 -t_srs EPSG:3857 -ts 14000 10800 [in-tiff] [out-reprojected-tiff] creates a striped TIFF file, and that for 256-pixel tiles gdal2tiles must read 256 lines, each 14000 pixels long? I suggest also testing with tiled input files.
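To put rough numbers on the striped versus tiled point, the sketch below counts how many source pixels have to be decoded to read one 256x256 window. It assumes one-row strips for the striped file and 256x256 blocks for the tiled one. Both layouts are assumptions for illustration, and the window offset is arbitrary; the real block size can be checked with gdalinfo.

```python
def pixels_touched(win_x, win_y, win_w, win_h, block_w, block_h):
    """Pixels decoded to read a window, given the file's block layout.

    A block (strip or tile) has to be read in full even if the window
    only overlaps part of it.
    """
    first_col = win_x // block_w
    last_col = (win_x + win_w - 1) // block_w
    first_row = win_y // block_h
    last_row = (win_y + win_h - 1) // block_h
    blocks = (last_col - first_col + 1) * (last_row - first_row + 1)
    return blocks * block_w * block_h

IMAGE_WIDTH = 14000                   # from -ts 14000 10800
WINDOW = (1000, 1000, 256, 256)       # arbitrary, unaligned 256x256 window

wanted = WINDOW[2] * WINDOW[3]
striped = pixels_touched(*WINDOW, block_w=IMAGE_WIDTH, block_h=1)   # assumed 1-row strips
tiled = pixels_touched(*WINDOW, block_w=256, block_h=256)           # assumed 256x256 blocks

print(f"wanted:  {wanted} px")
print(f"striped: {striped} px ({striped / wanted:.1f}x)")
print(f"tiled:   {tiled} px ({tiled / wanted:.1f}x)")
```

With these assumptions the striped layout decodes about 55 times more pixels than the window needs, against 4 times for the tiled layout. GDAL's block cache will absorb part of that in practice, so this is an upper bound on the overhead and not a predicted speed-up.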

alturic commented 2 years ago

@jratike80 Thanks, I'll look into that and see if there's anything we can tweak in gdal_translate/gdalwarp. I will say that whether we use a 7000x5160, a 28000x21600, or the current 14000x10800 image, there seemed to be minimal difference in tiling time.

As for the mailing list, I did subscribe and sent an email, but I don't see anything on the list, nor have I received any emails. Like I said, I've been pulling my hair out, so I apologize for opening a ticket here.