drnextgis opened 2 months ago
Here is the configuration we are using:
```
# GDAL Config
CPL_TMPDIR=/tmp
GDAL_CACHEMAX=75%
GDAL_INGESTED_BYTES_AT_OPEN=32768
GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR
GDAL_HTTP_MERGE_CONSECUTIVE_RANGES=YES
GDAL_HTTP_MULTIPLEX=YES
GDAL_HTTP_VERSION=2
VSI_CACHE=TRUE
VSI_CACHE_SIZE=536870912
MOSAIC_CONCURRENCY=1
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
AWS_SESSION_TOKEN=
```
Based on my tests, it's clear that `MOSAIC_CONCURRENCY` and `GDAL_CACHEMAX` have the most significant impact:
| GDAL_CACHEMAX          | MOSAIC_CONCURRENCY | Server-Side Mosaic | Client-Side Mosaic |
|------------------------|--------------------|--------------------|--------------------|
| 75% (total mem: 16 GB) | 1                  | 65 s               | 6.989 s            |
| 75% (total mem: 16 GB) | 8 (8 CPUs)         | 11 s               | 6.692 s            |
| 200 (MB)               | 8 (8 CPUs)         | 40 s               | 32 s               |
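The gap between the `75%` and `200` rows is easier to read once the units are spelled out. Per GDAL's documented behavior, `GDAL_CACHEMAX` accepts either a percentage of RAM or a bare number, which is treated as megabytes when small and as bytes when very large. A quick sketch of that interpretation (the helper function is mine, not a GDAL API):

```python
def gdal_cachemax_bytes(value: str, total_ram_bytes: int) -> int:
    # Sketch of GDAL's documented GDAL_CACHEMAX interpretation:
    # "NN%" -> percentage of RAM; a bare number < 100000 -> MB, else bytes.
    if value.endswith("%"):
        return int(total_ram_bytes * float(value[:-1]) / 100)
    n = int(value)
    return n * 1024 * 1024 if n < 100000 else n

ram = 16 * 1024**3  # 16 GB, as in the table above
print(gdal_cachemax_bytes("75%", ram) / 1024**3)  # 12.0 (GB)
print(gdal_cachemax_bytes("200", ram) / 1024**2)  # 200.0 (MB)
```

So the first two rows run with a ~12 GB block cache while the third runs with 200 MB, which is consistent with the large server-side timing differences.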
However, even with the same resources, I was unable to make server-side rendering perform comparably to client-side rendering. Perhaps I need to try with more resources?
All tests were conducted on the same single tile.
When I set `MOSAIC_CONCURRENCY=20`, the server process gets killed.
From what I understand, titiler-pgstac uses `mosaic_reader`, so I attempted to rewrite the code from my initial message using it (`local-rio-tiler.py`):
```python
from rio_tiler.io import Reader
from rio_tiler.mosaic import mosaic_reader


def reader(asset: str, *args, **kwargs):
    # Open each asset and read the requested tile from it
    with Reader(asset) as src:
        return src.tile(*args, **kwargs)


# urls, x, y, z are defined as in my initial message
img, assets = mosaic_reader(urls, reader, x, y, z, indexes=[1, 2, 3], tilesize=512, threads=8)
```
~~However, it works much more slowly (30 s vs 7 s).~~ Using the same environment variables as for titiler-pgstac, it shows performance similar to the `/searches` endpoint (as expected), but it's still twice as slow compared to the approach mentioned in the initial message of the thread.
```
$ time python local.py
python local.py 10,11s user 0,10s system 196% cpu 5,201 total

$ time python local-rio-tiler.py
python local-rio-tiler.py 35,21s user 2,09s system 310% cpu 11,996 total
```
I have a hypothesis that might explain the observed behavior: in the first case, we use `ThreadPoolExecutor` solely for I/O-bound tasks (retrieving PNG tiles), whereas `mosaic_reader` internally uses `ThreadPoolExecutor` not just for data download but also for reprojecting the data to the tile's CRS and for mosaicking assets, which are CPU-bound tasks. @vincentsarago what do you think?
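To illustrate the hypothesis above, here is a minimal stdlib-only sketch: pure-Python CPU-bound work gets no speedup from a `ThreadPoolExecutor` because of the GIL. (One caveat: GDAL and numpy routines often release the GIL in native code, so this only partially models rio-tiler's reprojection path.)

```python
import time
from concurrent.futures import ThreadPoolExecutor

def cpu_bound(n: int) -> int:
    # Pure-Python loop: holds the GIL, so threads serialize on it
    total = 0
    for i in range(n):
        total += i * i
    return total

N, JOBS = 1_000_000, 4

t0 = time.perf_counter()
serial = [cpu_bound(N) for _ in range(JOBS)]
t_serial = time.perf_counter() - t0

t0 = time.perf_counter()
with ThreadPoolExecutor(max_workers=JOBS) as pool:
    threaded = list(pool.map(cpu_bound, [N] * JOBS))
t_threaded = time.perf_counter() - t0

# Unlike the I/O-bound PNG downloads, the threaded run is typically no faster here
print(f"serial: {t_serial:.2f}s  threaded: {t_threaded:.2f}s")
```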
If there's anything I can do to help move this issue forward, please let me know. However, at this point, I'm leaning towards believing it's a design problem, and without refactoring of titiler-pgstac/rio-tiler, there may not be much we can do.
I think most of the issue is that you're dealing with a large number of assets (75).
As you mentioned, the way MosaicBackend/rio-tiler is designed is to use threads to distribute the asset reading. As mentioned in https://cogeotiff.github.io/rio-tiler/mosaic/#smart-multi-threading we're trying to have a smart approach, but sadly sometimes we can't outsmart the task!
If your tile needs to be composed of more than a couple of assets, there is no magic!
That said, I'm always interested to see if we can make rio-tiler/titiler better
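For readers following the link, the "smart" part can be sketched as submitting asset reads in chunks and stopping early once the pixel-selection method reports the tile is complete (names here are illustrative, not rio-tiler's actual API):

```python
from concurrent.futures import ThreadPoolExecutor

def read_until_done(assets, read, done, chunk_size=4, threads=4):
    """Read assets chunk by chunk; stop early once `done(results)` is True."""
    results = []
    with ThreadPoolExecutor(max_workers=threads) as pool:
        for i in range(0, len(assets), chunk_size):
            # Only this chunk is in flight; later assets may never be read
            results.extend(pool.map(read, assets[i:i + chunk_size]))
            if done(results):  # e.g. "first" pixel selection: tile fully covered
                break
    return results
```

With 75 overlapping assets per tile, though, the early-exit condition may only trigger after many reads, which matches the "no magic" point above.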
We're trying to replace our current tile generation method (client-side composition using a `/cog` endpoint) with a server-side approach using the `/searches` endpoint. However, this new method is up to five times slower in some tests. Is this because the same level of performance can't be achieved with `titiler-pgstac` for server-side tile composition on the same resources?

Here's a snippet demonstrating the current approach. The issue is that it results in a high number of requests to the web server. We considered switching to `/searches` as a potential improvement, but so far, we haven't achieved comparable performance.
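Since the snippet itself isn't reproduced here, this is a hedged sketch of the client-side pattern described above: fetch per-asset PNG tiles from titiler's `/cog` tile route in a thread pool and composite them on the client. The base URL, the `WebMercatorQuad` tile matrix set, and the helper names are assumptions about the deployment, not the author's actual code.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

TITILER = "http://localhost:8000"  # assumed titiler deployment

def tile_url(asset_url: str, z: int, x: int, y: int) -> str:
    # titiler's COG tile route; in practice the asset URL should be percent-encoded
    return f"{TITILER}/cog/tiles/WebMercatorQuad/{z}/{x}/{y}.png?url={asset_url}"

def fetch(url: str) -> bytes:
    with urlopen(url) as resp:  # I/O-bound: threads overlap network waits
        return resp.read()

def fetch_tiles(asset_urls, z, x, y, threads=8):
    urls = [tile_url(u, z, x, y) for u in asset_urls]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(fetch, urls))  # PNGs are then composited client-side
```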