developmentseed / tile-benchmarking

Repo for configuring datasets and tests for benchmarking with a dynamic tiler
https://developmentseed.org/tile-benchmarking

Tile benchmarking methodology redux #42

Closed abarciauskas-bgse closed 1 year ago

abarciauskas-bgse commented 1 year ago

So far, we have found that time to tile in the underlying code (rio_tiler's XarrayReader) depends largely on the chunk size. A chunk size of ~3MB seems reasonable. For the CMIP6 data, if you store the full spatial extent for one timestep as a single chunk, that is only about 3MB of data and performs well at all zoom levels.
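As a sanity check, chunk size in memory is just the number of elements times the dtype's byte width. A minimal sketch (the grid dimensions here are illustrative, not the actual CMIP6 chunk shape):

```python
import numpy as np

def chunk_size_mb(shape, dtype=np.float32):
    """In-memory size of a single chunk: element count times dtype width, in MB."""
    return np.prod(shape) * np.dtype(dtype).itemsize / 1e6

# Illustrative grid only: one timestep over a hypothetical 600 x 1440 spatial extent.
print(chunk_size_mb((1, 600, 1440)))  # → 3.456
```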

Hypothesis: When data is chunked spatially, generating one or four tiles at low zoom levels (0 and 1) will require more requests to S3 by the XarrayReader.tile function, because multiple chunks must be mosaicked into a single tile. When data is chunked spatially and tiles are generated at high zoom levels, these requests can be parallelized, so chunking is more likely to impact performance at low zoom levels than at high zoom levels. Beyond a certain number of chunks, then, performance at low zoom levels will degrade and we should recommend pyramids.
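A back-of-envelope way to make this hypothesis concrete is to estimate how many chunk columns a single tile overlaps at a given zoom level. This is a simplified model, not anything XarrayReader actually computes: it assumes a global grid where a zoom-z tile covers `grid_width / 2**zoom` cells along x, and it ignores chunk-boundary alignment.

```python
import math

def chunks_per_tile(grid_width, chunk_width, zoom):
    """Rough count of chunk columns a single XYZ tile overlaps along x.

    Simplified model: the dataset spans the globe, so a zoom-z tile covers
    grid_width / 2**zoom grid cells. Chunk-boundary alignment is ignored,
    so the true count can be one higher.
    """
    cells_covered = grid_width / 2**zoom
    return max(1, math.ceil(cells_covered / chunk_width))

# With a hypothetical 1440-cell-wide grid chunked at 360 cells, a zoom-0 tile
# touches 4 chunk columns, while at zoom 2 a tile fits within one chunk column.
print(chunks_per_tile(1440, 360, 0))  # 4
print(chunks_per_tile(1440, 360, 2))  # 1
```

Under this model, the number of chunks per tile (and hence S3 requests per tile) grows as zoom decreases, which is exactly where the hypothesis predicts degradation.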

Invariants:

Variants to determine optimal chunk size and shape

Variants to determine max number of chunks before performance at low zoom levels degrades

Tasks:

maxrjones commented 1 year ago

Thanks for outlining this @abarciauskas-bgse! Is this mostly based on https://nasa-impact.github.io/zarr-visualization-cookbook/approaches/tiling/cmip6-zarr-tile-server-benchmarks.html#summary? I'm a bit confused by the 50 ms threshold because it looks like the Zarr tests are all slower than that limit. Do you think it's currently possible to match COG performance with titiler-xarray?

abarciauskas-bgse commented 1 year ago

Good catch @maxrjones, that is true. I think you're right that 50ms may be an unreasonable threshold to match, especially since (I think) the pgstac + COG tiling only requires one request to S3 (but also a read from the database) and the Zarr + XarrayReader approach requires 2 requests, at a minimum (this is a theory I need to verify).

The goal is to make the threshold configurable, but probably the example needs to use a threshold that is reasonable. Perhaps a more reasonable goal is to come close enough to GIBS performance, which Ryan Boller graciously provided. I'll omit the details since I'm not sure if they're intended to be public but I think it's safe to say their best average response time was 174ms and the worst was 1010ms. So based on this, perhaps a reasonable threshold to try and match is 200ms +/- 25ms.
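A configurable threshold could be a simple gate in the benchmark harness. A minimal sketch using the 200ms +/- 25ms target above; the function name and signature are made up for illustration, not existing code:

```python
def within_threshold(latencies_ms, threshold_ms=200, tolerance_ms=25):
    """True if the mean measured tile latency is within threshold + tolerance."""
    mean = sum(latencies_ms) / len(latencies_ms)
    return mean <= threshold_ms + tolerance_ms

# A mean of ~193 ms passes the 200 +/- 25 ms target; a mean of 400 ms does not.
print(within_threshold([180, 210, 190]))  # True
print(within_threshold([400, 380, 420]))  # False
```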

maxrjones commented 1 year ago

Thanks for the explanation. You may have already been considering this based on having included min_zoom as a configurable parameter, but it could be cool to have a small utility that calculates the minimum recommended zoom level based on the resolution and chunking, for cases in which global visualization would be prohibitively slow but people don't want to generate pyramids.
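One way such a utility could be sketched: find the smallest zoom at which a single tile overlaps no more than a given number of chunk columns, assuming a global grid where a zoom-z tile covers `grid_width / 2**zoom` cells along x. All names here are illustrative, not part of titiler-xarray:

```python
import math

def min_recommended_zoom(grid_width, chunk_width, max_chunks_per_tile=1):
    """Smallest zoom at which one tile overlaps at most max_chunks_per_tile
    chunk columns, under the simplified model that a zoom-z tile covers
    grid_width / 2**zoom grid cells along x (boundary alignment ignored)."""
    for zoom in range(25):
        cells_covered = grid_width / 2**zoom
        if math.ceil(cells_covered / chunk_width) <= max_chunks_per_tile:
            return zoom
    return None

# For a hypothetical 1440-cell-wide grid chunked at 360 cells, single-chunk
# tiles start at zoom 2; below that, pyramids would be the recommendation.
print(min_recommended_zoom(1440, 360))  # 2
```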