developmentseed / tile-benchmarking

Repo for configuring datasets and tests for benchmarking with a dynamic tiler
https://developmentseed.org/tile-benchmarking

Tile benchmarking methodology redux #42

Closed abarciauskas-bgse closed 1 year ago

abarciauskas-bgse commented 1 year ago

So far, we have found that time to tile in the underlying code (rio_tiler's XarrayReader) depends largely on the chunk size. A chunk size of ~3MB seems reasonable. For the CMIP6 data, if you store the full spatial extent for one timestep as a single chunk, that is only about 3MB of data and performs well at all zoom levels.
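As a sanity check, chunk size in memory is just the number of elements times the dtype's byte width. A minimal sketch (the grid dimensions here are illustrative, not the actual CMIP6 chunk shape):

```python
import numpy as np

def chunk_size_mb(shape, dtype=np.float32):
    """In-memory size of a single chunk: element count times dtype width, in MB."""
    return np.prod(shape) * np.dtype(dtype).itemsize / 1e6

# Illustrative grid only: one timestep over a hypothetical 600 x 1440 spatial extent.
print(chunk_size_mb((1, 600, 1440)))  # → 3.456
```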

Hypothesis: When data is chunked spatially, generating one or four tiles at low zoom levels (0 and 1) will require more requests to S3 by the XarrayReader.tile function, because multiple chunks must be mosaicked into a single tile. When data is chunked spatially and tiles are generated at high zoom levels, these requests can be parallelized, so chunking is more likely to impact performance at low zoom levels than at high zoom levels. Beyond a certain number of chunks, then, performance at low zoom levels will degrade and we should recommend pyramids.
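A back-of-envelope way to make this hypothesis concrete is to estimate how many chunk columns a single tile overlaps at a given zoom level. This is a simplified model, not anything XarrayReader actually computes: it assumes a global grid where a zoom-z tile covers `grid_width / 2**zoom` cells along x, and it ignores chunk-boundary alignment.

```python
import math

def chunks_per_tile(grid_width, chunk_width, zoom):
    """Rough count of chunk columns a single XYZ tile overlaps along x.

    Simplified model: the dataset spans the globe, so a zoom-z tile covers
    grid_width / 2**zoom grid cells. Chunk-boundary alignment is ignored,
    so the true count can be one higher.
    """
    cells_covered = grid_width / 2**zoom
    return max(1, math.ceil(cells_covered / chunk_width))

# With a hypothetical 1440-cell-wide grid chunked at 360 cells, a zoom-0 tile
# touches 4 chunk columns, while at zoom 2 a tile fits within one chunk column.
print(chunks_per_tile(1440, 360, 0))  # 4
print(chunks_per_tile(1440, 360, 2))  # 1
```

Under this model, the number of chunks per tile (and hence S3 requests per tile) grows as zoom decreases, which is exactly where the hypothesis predicts degradation.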

Invariants:

Variants to determine optimal chunk size and shape

Variants to determine max number of chunks before performance at low zoom levels degrades

Tasks:

maxrjones commented 1 year ago

Thanks for outlining this @abarciauskas-bgse! Is this mostly based on https://nasa-impact.github.io/zarr-visualization-cookbook/approaches/tiling/cmip6-zarr-tile-server-benchmarks.html#summary? I'm a bit confused by the 50 ms threshold because it looks like the Zarr tests are all slower than that limit. Do you think it's currently possible to match COG performance with titiler-xarray?

abarciauskas-bgse commented 1 year ago

Good catch @maxrjones, that is true. I think you're right that 50ms may be an unreasonable threshold to match, especially since (I think) the pgstac + COG tiling only requires one request to S3 (but also a read from the database) and the Zarr + XarrayReader approach requires 2 requests, at a minimum (this is a theory I need to verify).

The goal is to make the threshold configurable, but probably the example needs to use a threshold that is reasonable. Perhaps a more reasonable goal is to come close enough to GIBS performance, which Ryan Boller graciously provided. I'll omit the details since I'm not sure if they're intended to be public but I think it's safe to say their best average response time was 174ms and the worst was 1010ms. So based on this, perhaps a reasonable threshold to try and match is 200ms +/- 25ms.
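A configurable threshold could be a simple gate in the benchmark harness. A minimal sketch using the 200ms +/- 25ms target above; the function name and signature are made up for illustration, not existing code:

```python
def within_threshold(latencies_ms, threshold_ms=200, tolerance_ms=25):
    """True if the mean measured tile latency is within threshold + tolerance."""
    mean = sum(latencies_ms) / len(latencies_ms)
    return mean <= threshold_ms + tolerance_ms

# A mean of ~193 ms passes the 200 +/- 25 ms target; a mean of 400 ms does not.
print(within_threshold([180, 210, 190]))  # True
print(within_threshold([400, 380, 420]))  # False
```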

maxrjones commented 1 year ago

Thanks for the explanation. You may have already been considering this based on having included min_zoom as a configurable parameter, but it could be cool to have a small utility that calculates the minimum recommended zoom level based on the resolution and chunking, for cases in which global visualization would be prohibitively slow but people don't want to generate pyramids.
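One way such a utility could be sketched: find the smallest zoom at which a single tile overlaps no more than a given number of chunk columns, assuming a global grid where a zoom-z tile covers `grid_width / 2**zoom` cells along x. All names here are illustrative, not part of titiler-xarray:

```python
import math

def min_recommended_zoom(grid_width, chunk_width, max_chunks_per_tile=1):
    """Smallest zoom at which one tile overlaps at most max_chunks_per_tile
    chunk columns, under the simplified model that a zoom-z tile covers
    grid_width / 2**zoom grid cells along x (boundary alignment ignored)."""
    for zoom in range(25):
        cells_covered = grid_width / 2**zoom
        if math.ceil(cells_covered / chunk_width) <= max_chunks_per_tile:
            return zoom
    return None

# For a hypothetical 1440-cell-wide grid chunked at 360 cells, single-chunk
# tiles start at zoom 2; below that, pyramids would be the recommendation.
print(min_recommended_zoom(1440, 360))  # 2
```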