developmentseed / titiler-cmr

Dynamic tiles from CMR queries
MIT License

ODD 24.4 Objective 2: titiler-cmr benchmarking + performance testing design #24

Open abarciauskas-bgse opened 1 month ago

abarciauskas-bgse commented 1 month ago

earth.gov wishes to use titiler-cmr on their production site to visualize the datasets listed in this spreadsheet. titiler-cmr has only been tested with a few basic use cases. We need to understand the limitations of the API (such as data type), document them, and propose monitoring or performance enhancements as appropriate.

Design considerations:

@vincentsarago @sharkinsspatial do you have other suggestions for what we should be testing for to make sure titiler-cmr is ready for production use?

Acceptance criteria:

abarciauskas-bgse commented 1 month ago

Looking at GIBS metrics, load is 30-50 million requests a day, which averages roughly 350-580 req/second, so on the order of 500 req/second. One idea is simply to report tile load times for the MUR SST and GPM IMERG datasets at that level of load and determine whether that is satisfactory performance for production. We should also make note of:
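To make the numbers above concrete, here is a rough Python sketch that converts the daily request counts into an average rate and summarizes tile load latency for a batch of concurrent requests. Everything named here is an assumption for illustration: `fetch` is a hypothetical callable that requests one tile, the `example.invalid` URLs are placeholders rather than real titiler-cmr endpoints, and `fake_fetch` only sleeps to stand in for a real HTTP request. A thread pool like this controls concurrency, not request rate, so a dedicated load-testing tool would still be needed to sustain hundreds of requests per second.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Sequence

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rps(requests_per_day: float) -> float:
    """Average requests per second for a daily request count."""
    return requests_per_day / SECONDS_PER_DAY


def timed(fetch: Callable[[str], object], url: str) -> float:
    """Return the wall-clock seconds that one fetch(url) call takes."""
    start = time.perf_counter()
    fetch(url)
    return time.perf_counter() - start


def benchmark(
    fetch: Callable[[str], object], urls: Sequence[str], concurrency: int = 16
) -> Dict[str, float]:
    """Fetch every URL through a thread pool and summarize the latencies."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(lambda u: timed(fetch, u), urls))
    # quantiles(n=100) returns 99 cut points and needs at least 2 samples.
    percentiles = statistics.quantiles(latencies, n=100)
    return {
        "count": len(latencies),
        "median_s": statistics.median(latencies),
        "p95_s": percentiles[94],
        "max_s": latencies[-1],
    }


# 30-50 million requests/day expressed as an average rate.
low, high = average_rps(30_000_000), average_rps(50_000_000)
print(f"average load: {low:.0f} to {high:.0f} req/s")


def fake_fetch(url: str) -> None:
    """Hypothetical stand-in for an HTTP tile request."""
    time.sleep(0.01)


# Placeholder URLs; a real run would use actual tile URLs for each dataset.
urls = [f"https://example.invalid/tiles/{i}" for i in range(20)]
summary = benchmark(fake_fetch, urls, concurrency=4)
print(summary)
```

These are daily averages; peak traffic is likely higher, so any target rate for the test should probably include some headroom.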

sharkinsspatial commented 1 month ago

@abarciauskas-bgse Given this request volume it might also be worthwhile investigating an edge caching strategy for the tiler deployment. This both

  1. Reduces our overall load/dependency on CMR.
  2. Limits the impact of datasets with poor tiler performance to a smaller set of users (depending on the cache eviction strategy).
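As a rough illustration of how an eviction strategy determines how many requests still reach the tiler (and, through it, CMR), the sketch below simulates an LRU cache over a stream of tile keys. LRU is only one possible policy, chosen here as an assumption, and the Zipf-like popularity distribution in `skewed_requests` is a hypothetical stand-in for real traffic, which we have not measured.

```python
import random
from collections import OrderedDict
from typing import Hashable, List, Sequence


def simulate_lru_hit_ratio(requests: Sequence[Hashable], capacity: int) -> float:
    """Return the fraction of requests served from an LRU cache of given size."""
    if not requests:
        return 0.0
    cache: "OrderedDict[Hashable, bool]" = OrderedDict()
    hits = 0
    for key in requests:
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used key
    return hits / len(requests)


def skewed_requests(n_requests: int, n_tiles: int, seed: int = 0) -> List[int]:
    """Generate tile ids where tile k is requested with weight 1/(k+1)."""
    rng = random.Random(seed)
    weights = [1 / rank for rank in range(1, n_tiles + 1)]
    return rng.choices(range(n_tiles), weights=weights, k=n_requests)


# Hypothetical workload: 50,000 requests spread over 10,000 distinct tiles.
stream = skewed_requests(n_requests=50_000, n_tiles=10_000)
for capacity in (100, 1_000, 5_000):
    ratio = simulate_lru_hit_ratio(stream, capacity)
    print(f"capacity={capacity:>5}  hit ratio={ratio:.2f}  reaches tiler={1 - ratio:.2f}")
```

Replaying real access logs through the same function, instead of the synthetic stream, would give a more trustworthy estimate of the miss rate the tiler has to absorb.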

For this effort I'd love to see us

  1. Understand the various data characteristics, code and configuration settings that contribute to rendering performance so that we can reduce tile rendering latency to an absolute minimum.
  2. Understand the likely tile request usage patterns for the proposed datasets so that we can plan the best edge caching and cache retention strategy.
  3. Do some rough cost analysis of edge cache usage vs naive re-rendering. Though we know that edge caching will always be the most performant solution, we don't know much about the cost implications of cache management at this scale.
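For the cost comparison in the last item, a back-of-the-envelope model might look like the sketch below. All names and prices are hypothetical placeholders, not real provider rates, and the model deliberately ignores cache storage, data transfer, and invalidation costs, which may matter at this scale.

```python
from dataclasses import dataclass


@dataclass
class CostInputs:
    """Hypothetical inputs; every value is a placeholder to be measured."""

    requests_per_day: float
    hit_ratio: float  # fraction of requests served from the edge cache
    render_cost_per_request: float  # cost of rendering one tile
    cdn_cost_per_request: float  # cost of serving one request via the edge


def daily_cost_without_cache(c: CostInputs) -> float:
    """Every request is rendered by the tiler."""
    return c.requests_per_day * c.render_cost_per_request


def daily_cost_with_cache(c: CostInputs) -> float:
    """Every request pays the edge fee; only cache misses are rendered."""
    misses = c.requests_per_day * (1 - c.hit_ratio)
    return c.requests_per_day * c.cdn_cost_per_request + misses * c.render_cost_per_request


def break_even_hit_ratio(c: CostInputs) -> float:
    """Hit ratio at which caching and naive re-rendering cost the same."""
    return c.cdn_cost_per_request / c.render_cost_per_request


# Placeholder numbers purely to exercise the functions.
inputs = CostInputs(
    requests_per_day=40_000_000,
    hit_ratio=0.8,
    render_cost_per_request=2e-6,
    cdn_cost_per_request=1e-6,
)
print("without cache:", daily_cost_without_cache(inputs))
print("with cache:   ", daily_cost_with_cache(inputs))
print("break-even hit ratio:", break_even_hit_ratio(inputs))
```

Under this simplified model, caching is cheaper only when the hit ratio exceeds the ratio of the per-request edge cost to the per-request render cost, which is why the usage-pattern analysis in item 2 feeds directly into the cost question.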