StationA / tilenol

Scalable, multi-backend geo vector tile server
MIT License

[RFC] Consider caching data per-layer rather than per-request #25

Closed jerluc closed 3 years ago

jerluc commented 4 years ago

Currently, we cache data using the request path as the key and the full response body as the value. This is very simple to implement and maintain, but it has two drawbacks:

  1. When a request to /_all/{z}/{x}/{y}.mvt is made, another call to /layer1/{z}/{x}/{y}.mvt will miss the cache, because the cache key is based purely on the request path
  2. When a request to /_all/{z}/{x}/{y}.mvt is made and a partial failure occurs, not only does the entire request fail, but none of the successful layer responses are cached, so a subsequent call has to recompute every layer rather than only the failed ones

To fix these problems, we should consider using something like {layer}/{z}/{x}/{y} as a cache key, and caching individual feature collections per layer response. Then, in the above two scenarios:

  1. When a request to /_all/{z}/{x}/{y}.mvt is made, all layer responses get cached, and another call to /layer1/{z}/{x}/{y}.mvt will hit the cache, because the cache key is based on the layer name
  2. When a request to /_all/{z}/{x}/{y}.mvt is made, and a partial failure occurs, the successful layer responses are cached, meaning a subsequent call would only have to recompute the failed layers

jerluc commented 3 years ago

As a further improvement to this logic, we may even want to include a content-based hash of the layer configuration itself in the cache key as a pseudo version number, so that layer configuration changes are picked up immediately, rather than only when the cache entry times out.
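
One way this could look, sketched in Go under some assumptions: `LayerConfig` below is a made-up stand-in for the real layer configuration, the hash is SHA-256 over its JSON encoding truncated to 8 bytes, and the `{layer}@{version}/{z}/{x}/{y}` key layout is only an illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// LayerConfig is a hypothetical stand-in for a layer's configuration.
type LayerConfig struct {
	Name    string `json:"name"`
	Source  string `json:"source"`
	MinZoom int    `json:"minzoom"`
	MaxZoom int    `json:"maxzoom"`
}

// configVersion derives a short pseudo version number from the content
// of the configuration. Struct fields are marshaled in declaration
// order, so identical configs always produce the same digest.
func configVersion(cfg LayerConfig) (string, error) {
	raw, err := json.Marshal(cfg)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:8]), nil // 8 bytes -> 16 hex characters
}

// versionedCacheKey embeds the config digest in the per-layer key, so a
// changed configuration maps to a different key for the same tile.
func versionedCacheKey(cfg LayerConfig, z, x, y int) (string, error) {
	version, err := configVersion(cfg)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%s@%s/%d/%d/%d", cfg.Name, version, z, x, y), nil
}
```

Entries written under an old configuration are never looked up again after the change and would simply age out at the usual cache timeout, so no explicit invalidation step is needed in this scheme.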