yuzutech / kroki

Creates diagrams from textual descriptions!
https://kroki.io
MIT License
2.83k stars 211 forks source link

Improve cache handling, including `ETag` #1083

Open MarcelWaldvogel opened 2 years ago

MarcelWaldvogel commented 2 years ago

This is a spin-off from #1062, trying to improve the caching.

Stated design goals

@Mogztter stated/implied the following design goals, as summarized by me:

Design goal discussion

Caching

Today, assets which only change rarely, are revved, whenever possible to achieve this refresh pattern (reuse cache unless server-side version changes). We cannot really do this here, as the client requests the document already kind of revved based on the request content.

Expires/Last-Modified interaction

With Expires or Cache-Control: max-age (maybe with must-revalidate and stale-while-revalidate/stale-if-error), the client will send an If-Modified-Since request after Expires or max-age. If the server considers the time fresh enough, it will reply with 304 Not Modified, not causing additional processing.

ETag interaction

With ETag and Cache-Control: max-age (again with options), the client will send If-None-Match and either receive the new version or 304 Not Modified. (Unlike what you said, I got the impression that Nginx and friends are using ETags when validating proxy cache entries.)

Two-level revving

Just mentioned for completeness, another option not requiring any caching would be to have all requests to /<type>/<format>/<contents> be redirected to /<server-version>/<type>/<format>/<contents> with a short TTL on the redirect and an infinite (immutable) TTL on the actual contents. However, this would double the requests and therefore the latency for new documents.

Discussion

ETag is the more modern way, but Last-Modified already gives us the same benefits. So I do not think that it is important to change this (with the exception of returning the current build date, as discussed in #1062). However, the process described there for ETag generation could be implemented to get ETag working.

frankie567 commented 8 months ago

That would be a great addition, that could help to better manage caching on client's side. Currently, Kroki correctly sends cache and Etag headers:

curl -s -D - -o /dev/null -XGET 'https://kroki.io/mermaid/png/xxx'
HTTP/2 200 
content-type: image/png
content-length: 23677
expires: Sun, 24 Dec 2023 09:12:10 GMT
last-modified: Sat, 23 Sep 2023 16:57:28 GMT
cache-control: public, max-age=432000
etag: 10.4.0EA5i-8SRRcB14ZFb7MIK5m00

But it doesn't seem to consider conditional requests header like If-None-Match:

curl -s -D - -o /dev/null -XGET -H 'If-None-Match: 10.4.0EA5i-8SRRcB14ZFb7MIK5m00' 'https://kroki.io/mermaid/png/xxx'
HTTP/2 200 
date: Tue, 19 Dec 2023 09:13:56 GMT
content-type: image/png
content-length: 23677
expires: Sun, 24 Dec 2023 09:13:56 GMT
last-modified: Sat, 23 Sep 2023 16:57:28 GMT
cache-control: public, max-age=432000
etag: 10.4.0EA5i-8SRRcB14ZFb7MIK5m00

I would expect in this case the server to give a 304 Not Modified response, letting know the client the output hasn't changed; while saving the resources of the Kroki server by not re-generating the diagram.