Malvoz closed this issue 1 year ago
Thanks for your suggestion, @Malvoz!
I would like to raise this again because the header provides finer control than expires.
Cache-Control is already used by Apache with the ExpiresXX directives.
Also with the addition of the immutable directive
Generally I'm in favor of following new standards, but here I'm concerned by the potential downsides of the immutable directive:
- ... we need to use Header, which comes with room for bad configurations.
- Real infinite caching without revalidation must be used with care: it should not be used without SSL/TLS, and it must be used with really-definitive or really-well-managed files. The user must know what it means. This is not so trivial.
Yes good catch, it could be commented out with notes on TLS/SSL. The web is moving towards an "HTTPS first" web and there are other HTTP header fields that indeed require HTTPS. I would be surprised if H5BP does not move to an HTTPS-first approach in the future with HTTP configurations commented out instead.
immutable is relatively new and support for it does not cover all major browsers yet. But the fact that cache-control keeps gaining new directives speaks, IMO, in its favor.
Are there equivalent approaches of expires to all cache-control's directives?
Are there equivalent approaches of expires to all cache-control's directives?
No, but the Expires header is added by Apache automatically for backward compatibility only.
Here's also a nice read about Expires header vs Cache-Control and why Expires header is deprecated... https://www.fastly.com/blog/headers-we-dont-want
Just to be clear here: Cache-Control is already the preference in the config. Expires is added by Apache, not explicitly by the config.
I guess I was confused by "Expires" and the "ExpiresActive" setting...
@LeoColomb
Generally I'm in favor of following new standards, but here I'm concerned by the potential downsides of the immutable directive:
... we need to use Header, which comes with room for bad configurations.
The only benefit I see using mod_expires is that you can ExpiresByType <media type> which seems impossible using cache-control? Instead you need to FilesMatch every potential file which may be error prone. Is that what you are referring to?
... Must be used with really-definitive or really-well-managed files, the user must know what it means. This is not so trivial.
So unless I'm aware of that fact (I realize there is a note on this), this is already an issue with:
Self quote:
The only benefit I see using mod_expires is that you can ExpiresByType <media type> which seems impossible using cache-control? Instead you need to FilesMatch every potential file which may be error prone.
Maybe you could do something like:
Header set Cache-Control "<VALUE>" "expr=%{CONTENT_TYPE} =~ m#<MEDIA TYPE>|<MEDIA TYPE>#"
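For illustration, the placeholders above might be filled in as follows. This is only a sketch: the one-year max-age and the two media types are assumed example values, not ones agreed on in this thread, and the condition only matches if the server actually sends those Content-Type values.

```apache
<IfModule mod_headers.c>
    # Assumed example values: cache CSS and JavaScript responses for one year.
    # The expression is evaluated against the Content-Type of the response,
    # so it depends on how the server maps file extensions to media types.
    Header set Cache-Control "max-age=31536000" "expr=%{CONTENT_TYPE} =~ m#text/css|text/javascript#i"
</IfModule>
```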
Now that this issue is about immutable, I've been looking into filename-based_cache_busting.conf and there are things I suggest addressing:
The snippet currently does not set any directives for caching, which is kind of the point of versioning files. The optimal strategy would be to serve these files with both immutable and a long max-age as fallback for browsers that don't understand the immutable directive.
This advice is quite outdated:
In 2008 Steve Souders wrote about Squid not caching resources with query string parameters. But it's been around 10 years since Squid changed that behavior: http://www.squid-cache.org/Versions/v2/2.7/RELEASENOTES.html#s1
The default rules to not cache dynamic content from cgi-bin and query URLs have been altered. Previously, the "cache" ACL was used to mark requests as non-cachable - this is enforced even on dynamic content which returns cachability information. This has changed in Squid-2.7 to use the default refresh pattern. Dynamic content is now cached if it is marked as cachable [...]
Friendly bump :)
The immutable directive is really beneficial in terms of performance. More info on that here:
And it's backwards compatible: browsers that don't understand it just ignore it and use max-age instead.
Perhaps we can set an environment variable at:
and respond to requests within that environment with:
<IfModule mod_headers.c>
Header merge Cache-Control "immutable, max-age=31536000"
</IfModule>
Now, I'm not comfortable with apache env variables so if you agree with this, you can PR or help me set it up :)
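A rough, untested sketch of what the environment-variable idea could look like in .htaccess context. The variable name VERSIONED is hypothetical, and the rewrite rule is a shortened stand-in for the one in filename-based_cache_busting.conf, not a copy of it. One assumption worth checking: in per-directory context a rewrite triggers an internal redirect, after which Apache exposes the variable with a REDIRECT_ prefix, which is why the Header condition tests REDIRECT_VERSIONED.

```apache
<IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteCond %{REQUEST_FILENAME} !-f
    # Hypothetical variable VERSIONED: set only when a versioned file name
    # such as /css/style.12345.css is rewritten to /css/style.css.
    RewriteRule ^(.+)\.(\w+)\.(css|m?js)$ $1.$3 [E=VERSIONED:1,L]
</IfModule>

<IfModule mod_headers.c>
    # After the internal redirect the variable is expected to appear as
    # REDIRECT_VERSIONED (assumption, see the note above).
    Header merge Cache-Control "immutable, max-age=31536000" env=REDIRECT_VERSIONED
</IfModule>
```

In server or virtual-host context there is no internal redirect for such a rewrite, so the condition would presumably need to be env=VERSIONED there.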
Thanks @Malvoz. I'm ready to go. Thoughts @XhmikosR?
OK, we can start thinking of an implementation.
Webhint suggests the following:
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Where needed add `immutable` value to the `Cache-Control` header
<IfModule mod_headers.c>
# Because `mod_headers` cannot match based on the content-type,
# the following workaround needs to be done.
# 1) Add the `immutable` value to the `Cache-Control` header
# to all resources.
Header merge Cache-Control immutable
# 2) Remove the value for all resources that shouldn't be have it.
<FilesMatch "\.(appcache|cur|geojson|ico|json(ld)?|x?html?|topojson|xml)$">
Header edit Cache-Control immutable ""
</FilesMatch>
</IfModule>
As we already did with other conditional headers (and as you suggested), we may use MIME type expressions instead.
Header merge Cache-Control "immutable" "expr=%{CONTENT_TYPE} =~ m#<MEDIA TYPE>|<MEDIA TYPE>#"
Perhaps we can set an environment variable at
I don't feel comfortable adding environment variables. They are hard to understand when they are evaluated, and hard to debug.
- This advice [cache-busting with hash in filenames] is quite outdated
This is a different issue, but you are right. That said, it can be hard to have a strong configuration on proxies or CDNs when using query strings. To be honest I don't have any precise opinion on this, except that webpack still uses the hash-in-name template by default, if I'm correct.
Webhint suggests the following:
As an aside, I've already opened an issue at webhint about Apache's ability to match based on content-type. The example also seems to have syntax errors, and they should use a long max-age as fallback too; I can take these things up with them.
In the following example, I'm matching against every file that is not text/html and has v= in a query string.
Header set Cache-Control "max-age=31536000, immutable" "expr=%{QUERY_STRING} =~ m#v\=#i && %{CONTENT_TYPE} !~ m#text/html#i"
This would match e.g. /app.css?v=1.0.0.
To meet your want/requirement of having file-name based matching, can we then just apply some regex for %{REQUEST_FILENAME} instead of %{QUERY_STRING} to the example above?
This advice [cache-busting with hash in filenames] is quite outdated
This is a different issue, but you are right. That said, it can be hard to have a strong configuration on proxies or CDNs when using query strings.
I have yet to find any up-to-date sources verifying that proxies/CDNs have issues with query strings in the modern web (again, Squid introduced caching of query strings as a default around 2008). But perhaps I haven't searched hard enough. ^^
In the following example
Let's start with MIME-type only first. We'll see cache busting later.
And I think we should prefer merging over setting Cache-Control header to add the immutable attribute.
But perhaps I haven't searched hard enough.
Lack of feature or correctness is never documented. 😆
I think we should prefer merging over setting Cache-Control header to add the immutable attribute.
I overlooked that in the example. However, I don't think merge is good enough either. From Section 2.1 of RFC 8246:
[...] proxies SHOULD skip conditionally revalidating fresh responses containing the immutable extension unless there is a signal from the client that a validation is necessary (e.g., a no-cache Cache-Control request directive defined in Section 5.2.1.4 of [RFC7234]).
Although I don't know why a developer would, if a developer uses no-cache or perhaps no-store with versioned files, then immutable (and max-age) would be ignored.
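To make the difference concrete, here is a small sketch of how merge and set behave when a response already carries Cache-Control: no-cache. The pre-existing header is an assumption made for illustration (for example, set by an earlier Header directive), and the two options are alternatives, not meant to be combined.

```apache
<IfModule mod_headers.c>
    # Assumed starting point: the response already has
    #   Cache-Control: no-cache

    # Option A: merge appends to the existing value, giving
    #   Cache-Control: no-cache, immutable
    # so no-cache is still present alongside immutable.
    Header merge Cache-Control "immutable"

    # Option B (alternative to A): set replaces the existing value, giving
    #   Cache-Control: max-age=31536000, immutable
    # Header set Cache-Control "max-age=31536000, immutable"
</IfModule>
```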
Revisiting this; reusing the same file extensions as used in filename-based_cache_busting.conf (except for .webmanifest, since it shouldn't be versioned) to match the same cache-busting pattern:
<IfModule mod_headers.c>
Header set Cache-Control "max-age=31536000, immutable" "expr=%{REQUEST_URI} =~ m#^(.+)\.(\w+)\.(bmp|css|cur|gif|ico|jpe?g|m?js|a?png|svgz?|webp)$#i"
</IfModule>
/cc @LeoColomb
A self-reminder to look into this more: while the example above would make sure that other directives (such as no-cache and no-store) are overridden for versioned files per the regex, which is necessary to preserve the behavior of long max-age and immutable (as described in https://github.com/h5bp/server-configs-apache/issues/148#issuecomment-519946513), this would also override no-transform, which it shouldn't...
Q: do transcoding intermediaries (proxies and others) only require Cache-Control to be sent for the document (text/html)? If so then this is not an issue, as immutable shouldn't be specified for HTML resources (and the proposed regex doesn't look for HTML).
Not sure if answer lies somewhere in
https://www.w3.org/TR/ct-landscape/
https://www.w3.org/TR/ct-guidelines/
https://support.google.com/webmasters/answer/6211428?hl=en says (emphasis mine):
Opting out of Web Light
If you do not want your pages to be transcoded, set the HTTP header "Cache-Control: no-transform" in your page response. If Googlebot sees this header, your page will not be transcoded.
Edit: I guess this could be solved by proper ordering in .htaccess, setting the Header merge of Cache-Control: no-transform after immutable... @LeoColomb is ordering of config snippets bad to rely on? Does H5BP do that already?
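A minimal sketch of the ordering idea, assuming both directives live in the same configuration section so that they are applied in the order written. The condition is a shortened copy of the regex proposed above and has not been tested here against the cache-busting rewrite.

```apache
<IfModule mod_headers.c>
    # 1) Replace any existing Cache-Control value on versioned files.
    Header set Cache-Control "max-age=31536000, immutable" "expr=%{REQUEST_URI} =~ m#^(.+)\.(\w+)\.(css|m?js)$#i"

    # 2) Then append no-transform to every response. On versioned files the
    #    expected result is: max-age=31536000, immutable, no-transform
    Header merge Cache-Control "no-transform"
</IfModule>
```

If the two directives end up in different sections or files, the effective order depends on how Apache merges those sections, which is the fragility the question above is getting at.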
Does H5BP do that already?
In a way to get things working yes, but the perfect order is mostly impossible. Anyway, we can review the order if it helps.
The cache-control header (which takes precedence over expires if present) has been asked about before in #85 and #73.
I would like to raise this again because the header provides finer control than expires. Also with the addition of the immutable directive (see blog posts 1, 2, 3), we get a performance benefit but also no longer have to set long max-age directives for infinite caching.