napalm-automation / napalm-automation.github.io

The NAPALM website
https://napalm-automation.net

Too aggressive cache control #37

Open ogenstad opened 7 years ago

ogenstad commented 7 years ago

The caching on the site is set quite high; it looks like all content is stored in the browser for over a week. I don't know what would be optimal. Currently the pages need a hard reload in order for users to see new content.

I think at least for the root https://napalm-automation.net and https://napalm-automation.net/news it could be set a lot lower. Users shouldn't need to use Command (or Control) + R to see new posts. For things under /assets it could be set a lot higher, especially for the JavaScript and CSS files, as there's some kind of cache busting for those when needed (https://github.com/napalm-automation/napalm-automation.github.io/blob/master/_data/cache.yml).

What options do we have to set different expirations to different sections of the site?

~  ᐅ curl -svo /dev/null https://napalm-automation.net
* Rebuilt URL to: https://napalm-automation.net/
*   Trying 104.20.63.104...
* TCP_NODELAY set
* Connected to napalm-automation.net (104.20.63.104) port 443 (#0)
* TLS 1.2 connection using TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
* Server certificate: ssl748784.cloudflaressl.com
* Server certificate: COMODO ECC Domain Validation Secure Server CA 2
* Server certificate: COMODO ECC Certification Authority
> GET / HTTP/1.1
> Host: napalm-automation.net
> User-Agent: curl/7.51.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Fri, 21 Jul 2017 05:16:50 GMT
< Content-Type: text/html; charset=utf-8
< Transfer-Encoding: chunked
< Connection: keep-alive
< Set-Cookie: __cfduid=d396ffb03b2775cd7d74cc2b222c2141f1500614210; expires=Sat, 21-Jul-18 05:16:50 GMT; path=/; domain=.napalm-automation.net; HttpOnly
< Last-Modified: Thu, 20 Jul 2017 19:04:10 GMT
< Access-Control-Allow-Origin: *
< Expires: Sat, 29 Jul 2017 05:16:50 GMT
< Cache-Control: public, max-age=691200
< X-GitHub-Request-Id: F276:6626:32F3F79:45A84F9:59710412
< Via: 1.1 varnish
< X-Served-By: cache-bma7034-BMA
< X-Cache: MISS
< X-Cache-Hits: 0
< X-Timer: S1500578835.855828,VS0,VE111
< Vary: Accept-Encoding
< X-Fastly-Request-ID: 00770c5b53f14a4f05e648d3d54c16f36c5357dc
< CF-Cache-Status: HIT
< Server: cloudflare-nginx
< CF-RAY: 381bb0c26d353d31-CPH
<
{ [570 bytes data]
* Curl_http_done: called premature == 0
* Connection #0 to host napalm-automation.net left intact
~  ᐅ
mirceaulinic commented 7 years ago

Yes, I have configured Browser cache and Edge cache to 8 days, at the highest caching level. Let me know if you want to decrease (and how much), and, of course, we can have separate caching instructions per URL patterns.
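
These per-URL rules are just Cloudflare Page Rules, configurable from the dashboard or via the API, roughly along these lines (the zone id, credentials and the 8-day TTL below are placeholders, not our actual settings):

# Sketch only: create a Page Rule that caches everything under /assets/
# with an 8-day browser TTL (691200 seconds). Variable names are placeholders.
curl -s -X POST "https://api.cloudflare.com/client/v4/zones/$CF_ZONE_ID/pagerules" \
     -H "X-Auth-Email: $CF_EMAIL" \
     -H "X-Auth-Key: $CF_API_KEY" \
     -H "Content-Type: application/json" \
     --data '{
       "targets": [{"target": "url", "constraint": {"operator": "matches", "value": "napalm-automation.net/assets/*"}}],
       "actions": [{"id": "cache_level", "value": "cache_everything"},
                   {"id": "browser_cache_ttl", "value": 691200}],
       "status": "active"
     }'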

ogenstad commented 7 years ago

I guess it mostly comes down to how often someone writes something for the site. But if new information is added and people are linked to the site, I don't think they will expect to have to refresh the page to see it. For blog posts it won't matter, though, since readers are linked directly to a new page.

What do you think about which settings to use?

dbarrosop commented 7 years ago

What we need is to proactively purge cached content every time we merge something into master. Can we do that instead?

ogenstad commented 7 years ago

We already do that; the job on Travis purges the cache when something is committed to master.

However, that is the cache at Cloudflare. I'm talking about the local cache in the browser. The only way to purge that is to use a different URL, as is done when the CSS and JavaScript files change. Other than that, the only option would be to set a lower expiration time. This is of course only a problem for returning visitors.
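
For reference, that Cloudflare purge boils down to a single API call, roughly like this (the variable names are placeholders, not what the Travis job actually uses):

# Sketch of a "purge everything" call against the Cloudflare v4 API;
# variable names are placeholders, not the actual Travis configuration.
curl -s -X POST "https://api.cloudflare.com/client/v4/zones/$CF_ZONE_ID/purge_cache" \
     -H "X-Auth-Email: $CF_EMAIL" \
     -H "X-Auth-Key: $CF_API_KEY" \
     -H "Content-Type: application/json" \
     --data '{"purge_everything": true}'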

dbarrosop commented 7 years ago

Can we just disable the browser cache? I thought having to ctrl-shift-R was dead.

mirceaulinic commented 7 years ago

Okay, browser cache disabled...

ogenstad commented 7 years ago

Actually, I think the browser cache should be enabled for the /assets and /images folders. For /images it could probably be two hours, just so that pages load faster when someone is browsing between them. For /assets it could be much longer, since we have another way of purging those; it could be a week like it is now, or even a month.

mirceaulinic commented 7 years ago

Then we can add a simple rule to cache these paths for X days / weeks / months. Just let me know whatever you decide :-)

ogenstad commented 7 years ago

I would say:

1 day for /images
2 months for /assets
No cache for everything else.
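
Once the rules are in place, something like this could verify the headers per section (the /images path is only an example, not necessarily an existing file):

# Quick checks of the Cache-Control header per section, assuming the split above.
curl -sI https://napalm-automation.net/images/example.png | grep -i cache-control    # expect max-age=86400 (1 day)
curl -sI https://napalm-automation.net/assets/css/napalm.css | grep -i cache-control # expect roughly 2 months
curl -sI https://napalm-automation.net/news | grep -i cache-control                  # expect no browser caching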

dbarrosop commented 7 years ago

2 months? I am completely against telling the browser what to do with their cache, but if we are going to do it let's do it in a sane manner and not have more than 24 hours. 2 months for /assets would mean we break posts that rely on minor changes to the js or the css files.

ogenstad commented 7 years ago

The caching is done by URL, and the code appends an id to the files under the assets directory, i.e. like this:

<link rel="stylesheet" href="/assets/plugins/bootstrap/css/bootstrap.min.css?cache_id=201707131435">
<link rel="stylesheet" href="/assets/css/napalm.css?cache_id=201707131435">
<link rel="stylesheet" href="/assets/plugins/animate.css?cache_id=201707131435">
<link rel="stylesheet" href="/assets/plugins/line-icons/line-icons.css?cache_id=201707131435">
<link rel="stylesheet" href="/assets/plugins/font-awesome/css/font-awesome.min.css?cache_id=201707131435">
<link rel="stylesheet" href="/assets/plugins/owl-carousel/owl-carousel/owl.carousel.css?cache_id=201707131435">
<link rel="stylesheet" href="/assets/plugins/layer-slider/layerslider/css/layerslider.css?cache_id=201707131435">

Currently this comes from the https://github.com/napalm-automation/napalm-automation.github.io/blob/master/_data/cache.yml file, so it would require that the id is changed. This could be handled automatically in the future with jekyll-assets, but currently GitHub Pages doesn't support that plugin.

So even if the cache would be set to 2 months it would be easy to purge it if there was a need.

From Google's PageSpeed Insights documentation https://developers.google.com/speed/docs/insights/LeverageBrowserCaching

We recommend a minimum cache time of one week and preferably up to one year for static assets, or assets that change infrequently. If you need precise control over when resources are invalidated we recommend using a URL fingerprinting or versioning technique - see invalidating and updating cached responses link above.

I am completely against telling the browser what to do with their cache

I guess this comes down to taste or politics, but I only see it as a way of speeding up the user experience.

dbarrosop commented 7 years ago

Ok, makes sense. That means we can set the expiration time by URL on the CDN side as well; we don't have to limit it to the local browser cache. We could even have Travis update that value if there is a change inside any of those folders. This should work:

[ ! -z "$(git diff master... -- assets images)" ] && code_to_update_the_cache_id_goes_here

ogenstad commented 7 years ago

That means we can set the expiration time by URL on the CDN side as well, we don't have to limit it to the local browser cache.

Not quite sure what you mean by this? :)

code_to_update_the_cache_id_goes_here

Would this involve updating the cache.yml file and pushing to master / sending a PR, or are you thinking about something else?

dbarrosop commented 7 years ago

Not quite sure what you mean by this? :)

I meant this: " I have configured Browser cache and Edge cache to 8 days". If we can do versioning and purge content we can have higher times, we don't have to limit ourselves to the local cache of the browser.

Would this involve updating the cache.yml file and pushing to master / sending a PR, or are you thinking about something else?

Yeah, it could be a simple shell script that calculates a new timestamp and pushes the change automatically. No need to create a PR. We just have to make sure we don't start an endless loop of "push cache_id", "trigger CI", "push cache_id", "trigger CI"...
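
Something along these lines could work, assuming cache.yml holds a single cache_id value and that Travis can push back to the repo (both are assumptions here):

# Sketch only: bump the cache_id when /assets or /images changed, and use
# "[ci skip]" in the commit message so Travis doesn't re-trigger itself.
if [ -n "$(git diff master... -- assets images)" ]; then
    echo "cache_id: $(date +%Y%m%d%H%M)" > _data/cache.yml
    git add _data/cache.yml
    git commit -m "Bump cache_id [ci skip]"
    git push origin master
fi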

ogenstad commented 7 years ago

So should we just raise the cache age for /images and /assets again and leave the rest of the site as is?

Then we can handle that shell script as another issue.

dbarrosop commented 7 years ago

LGTM