cncf / cncf.io

☁️♮🏛🚧 The CNCF.io WordPress website
https://cncf.io
MIT License

DDOS prep for CNCF and LFEvents #729

Closed cjyabraham closed 1 year ago

cjyabraham commented 1 year ago

In the wake of the DDOS attack on the LF Projects sites last week, we should harden our CNCF and LFEvents sites to make sure we're prepared for any attack. LF IT has done a thorough analysis of the event and left some notes here. The gist is getting the CDN to serve over 95% of requests. They recommend implementing the following header, which is more sophisticated than what we currently have.

Cache-Control: public, max-age=60, s-maxage=3600, stale-while-revalidate=86400, stale-if-error=604800
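To make the directives above concrete, here is a minimal Python sketch of how a shared cache such as a CDN might classify a cached page by its age. The function names and state labels are hypothetical, and the logic is a simplified reading of the standard Cache-Control semantics, not a description of how Pantheon's Global CDN is actually implemented.

```python
def parse_cache_control(value):
    """Parse a Cache-Control header value into a dict of directives."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        name = name.strip().lower()
        arg = arg.strip().strip('"')
        # Numeric arguments become ints; flag directives (e.g. "public") become True.
        directives[name] = int(arg) if arg.isdigit() else (arg or True)
    return directives


def shared_cache_state(directives, age, origin_ok=True):
    """Classify an object of the given age (seconds) from a shared cache's view.

    Simplified model: state labels are illustrative, not CDN terminology.
    """
    # Shared caches prefer s-maxage and fall back to max-age.
    ttl = directives.get("s-maxage", directives.get("max-age", 0))
    if age < ttl:
        return "fresh"
    staleness = age - ttl
    if staleness < directives.get("stale-while-revalidate", 0):
        return "stale-while-revalidate"  # serve stale, refresh in the background
    if not origin_ok and staleness < directives.get("stale-if-error", 0):
        return "stale-if-error"  # origin is failing, keep serving the stale copy
    return "expired"  # must go back to the origin


header = ("public, max-age=60, s-maxage=3600, "
          "stale-while-revalidate=86400, stale-if-error=604800")
d = parse_cache_control(header)
print(shared_cache_state(d, age=100))                      # within s-maxage
print(shared_cache_state(d, age=4000))                     # inside the revalidate window
print(shared_cache_state(d, age=93600, origin_ok=False))   # origin down, inside the error window
```

Under this model the CDN can keep answering from cache for up to a week after expiry while the origin is failing, which is the property that matters when the PHP boxes are under load.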

They also note that Cerber or Wordfence are not needed and we're better off without them. In fact, it was the Cerber plugin that caused a massive slowdown during the LF Projects attack. Our goal here is to keep the CDN cache strong even when our PHP boxes are getting pounded with DDOS requests, and those plugins can't help with that.

cjyabraham commented 1 year ago

Some observations and questions:
1) Is https://www.lfasiallc.cn/ still served by the Alibaba mirror? It doesn't look like it is anymore. In that case, should we set a canonical redirect from it to the https://www.lfasiallc.com/ URL so we're not splitting SEO juice, serving duplicate content, etc.?
2) While s-maxage isn't explicitly set, my understanding is that it should take the max-age value, in this case 18000. So I doubt changing those headers further would really increase the cache rate. I agree, however, that it'd be better to set each independently so that we could have a shorter expiration for remote caches.
3) Does the Pantheon Global CDN even recognize s-maxage? They are currently getting me a full answer on this.
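The fallback described in point 2 can be sketched in a few lines of Python. This is an illustration only: `effective_ttls` is a hypothetical helper, and the first header string is an assumption built around the max-age=18000 value mentioned above rather than the exact header the site was sending.

```python
import re


def effective_ttls(cache_control):
    """Return (browser_ttl, shared_cache_ttl) in seconds for a Cache-Control value.

    Shared caches use s-maxage when present and otherwise fall back to max-age.
    Either value is None if the header sets neither directive.
    """
    found = dict(re.findall(r"(s-maxage|max-age)\s*=\s*(\d+)", cache_control.lower()))
    browser = int(found["max-age"]) if "max-age" in found else None
    shared = int(found["s-maxage"]) if "s-maxage" in found else browser
    return browser, shared


# Assumed shape of the existing header: only max-age is set.
print(effective_ttls("public, max-age=18000"))
# Recommended header: browser and CDN lifetimes are set independently.
print(effective_ttls("public, max-age=60, s-maxage=3600, "
                     "stale-while-revalidate=86400, stale-if-error=604800"))
```

With only max-age set, both lifetimes are the same, which is why splitting them mainly buys flexibility rather than a higher hit rate.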

cjyabraham commented 1 year ago

According to Pantheon, the Global CDN does recognize s-maxage. This is documented in this Fastly article.

And here is documentation on setting custom headers on WordPress. See the WordPress tab here: https://docs.pantheon.io/cache-control

cjyabraham commented 1 year ago

This has been deployed for CNCF.io. After deployment there was no change in PageSpeed scores or in the caching of secondary page assets. Only the page headers for the main document have changed, as desired. We should wait a few weeks to assess the cache hit ratio changes.

cjyabraham commented 1 year ago

Events sites have been deployed.

cjyabraham commented 1 year ago

New cache headers are deployed and things seem stable. The headers appear to have had no effect on the cache hit ratios. It'd be good to look into how we can increase those on the events sites. We should also observe how the cache hit ratios are affected during periods of high traffic, such as during a KubeCon. Closing this issue as we can do that separately.
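For the follow-up measurement, a minimal Python sketch of the arithmetic behind the 95% target may help. The function names and all request counts below are hypothetical; real figures would come from the CDN's own reporting.

```python
def cache_hit_ratio(hits, misses):
    """Fraction of requests answered by the CDN without touching the origin."""
    total = hits + misses
    return hits / total if total else 0.0


def meets_target(hits, misses, target=0.95):
    """True if the hit ratio reaches the target (95% in the LF IT notes)."""
    return cache_hit_ratio(hits, misses) >= target


def origin_request_rate(request_rate, hits, misses):
    """Requests per second that still reach the origin at the observed ratio."""
    total = hits + misses
    return request_rate * misses / total if total else float(request_rate)


# Hypothetical sample: 950 hits and 50 misses out of 1000 requests.
print(cache_hit_ratio(950, 50))
print(meets_target(950, 50))
# Hypothetical attack of 10,000 requests/second at that ratio.
print(origin_request_rate(10_000, 950, 50))
```

The last function shows why the ratio matters during an attack: every point of hit ratio lost translates directly into extra load on the PHP boxes.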