spiritedmedia / systems

Code and documentation for building, deploying, and serving code.

Add the ability to filter query args from the cache key to improve the cache HIT rate #54

Closed kingkool68 closed 5 years ago

kingkool68 commented 5 years ago

There are certain query string arguments we want to ignore in terms of caching. For example, for this URL https://theincline.com/2018/07/18/this-state-senator-wants-pennsylvania-to-ban-the-outdated-gay-and-trans-panic-defense/?mc_cid=5035471bd9&mc_eid=b31feb6b33 we can ignore the mc_cid and mc_eid query arguments, since they don't affect how the page is rendered.

To do this we need to preprocess the cache key. The SRCache nginx module even has a section in the README about preprocessing the cache key: https://github.com/openresty/srcache-nginx-module#cache-key-preprocessing

This is the nginx part of https://github.com/spiritedmedia/spiritedmedia/issues/2804

What query strings can we ignore?

Looking at Google Analytics can show us which query args are most frequently used:

We definitely can't ignore the search query, s, as that modifies what is rendered on the page.
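
As a rough illustration of the idea (not part of the nginx setup), the filtering can be modeled in Python. This is only a sketch: `IGNORED_ARGS` and `strip_ignored_args` are hypothetical names, and the list of args is copied from the Lua block in the configuration later in this issue.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Args that do not change the rendered page (same list the Lua block
# in the nginx config sets to nil). Hypothetical name, for illustration.
IGNORED_ARGS = {
    "refresh", "mc_cid", "mc_eid", "fbclid",
    "utm_source", "utm_medium", "utm_campaign",
}


def strip_ignored_args(url: str) -> str:
    """Return url without the ignored query args; other args keep their order."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in IGNORED_ARGS
    ]
    # urlencode re-encodes the surviving args; an empty list yields no "?".
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    print(strip_ignored_args("https://theincline.com/?mc_cid=5035471bd9&mc_eid=b31feb6b33"))
    print(strip_ignored_args("https://theincline.com/?s=transit&utm_source=newsletter"))
```

The search arg s survives the filter, so search result pages still get their own cache entries.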

Update /etc/nginx/sites-available/<domain-name>

We need to extract the path from the $request_uri variable and store it in a new variable. nginx doesn't allow modifying the $request_uri variable directly (which makes total sense).

The <domain-name> value is different depending on the environment: spiritedmedia.dev, staging.spiritedmedia.com, or spiritedmedia.com. This code needs to be run in the http context.

# Extract the path from $request_uri and store it in a new variable $request_uri_path
# See https://stackoverflow.com/a/43749234/1119655
map $request_uri $request_uri_path {
  "~^(?P<path>[^?]*)(\?.*)?$"  $path;
}
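
To check what the map captures, the same pattern can be tried outside nginx. The sketch below assumes this particular expression behaves the same in Python's re module as in the PCRE engine nginx uses; `request_uri_path` is a hypothetical helper name.

```python
import re

# Same pattern as the nginx map: everything before the first "?" is the path.
REQUEST_URI_PATH = re.compile(r"^(?P<path>[^?]*)(\?.*)?$")


def request_uri_path(request_uri: str) -> str:
    """Return the path part of a raw request URI, mimicking the map block."""
    match = REQUEST_URI_PATH.match(request_uri)
    # An nginx map without a default yields an empty string when nothing matches.
    return match.group("path") if match else ""


if __name__ == "__main__":
    print(request_uri_path("/2018/07/18/some-post/?mc_cid=5035471bd9&mc_eid=b31feb6b33"))
    print(request_uri_path("/about/"))
```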

Update /etc/nginx/common/redis-php7-modified.conf

This is the new configuration so we can ignore certain query strings.

# Modified Redis NGINX CONFIGURATION
set $skip_cache 0;
# POST requests should always go to PHP
if ($request_method = POST) {
  set $skip_cache 1;
}

# We want to cache requests with query strings
#if ($query_string != "") {
#  set $skip_cache 1;
#}

# Don't cache URL containing the following segments
if ($request_uri ~* "(/wp-admin/|wp-.*.php|index.php|sitemap(_index)?.xml|[a-z0-9_-]+-sitemap([0-9]+)?.xml)") {
  set $skip_cache 1;
}

# Don't use the cache for logged in users or recent commenter or customer with items in cart
if ($http_cookie ~* "comment_author|wordpress_[a-f0-9]+|wp-postpass|wordpress_no_cache|wordpress_logged_in|woocommerce_items_in_cart") {
  set $skip_cache 1;
}

# Strip certain query args by setting them to null so they can be ignored in the cache key 
# Improves our cache HIT rate
rewrite_by_lua '
  local args = ngx.req.get_uri_args()
  args.refresh = nil
  args.mc_cid = nil
  args.mc_eid = nil
  args.fbclid = nil
  args.utm_source = nil
  args.utm_medium = nil
  args.utm_campaign = nil
  ngx.req.set_uri_args(args)
';

# Use the cached or actual file if it exists; otherwise pass the request to WordPress
location / {
  try_files $uri $uri/ /index.php?$args;
}

location /redis-fetch {
    internal;
    set  $redis_key $args;
    redis_pass  redis;
}

location /redis-store {
    internal;
    set_unescape_uri $key $arg_key ;
    redis2_query  set $key $echo_request_body;
    redis2_query expire $key 14400;
    redis2_pass  redis;
}

location ~ \.php$ {
  # Force $request_uri_path to lowercase
  # See https://stackoverflow.com/a/48060609/1119655
  set_by_lua $request_uri_path "return ngx.arg[1]:lower()" $request_uri_path;  

  # $request_uri_path comes from the map block defined in the http context
  set $key "nginx-cache:$scheme$request_method$host$request_uri_path?$args";
  try_files $uri =404;

  srcache_fetch_skip $skip_cache;
  srcache_store_skip $skip_cache;

  srcache_response_cache_control off;

  set_escape_uri $escaped_key $key;

  srcache_fetch GET /redis-fetch $key;
  srcache_store PUT /redis-store key=$escaped_key;

  more_set_headers 'X-SRCache-Cache-Key $key';
  more_set_headers 'X-SRCache-Fetch-Status $srcache_fetch_status';
  more_set_headers 'X-SRCache-Store-Status $srcache_store_status';

  include fastcgi_params;
  fastcgi_pass php7;
}
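
To make the effect on the cache key concrete, here is a small Python model of how $key is assembled in the PHP location. It is a sketch under assumptions: `cache_key` is a hypothetical helper, the `args` parameter stands for $args after the Lua block has removed the ignored args, and Python's `lower()` is used in place of Lua's (the two agree for ASCII paths).

```python
def cache_key(scheme: str, method: str, host: str, request_uri: str, args: str) -> str:
    """Model of: nginx-cache:$scheme$request_method$host$request_uri_path?$args"""
    # $request_uri_path: the part of $request_uri before "?", forced to lowercase.
    path = request_uri.split("?", 1)[0].lower()
    return f"nginx-cache:{scheme}{method}{host}{path}?{args}"


if __name__ == "__main__":
    # Two requests that differ only in ignored args: $args is empty for both
    # once mc_cid and mc_eid have been stripped.
    plain = cache_key("https", "GET", "theincline.com", "/2018/07/18/some-post/", "")
    tracked = cache_key(
        "https", "GET", "theincline.com",
        "/2018/07/18/some-post/?mc_cid=5035471bd9&mc_eid=b31feb6b33", "",
    )
    print(plain)
    print(plain == tracked)
```

Both requests map to the same key, so the second one can be served from the entry stored by the first. The X-SRCache-Cache-Key response header set in the config shows the real key for comparison.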