keycdn / cache-enabler

A lightweight caching plugin for WordPress that makes your website faster by generating static HTML files.
https://wordpress.org/plugins/cache-enabler/

Exclusion does not work #244

Closed by tobias992 3 years ago

tobias992 commented 3 years ago

Hi, on the settings page I see that "ref" arguments are excluded by default, but it does not work. Adding it manually does not work either. For me, it's not possible to exclude the ref argument, so other plugins don't work.

Maybe you can check this.

coreykn commented 3 years ago

I'm assuming you're referring to the Query Strings exclusion. The default regular expression says not to bypass the cache for a query string that starts with ref. When running a few tests on version 1.7.2 I'm unable to replicate any issues:

# do not bypass cache
curl -s -D - -o /dev/null https://www.example.com
HTTP/2 200
server: nginx/1.17.10
date: Fri, 04 Jun 2021 17:47:13 GMT
content-type: text/html; charset=UTF-8
vary: Accept-Encoding
x-cache-handler: cache-enabler-engine

# bypass cache
curl -s -D - -o /dev/null https://www.example.com/?query=string
HTTP/2 200
server: nginx/1.17.10
date: Fri, 04 Jun 2021 17:47:33 GMT
content-type: text/html; charset=UTF-8
vary: Accept-Encoding
link: <https://www.example.com/wp-json/>; rel="https://api.w.org/"

# do not bypass cache
curl -s -D - -o /dev/null https://www.example.com/?ref=123
HTTP/2 200
server: nginx/1.17.10
date: Fri, 04 Jun 2021 17:47:39 GMT
content-type: text/html; charset=UTF-8
vary: Accept-Encoding
x-cache-handler: cache-enabler-engine

If you'd like ref to bypass the cache, you would want to remove it from the default regular expression, for example:

/^(?!(fbclid|mc_(cid|eid)|utm_(source|medium|campaign|term|content|expid)|gclid|fb_(action_ids|action_types|source)|age-verified|usqp|cn-reloaded|_ga|_ke)).+$/
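
As a rough way to check the effect of removing ref, the two patterns can be compared outside the plugin. This is a minimal sketch in Python rather than the plugin's PHP; it assumes the pattern is applied to the query string without the leading "?", and `bypasses_cache` is a hypothetical helper, not part of Cache Enabler. The surrounding `/` delimiters are dropped because Python's `re` does not use them.

```python
import re

# Default pattern quoted in this thread (includes "ref").
DEFAULT = (
    r"^(?!(fbclid|ref|mc_(cid|eid)|utm_(source|medium|campaign|term|content|expid)"
    r"|gclid|fb_(action_ids|action_types|source)|age-verified|usqp|cn-reloaded|_ga|_ke)).+$"
)

# Same pattern with "ref" removed, as suggested above.
WITHOUT_REF = (
    r"^(?!(fbclid|mc_(cid|eid)|utm_(source|medium|campaign|term|content|expid)"
    r"|gclid|fb_(action_ids|action_types|source)|age-verified|usqp|cn-reloaded|_ga|_ke)).+$"
)


def bypasses_cache(pattern: str, query_string: str) -> bool:
    """Hypothetical helper: a match means the request would be excluded from the cache."""
    return re.search(pattern, query_string) is not None


print(bypasses_cache(DEFAULT, "ref=123"))           # False: served from cache
print(bypasses_cache(WITHOUT_REF, "ref=123"))       # True: cache is bypassed
print(bypasses_cache(WITHOUT_REF, "utm_source=x"))  # False: still served from cache
```

Only the handling of ref changes; the other listed parameters keep being served from the cache.
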
tobias992 commented 3 years ago

Okay, but then the translation is wrong. The actual text means that all these query strings...

/^(?!(fbclid|ref|mc_(cid|eid)|utm_(source|medium|campaign|term|content|expid)|gclid|fb_(action_ids|action_types|source)|age-verified|usqp|cn-reloaded|_ga|_ke)).+$/

...bypass the cache. And "ref" is in there. So for me that means a query string with "ref" bypasses the cache. That is what I want, but it doesn't bypass.

tobias992 commented 3 years ago

Okay, I tested your regular expression and it works.

But then this setting is wrong in the plugin, because the text means that ref is bypassed by default... but it's not.

At the moment this setting is inverted: none of the default strings bypass the cache.

coreykn commented 3 years ago

Cache exclusions indicate what should be excluded from the cache. The Query Strings setting accepts a regular expression that is checked against the query string. If it matches, the request is excluded from the cache.

The default query string pattern says to match everything other than what follows, due to the negative lookahead. That means the default regular expression excludes any query string from the cache unless it starts with one of the strings in the regular expression (e.g. fbclid, ref, etc.). For example, the default regular expression says to bypass the cache when https://www.example.com/?query=string is requested but not when https://www.example.com/?ref=123 is requested.
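
The negative lookahead can be tried in isolation. The sketch below uses Python's `re` and a deliberately shortened pattern (only fbclid, ref, and gclid) for readability; it is an illustration, not the plugin's code. It assumes the pattern is tested against the query string without the leading "?", and `is_excluded` is a hypothetical name.

```python
import re

# Shortened version of the default pattern, for illustration only.
# ^(?!(...)) fails if the string STARTS with one of the listed names;
# .+$ then requires at least one character.
PATTERN = re.compile(r"^(?!(fbclid|ref|gclid)).+$")


def is_excluded(query_string: str) -> bool:
    """Hypothetical helper: True means the pattern matched, so the cache would be bypassed."""
    return PATTERN.search(query_string) is not None


for qs in ["query=string", "ref=123", "query=string&ref=123", "reference=1", ""]:
    print(repr(qs), is_excluded(qs))
```

Because the lookahead is anchored at the start, `query=string&ref=123` still matches (cache bypassed), while `reference=1` does not, since it begins with the letters `ref`. An empty query string never matches because `.+` needs at least one character.
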

An understanding of how regular expressions work is needed. Because of this requirement, I intend to introduce an easier-to-use cache exclusion engine some day. That will make handling cache exclusions much better than the current way.