bdkjones / CodeKit

CodeKit 3 Issue Tracker
https://codekitapp.com

exclude canonical links from cache busting #629

Closed davidt-de closed 1 year ago

davidt-de commented 3 years ago

I stumbled across this request under an already closed issue while googling for a solution to my problem. Any thoughts on this? It would be very helpful, at least for me.

[...] I think it would be nice to exclude other links, for example canonical links:

<link rel="canonical" href="https://www.mysite.com/canonical?ckcachebust=575978862">

Thanks!

Originally posted by @nye in https://github.com/bdkjones/CodeKit/issues/409#issuecomment-479428020

bdkjones commented 3 years ago

I can expose an option to provide exclusions for the cache-buster. My absolute #1 goal with the cache-buster was to be insanely fast. As such, the algorithm does not actually parse the HTML; it simply rips through the text of the file and finds links. The whole thing is written in straight C.

While that's definitely the fastest possible way to do this, it does eliminate options that fully parsing the page would yield, such as inspecting other attributes of a link (e.g. rel="canonical").
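To make the trade-off concrete, here is a minimal sketch of a cache-buster that scans raw text instead of parsing HTML. It is not CodeKit's source: the function names `bust_links` and `append` are hypothetical, and it assumes double-quoted `href`/`src` attributes and the `ckcachebust` parameter name seen in the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append n bytes to out, keeping room for the NUL. */
static int append(char *out, size_t out_size, size_t *o, const char *s, size_t n)
{
    if (*o + n >= out_size) return -1;
    memcpy(out + *o, s, n);
    *o += n;
    return 0;
}

/* Hypothetical sketch: copy html into out, adding a cache-busting query
 * parameter to every double-quoted href/src value. This is a linear byte
 * scan with no HTML parsing, so it cannot see other attributes of the tag.
 * Returns the number of links rewritten, or -1 if out is too small. */
int bust_links(const char *html, const char *stamp, char *out, size_t out_size)
{
    assert(html && stamp && out && out_size > 0);
    size_t o = 0;
    int count = 0;
    const char *p = html;

    while (*p) {
        size_t attr_len = 0;
        if (strncmp(p, "href=\"", 6) == 0) attr_len = 6;
        else if (strncmp(p, "src=\"", 5) == 0) attr_len = 5;

        const char *end = attr_len ? strchr(p + attr_len, '"') : NULL;
        if (!end) {                 /* not a complete link: copy one byte */
            if (append(out, out_size, &o, p, 1) != 0) return -1;
            p++;
            continue;
        }

        /* Copy the attribute name, opening quote, and URL unchanged. */
        size_t span = (size_t)(end - p);
        if (append(out, out_size, &o, p, span) != 0) return -1;

        /* Use '&' if the URL already has a query string, otherwise '?'. */
        const char *sep = memchr(p + attr_len, '?', span - attr_len) ? "&" : "?";
        if (append(out, out_size, &o, sep, 1) != 0) return -1;
        if (append(out, out_size, &o, "ckcachebust=", 12) != 0) return -1;
        if (append(out, out_size, &o, stamp, strlen(stamp)) != 0) return -1;

        p = end;                    /* closing quote is copied next iteration */
        count++;
    }
    out[o] = '\0';
    return count;
}
```

Because the scan only ever looks at the attribute it is standing on, a canonical link is rewritten just like a stylesheet link, which is the behavior this issue is about.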

To keep the speed but still provide control, the best approach is to expose a list of possible exclusions.
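One way such an exclusion list could work without regular expressions is a plain substring check on each URL before it is rewritten. The sketch below is only an assumption about the shape of the feature: the thread does not say what form the exclusions would take, and `url_is_excluded` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: return 1 if any exclusion string occurs anywhere
 * inside the URL bytes [url, url + url_len), else 0. Exclusions are treated
 * as literal substrings (an assumption), so no regex engine is needed. */
int url_is_excluded(const char *url, size_t url_len,
                    const char *const *exclusions, size_t count)
{
    assert(url && (exclusions || count == 0));
    for (size_t i = 0; i < count; i++) {
        size_t n = strlen(exclusions[i]);
        if (n == 0 || n > url_len) continue;   /* cannot match */
        for (size_t j = 0; j + n <= url_len; j++) {
            if (memcmp(url + j, exclusions[i], n) == 0) return 1;
        }
    }
    return 0;
}
```

The cost is one extra pass over each URL per exclusion entry, which is the kind of speed trade-off mentioned above; an empty list costs nothing.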

davidt-de commented 3 years ago

Hi Bryan,

the list would be a great option!

Thanks,
George

jamiedumont commented 1 year ago

Just wanting to add my +1 to this issue, despite it being old. I'm stuck between adding hashes to canonicals or not using the cache buster at the moment.

Is the exclusion list for cache-busting a task you haven't gotten round to yet, or one that isn't worth the time?

bdkjones commented 1 year ago

Well, it’s definitely an edge case. RegEx matching is godawful slow, and the cache buster is currently written in plain C so that it’s as fast as possible. It doesn’t parse the HTML or create a DOM or even worry about Unicode (because no Unicode code points above ASCII are valid in URIs). I can add exceptions, which will trade off some of that pure speed. But why not just exempt the entire file from cache-busting and then purge the cache at the CDN layer once you deploy? What are we trying to exempt from cache-busting?

jamiedumont commented 1 year ago

Hey @bdkjones!

I was using cache busting on HTML files for URIs to CSS and JS files, but my canonical tag was getting cache-busted too, which made a mess of SEO, etc.

I was previously deploying to a plain nginx server, so needed cache-busting within my build step. I'm now using a CDN, which as you say solves the cache-busting problem.

I understand the desire for outright speed on a feature like this, but surely any page that needs to bust URIs to CSS and JS will almost always include a canonical link that gets clobbered too?

bdkjones commented 1 year ago

Okay. It had been a hot second since I'd written any C, so I decided to knock this out. CodeKit will now skip cache-busting for any link tag with rel="canonical":

I'll release this update soon.
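For anyone curious how a check like this can be done without building a DOM, here is a minimal sketch. It is not the code that shipped in CodeKit: `in_canonical_tag` is a hypothetical name, and the sketch assumes double-quoted attributes, a lowercase `rel="canonical"`, and no `>` inside attribute values.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: given a byte offset pos inside a tag (for example the
 * start of an href attribute), report whether the enclosing tag is a
 * <link ...> that contains rel="canonical". Works on raw bytes only. */
int in_canonical_tag(const char *buf, size_t len, size_t pos)
{
    static const char needle[] = "rel=\"canonical\"";
    const size_t n = sizeof needle - 1;
    assert(buf && pos < len);

    /* Walk back to the '<' that opens the tag, forward to the closing '>'. */
    size_t start = pos;
    while (start > 0 && buf[start] != '<') start--;
    size_t end = pos;
    while (end < len && buf[end] != '>') end++;

    /* Only <link> tags are considered. */
    if (end - start < 5 || memcmp(buf + start, "<link", 5) != 0) return 0;

    /* Search the whole tag, so attribute order does not matter. */
    for (size_t i = start; i + n <= end; i++) {
        if (memcmp(buf + i, needle, n) == 0) return 1;
    }
    return 0;
}
```

A scanner would call this when it finds an `href` and skip the rewrite if it returns 1; every other link is handled as before.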

[Image: Screenshot 2023-08-07 at 22 13 08]

jamiedumont commented 9 months ago

Sorry, I’ve somehow missed this! Thanks so much!