CloudCannon / pagefind

Static low-bandwidth search at scale
https://pagefind.app
MIT License

Avoid date for cache busting metadata? #180

Open mstewartgallus opened 1 year ago

mstewartgallus commented 1 year ago

On the dedicated search page for my website I would like to preload the JSON entry metadata with an element like

<link rel="preload" href="/assets/pagefind/pagefind-entry.json" as="fetch">

There are crossorigin issues I haven't debugged yet.

However, the current code in Pagefind uses the current time to cache-bust the entry metadata, so the preloaded URL never matches the URL that is actually requested.

https://github.com/CloudCannon/pagefind/blob/main/pagefind/src/output/stubs/search.js#L100

You could probably use a hash of the entry metadata or a version number, but either would complicate the build a little. You could also inline the entry JSON, but again that would be annoying. IIRC it's preferable to put cache busting in the filename rather than the query string if possible.
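To make the two options concrete, here is a minimal sketch contrasting date-based and content-based cache busting. It is not Pagefind's code: the function names, the `ts` query parameter, and the `pagefind-entry.<hash>.json` naming scheme are all assumptions made for illustration, and it assumes a Node.js build step.

```javascript
const { createHash } = require("node:crypto");

// Date-based busting: the URL differs on every call, so a
// <link rel="preload"> written into the page can never match it.
// The "ts" parameter name is an assumption.
function dateBustedUrl(base, now = Date.now()) {
  return `${base}?ts=${now}`;
}

// Content-based busting: the file name changes only when the entry
// metadata itself changes, so it is stable between identical builds.
// The naming scheme here is hypothetical.
function hashedEntryName(entryJson) {
  const digest = createHash("sha256").update(entryJson).digest("hex");
  return `pagefind-entry.${digest.slice(0, 8)}.json`;
}
```

With the second approach the build has to write the hashed name somewhere the page can read it, which is the extra build complexity mentioned above.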

Anyhow the project is pretty cozy and works great. 😌

bglw commented 1 year ago

Yeah this date cache bust is a little quirky — but I haven't thought of a better method yet. The entire goal is for Pagefind to be entirely cache-resistant between builds, which is tricky when the hashes of all the indexes change.

The reason this file can't have a cache-busted filename is that the request for it is inlined in the pagefind.js file, which itself might be cached (and currently doesn't change between runs, so can safely be cached). Changing this would require that pagefind.js itself be requested with a hashed filename, which would need to be implemented by the developer integrating Pagefind into a site (or Pagefind would need to start modifying the page to add this hash, which I have avoided so far for performance reasons).

Essentially, to ensure that Pagefind never requests an index chunk that no longer exists, some early point in the chain has to subvert the cache, and the pagefind-entry.json file currently does that job since it's teeny tiny 😅
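The chain described here can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual pagefind.js implementation: `loadEntry`, `chunkUrl`, the `ts` parameter, and the `index/<hash>.pf_index` path layout are hypothetical names chosen for the example.

```javascript
// Step 1: the entry file is the one request that always bypasses the
// cache, because its URL changes on every load. fetchImpl is injectable
// so the function can be exercised without a network.
async function loadEntry(basePath, fetchImpl = fetch, now = Date.now()) {
  const res = await fetchImpl(`${basePath}pagefind-entry.json?ts=${now}`);
  if (!res.ok) throw new Error(`entry request failed: ${res.status}`);
  return res.json();
}

// Step 2: everything after the entry is addressed by content hash, so
// those URLs can be cached freely. A new build simply yields new hashes.
// The path layout is an assumption.
function chunkUrl(basePath, chunkHash) {
  return `${basePath}index/${chunkHash}.pf_index`;
}
```

Because only the small entry file is refetched, a stale cached pagefind.js still discovers the current hashes and never asks for a chunk that a later build removed.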

I'm extremely open to ideas here though, so if you have any creative tricks to make everything cache resistant without the query string I'm all ears 🙂

mstewartgallus commented 1 year ago

That is a harder issue to figure out than I initially thought. I would suggest that this kind of config loading might be better handled by Pagefind UI, since the Pagefind JS wrapper is a little finickier. But yeah, it wouldn't be an easy change to refactor.

mstewartgallus commented 1 year ago

It occurs to me that this sort of thing might be a nice use for import maps and JSON imports, although these would require a polyfill.

  1. Polyfill import maps and JSON modules.
  2. JSON import entry metadata.
  3. Dynamically map entry to entry + cache busting date parameter.
  4. Rework build system to map entry to entry + hash.
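Steps 3 and 4 above could look roughly like this. It is only a sketch: the bare specifier `pagefind-entry`, the `v` query parameter, and both function names are hypothetical, and it assumes a browser (or polyfill) that supports import maps and JSON modules.

```javascript
// Map a bare specifier to the entry URL plus a version token. The token
// could be a date (step 3) or a build-time hash (step 4).
function buildImportMap(entryUrl, version) {
  return {
    imports: {
      "pagefind-entry": `${entryUrl}?v=${encodeURIComponent(version)}`,
    },
  };
}

// Browser-only: inject the map into the page. This has to happen before
// any module that uses the specifier is loaded.
function installImportMap(map, doc = document) {
  const script = doc.createElement("script");
  script.type = "importmap";
  script.textContent = JSON.stringify(map);
  doc.head.append(script);
}
```

A module could then load the metadata with `import entry from "pagefind-entry" with { type: "json" }`, leaving the cache-busting decision to whoever writes the import map.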

bglw commented 1 year ago

Those are some cool ideas — though I'm not keen to mainline polyfills — but it's something to keep an eye on for sure