Open XhmikosR opened 5 years ago
this shouldn't result in any collisions I think?
If that was true, why are they so long in the first place?
So, I agree that for certain uses of this it would be fine (cache-busting), so we should add a length
or something option to truncate it.
I meant for this specific use case :)
Thanks for considering this.
If this is about Subresource Integrity hashes please keep in mind, that the standard for the JS and CSS links expects SHA384 for the integrity hashes.
See https://www.w3.org/TR/SRI/
"Cache busting" is a new use for me ;) I always used date strings or random strings for that.
Yes, we're not messing with the SRI hashes.
If this is about Subresource Integrity hashes please keep in mind, that the standard for the JS and CSS links expects SHA384 for the integrity hashes.
I took that as SHA384 is the only digest allowed. In case others think the same, the relevant bits of the specification are:
Conformant user agents MUST support the SHA-256, SHA-384 and SHA-512 cryptographic hash functions for use as part of a request’s integrity metadata and MAY support additional hash functions.
At the time of writing, SHA-384 is a good baseline.
Please consider the case where cache-busting is not required but the integrity hash is still wanted, i.e. a length
of 0 but without style..css
being generated.
@RalphCorderoy you can do like this:
{{ $fingerprinted := $css | fingerprint }}
<style src="{{ $css.RelPermalink }}" integrity="$fingerprinted.Data.Integrity" }}
voting for this for cache-busting
With the recent introduction of resources.Copy
and crypto.FNV32a
you can do:
{{ with resources.Get "sass/main.scss" }}
{{ with $css := . | toCSS | minify | fingerprint }}
{{ with resources.Copy (printf "css/style.%d.min.css" (crypto.FNV32a .Content)) . }}
<link rel="stylesheet" href="{{ .RelPermalink }}" integrity="{{ $css.Data.Integrity }}" crossorigin="anonymous">
{{ end }}
{{ end }}
{{ end }}
Which renders to something like:
<link rel="stylesheet" href="/css/style.87871251.min.css" integrity="sha256-FcQqt3aNlV7AZnGV4zkQRVeCeJOxbMPnQSx258L803E=" crossorigin="anonymous">
{{ with resources.Copy (printf "css/style.%d.min.css" (crypto.FNV32a .Content)) . }}
Yea, that works, but you end up recalculating that same hash over and over again -- a common thing with these resources is that they're included in "every" page.
Yeah, I guess partialCached
only gets you most of the way there.
Yeah, I guess partialCached only gets you most of the way there.
Yea well, most of the way is good enough in this scenario, I guess. I have a fix for partialCached in the pipeline to make it only "run once" for a given key.
Also, the FNV32a
is very fast.
Or maybe just taking the first X characters from the original Data.Integrity
, something along these lines?
https://embeddedartistry.com/blog/2017/07/05/printf-a-limited-number-of-characters-from-a-string/
Using the un-minified bootstrap.css as the resource, with 100 pages, with a cold cache for each test...
With FNV32a
cache buster as described above:
Without the cache buster:
The difference is noise.
most of the way is good enough in this scenario
I agree.
So, the resources.Copy
approach fails if you PostProcess the resource prior to copying. Which I think makes sense, but makes my head hurt.
So, the resources.Copy approach fails if you PostProcess the resource prior to copying. Which I think makes sense, but makes my head hurt.
Yea, that's definitively not supported with how PostProcess
currently works. But Isn't this a ... vanity issue?
Is there a way to use the first few characters of the hash for the Filename, not SRI. Cloudflare will not deploy the site as URLs are > 100 chars long. Even with Sha256, main.css will be 86 chars long. So, we're forced to use MD5 hash, hence affecting security.
"Error: Failed to publish your Function. Got error: Error 8000057: Rule (%s) in
routes.json
is over the 100 character limit. Refer to https://cfl.re/3FsE4aF."
If we could truncate the filename hash to 10 chars or so, it should be enough for cachebusting, while allowing us to use more secure SRI Hashes.
I suppose truncating this should suffice. Happy to raise a PR if the solution is okay.
Adding context from my use case and an issue with current workarounds when typescript resources are renamed by js.Build
.
I have a typescript file I want to compile to javascript, minify, add an SRI hash, and cache-bust with a reasonably short filename.
Given an input like main.ts
, I'm looking to get this output
<script src="/main-G7Cf3v.js" integrity="sha256-..." crossorigin="anonymous"></script>
js.Build
will rename the file from main.ts
to main.js
and will update .RelPermalink
and .Permalink
, but not .Name
. This means I don't have a way to get the final extension without also publishing the resource. The strategy I'm currently using below mostly works, but it will end up with src="/main-G7Cf3v.ts"
with .ts
instead of .js
.
{{ with resources.Get "main.ts" }}
{{ $resource := . | js.Build (dict "minify" true) | fingerprint }}
{{ $fingerprint = $resource.Data.Integrity }}
{{ $hash := index (index (findRESubmatch `sha\d*-(.*)` $fingerprint) 0) 1 }}
{{ $shortHash := $hash | first 6 }}
{{ $shortHash = replace $shortHash "+" "p" }}
{{ $shortHash = replace $shortHash "/" "f" }}
{{ $shortHash = replace $shortHash "=" "e" }}
{{/* $resource.Name is still main.ts instead of main.js */}}
{{/* $resource.RelPermalink will cause a second copy to be published */}}
{{ $newName := replaceRE `\.([^.]*\z)` (printf "-%s.$1" $shortHash) $resource.Name }}
{{ $resource = $resource.Content | resources.FromString $newName }}
<script src="{{ $resource.RelPermalink }}" integrity="{{ $fingerprint }" crossorigin="anonymous"></script>
{{ end }}
I can think of a few possible solutions:
fingerprint
to limit the number of hash characters used to rename the file for cache-busting purposes..RelPermalink
to be mutated.I'm still new to hugo so hopefully I haven't gotten any of the details wrong.
I don't think that provides a solution to the filename issue (though I imagine it's more efficient than using resources.FromString?).
There's no way to get the proper name of the original resource (with .js
) without publishing it.
I just tested resources.Copy
with a fingerprinted file, and it retains the hash in the name. That is, at least for me, unexpected. For example:
{{ resources.Copy "js/app.js" . }} → app.f85a8953420a757cf53434fead706fb0d2864bd93c5f7294fef74d84cfbc20ce.js
So, that's not terribly helpful.
Since you know you're building a js file, is there a problem with hardcoding the extension?
Also, if you can live with a 10 character hash, pass $fingerprint
through https://gohugo.io/functions/crypto/fnv32a/.
I have a partial template that I use for all my resources. It isn't specific to javascript. It also wouldn't be possible to share with others without hard-coding all possible extension changes, or adding a caveat that it won't work for anything other than ts -> js.
I don't believe the hash persisted through a copy when I tried it, but I could be misremembering.
Also, if you can live with a 10 character hash, pass $fingerprint through https://gohugo.io/functions/crypto/fnv32a/.
I believe this still has the same issue: when I name the file, how do I get the correct extension without publishing the resource with the fingerprint in the name? Or hard-coding the extension so it only works with assets types I explicitly handle?
My point with crypto.FNV32a
was that you can eliminate 5 lines of code. This has nothing to do with the extension.
Ah gotcha. Thanks for the tip.
Also see https://github.com/gohugoio/hugo/issues/12143 for access to the extension. I'm also going to log an issue against resources.Copy
, which may be rejected, but it seems to me if I say "copy to foo" it should copy to "foo".
Finally, there's an intentionally undocumented method .Key
which will give you the extension you want, but there is no API promise on this. This is a cache key that just happens to include the extension, but you might be able to use it until the key format changes.
This seems to work OK with v0.124.0:
{{ $assetPath := "ts/main.ts" }}
{{ with resources.Get $assetPath }}
{{ $basename := path.BaseName .Name }}
{{ with . | js.Build | fingerprint }}
{{ $ext := path.Ext .Key }}
{{ $hash := crypto.FNV32a .Data.Integrity }}
{{ with resources.Copy (printf "%s-%d%s" $basename $hash $ext) . }}
<script src="{{ .RelPermalink }}" integrity="{{ .Data.Integrity }}" crossorigin="anonymous"></script>
{{ end }}
{{ end }}
{{ else }}
{{ errorf "Unable to get global asset %q" $assetPath }}
{{ end }}
This produces something like main-4229906572.js
in the root of the public directory.
Use with caution. We may change the value returned by .Key
at any time.
@akbyrd
I use a quick crypto.FNV32a
-based fix for short cache-busting fingerprints that doesn’t directly rely on the unstable .Key
method.
Encoding it to a higher base and using alphanumerics could shave off 1-2 ch.
Originally posted on seirdy.one: See Original.
@Seirdy With this approach you don't have an SRI hash. If you don't fingerprint, the resource's .Data.Integrity
is nil. This issue is about short fingerprints in conjunction with fingerprinting.
This isn't super important, but it feels like having a 96 (sha384) or 64 (sha256) character hash is too much.
Would it be possible to offer an option to limit this, say to 7 characters? Assuming the algorithm is the same, this shouldn't result in any collisions I think?
I could try doing this myself, but I thought I'd ask.
Thanks!