Closed mikewest closed 3 weeks ago
Thank you for putting this together! 🙇🏻
Very interested to have & test this capability for securing checkout at Shopify. As outlined in the draft, SRI is a critical tool to help with compliance requirements, but the current hash-based implementation is impractical for dynamic scripts.
Reading through current draft, couple of quick points..
.well-known
or something similar? I would love to avoid having to negotiate this problem vendor by vendor and have a well defined mechanism. Thanks for your thoughts, @igrigorik!
content-digest: I have some questions here. It's not clear to me that SRI and content-digest are acting on the same bytes: the ordering between evaluating content-encoding and applying the digests, for example, isn't obvious to me in the RFC. It's also not clear to me what the story is for signatures over content rather than hashes of content. The RFC suggests that Message Signatures could be used to sign a digest field, for instance. Maybe that's the right model, but I've only lightly skimmed that latter document and it seems like substantially more complexity than we'd need/want here. 🤷 At the same time, I'm interested in not reinventing wheels if there are sufficiently-round things available.
.well-known: We could certainly spell out a way to allow developers to specify a key they plan to use. That might complicate the substitution story (in which @ddworken suggested folks could use distinct keys per resource), but no reason we couldn't make discovery more available as an option. That said, I don't think it would have the same effect as a key delivered with the resource, at least insofar as we wouldn't either be able to error out early in cases of key-mismatch, or do the client-side checks suggested in https://mikewest.github.io/signature-based-sri/#issue-e791903e without pulling that well-known file down at some point. That introduces caching questions and/or perf impact that I think we'd probably like to avoid.
Both of these seem like good things to hammer out in issues against the proposal's repository if it's accepted as something we can incubate through WICG!
re: .well-known: +1 to Mike's comment that I don't think a .well-known file would work well here since I expect we'll want to support per-library keys. For example, Google Analytics and Google Ads are two different JS libraries with different build processes, and I expect we may want to sign them with different keys.
I'm wondering: If we did something similar to Mike's suggestion of supporting an Integrity-Public-Key: [base64 encoded pub key] response header, does that actually need any browser-level support? It seems like that would essentially just be a response header that would be used to self-document that signature-based SRI is supported so that developers can manually add it to their scripts. At which point, I'm not sure the browser needs to do anything here other than maybe just having the spec recommend this as a best practice for developer UX. WDYT?
I don't think Content-Digest is the right fit here. It's specifically about the content in a single HTTP request or response message. For example, if a UA fetched https://my.cdn/script.js using two range requests (one for the first half, one for the second half), it would receive two response messages with two different contents, and therefore two different Content-Digests.
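To make the range-request point concrete, here is a minimal sketch using only the Python standard library. The script body and the `sf_digest` helper are made up for illustration; the helper just formats a SHA-256 hash in the `sha-256=:base64:` shape that RFC 9530 digest fields use.

```python
import base64
import hashlib


def sf_digest(data: bytes) -> str:
    """Format a SHA-256 hash as an RFC 9530-style digest value (illustrative helper)."""
    raw = hashlib.sha256(data).digest()
    return "sha-256=:" + base64.b64encode(raw).decode("ascii") + ":"


# Hypothetical script body, fetched as two range responses.
script = b"const a = 1;\nconst b = 2;\nconsole.log(a + b);\n"
half = len(script) // 2
first_range, second_range = script[:half], script[half:]

# Content-Digest covers the content of each individual message...
content_digest_first = sf_digest(first_range)
content_digest_second = sf_digest(second_range)

# ...so neither matches a hash computed over the complete resource.
whole_resource_digest = sf_digest(script)

print(content_digest_first)
print(content_digest_second)
print(whole_resource_digest)
```

All three printed values differ, which is why a per-message digest is an awkward thing to pin from markup.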
Repr-Digest is independent of message content. For example, the representation is the same, even if you download parts of it separately. However, it's tied to content-encoding and other things. You don't want HTML to have to worry about what CDN is experimenting with the latest compression technology, for example.
What is perhaps closer to the requirement for signature-based SRI is the proposed Identity-Digest field (a name some people hate and would probably change if taken further). In short, the hash is calculated over the representation without encoding. This is more similar to how vanilla SRI works. This idea came up during the standardization of Content-Digest and Repr-Digest but because RFC 9530 was an update to RFC 3230, Identity-Digest was deemed a new thing to punt on. Hence it lives only as an I-D, looking for some implementer interest before considering standards adoption. I'd be open to that if this work is a motivating use case.
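A rough sketch of the Repr-Digest versus Identity-Digest distinction, again using only the Python standard library. The script body and helper name are invented, and gzip and zlib simply stand in for "whatever content-encoding the CDN picks".

```python
import base64
import gzip
import hashlib
import zlib


def sf_digest(data: bytes) -> str:
    """Format a SHA-256 hash as an RFC 9530-style digest value (illustrative helper)."""
    raw = hashlib.sha256(data).digest()
    return "sha-256=:" + base64.b64encode(raw).decode("ascii") + ":"


# Hypothetical script body served under two different content-encodings.
script = b"const a = 1;\nconst b = 2;\nconsole.log(a + b);\n"
gzip_encoded = gzip.compress(script)
zlib_encoded = zlib.compress(script)

# Repr-Digest is computed over the encoded representation, so it changes
# whenever the content-encoding changes.
repr_digest_gzip = sf_digest(gzip_encoded)
repr_digest_zlib = sf_digest(zlib_encoded)

# An identity-style digest is computed after removing the encoding, so it is
# the same no matter how the bytes travelled. These are the same bytes that
# hash-based SRI checks.
identity_from_gzip = sf_digest(gzip.decompress(gzip_encoded))
identity_from_zlib = sf_digest(zlib.decompress(zlib_encoded))

print(repr_digest_gzip != repr_digest_zlib)
print(identity_from_gzip == identity_from_zlib)
```

Under these assumptions, only the identity-style value is something a page author could reason about without knowing how the CDN compresses responses.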
WRT HTTP message signatures, I think it might work here. For example, the HTML can indicate the key as already suggested:
<script src="https://my.cdn/script.js"
crossorigin="anonymous"
integrity="ed25519-[base64-encoded-public-key]"></script>
then the HTTP would be something like
GET /script.js HTTP/1.1
Host: my.cdn
HTTP/1.1 200 OK
Identity-Digest: sha-256=[Structured Field Byte Sequence aka :[base64 of computed digest]:]
Signature-Input: sig1=("identity-digest");created=1618884475;keyid="ed25519-[base64-encoded-public-key]"
Signature: sig1=[Structured Field Byte Sequence aka :[base64 of computed signature value based on signature-input]:]
What this is saying is that the response has an Identity-Digest. The Signature-Input field declares that the Identity-Digest field is an input into the signature generation, which uses the key identified by keyid. The value of keyid is the same as that indicated in the HTML. The Signature-Input field also creates a label, sig1, that identifies the signature value it refers to. This is useful if there are multiple signatures (imagine you wanted signing agility: you could use different signature keys to sign the same input fields).
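As a sketch of what a verifier would actually check the signature against, the following builds an RFC 9421-style signature base for the response above. It is deliberately simplified: it handles only plain header fields and the `created` and `keyid` parameters, and it stops short of the Ed25519 verification itself because the Python standard library has no Ed25519 support. The function names, the placeholder key, and the script body are all assumptions made for illustration.

```python
import base64
import hashlib


def signature_params(covered, created, keyid):
    """Serialize the covered components and parameters, as carried in Signature-Input."""
    items = " ".join(f'"{name}"' for name in covered)
    return f'({items});created={created};keyid="{keyid}"'


def signature_base(headers, covered, created, keyid):
    """Build the string that the signature is computed over (simplified)."""
    fields = {name.lower(): value.strip() for name, value in headers.items()}
    lines = [f'"{name}": {fields[name]}' for name in covered]
    lines.append(f'"@signature-params": {signature_params(covered, created, keyid)}')
    return "\n".join(lines)


def key_matches(integrity_attr, keyid):
    """Check that the key named in the markup is the key the response claims to use."""
    return keyid in integrity_attr.split()


# Hypothetical values; the key is a placeholder, not a real Ed25519 public key.
integrity_attr = "ed25519-PLACEHOLDERKEY"
script = b"console.log('hello');\n"
digest = base64.b64encode(hashlib.sha256(script).digest()).decode("ascii")
response_headers = {"Identity-Digest": f"sha-256=:{digest}:"}

base = signature_base(
    response_headers,
    covered=["identity-digest"],
    created=1618884475,
    keyid="ed25519-PLACEHOLDERKEY",
)
print(base)
print(key_matches(integrity_attr, "ed25519-PLACEHOLDERKEY"))
```

A client would then verify the bytes in the Signature field against the UTF-8 encoding of `base` with the public key from the markup, and separately confirm that the Identity-Digest value matches the bytes it received.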
Having a standard for signing things in HTTP is a huge milestone. There have been many other ways to do it in the past, and Message Signatures has been able to gain consensus for being compatible with HTTP's weird and wonderful corners. I'd be sad if there was another mechanism invented just for SRI - I think the requirements can be satisfied already.
@LPardue TIL! I missed that https://www.rfc-editor.org/rfc/rfc9421.html is a thing. As you pointed out, it seems appropriate and solves the problem at hand. The missing part would be for the browser to validate the signature based on the key provided in the markup. @mikewest @ddworken wdyt?
@LPardue do you know if and who has adopted this? Do any proxies or CDNs leverage it already?
@LPardue Thanks for the additional context. I think your suggestion is a good one and I'm happy to adopt it. I still have some questions about the details, but I moved those out to https://github.com/mikewest/signature-based-sri/issues/16 to keep them wrapped up with the document rather than this WICG adoption request.
Given that @igrigorik has put his Shopify hat on, I'll put my WICG hat on instead :)
I think this represents enough industry support for this (and I also suspect this is just the tip of the iceberg, given upcoming PCI requirements).
Feel free to transfer the repo to me, and I'll move it to the WICG org.
@yoavweiss: Done, thanks.
The repo now lives at https://github.com/WICG/signature-based-sri
Happy incubation!! 🎉
Introduction
SRI as currently defined is difficult to deploy in some common scenarios. Of particular interest are resources whose content is dynamic, or at least unknown to the developer at the time they write HTML including them. Introducing a mechanism that allows developers to choose to validate a resource's provenance rather than content can unlock some additional deployment opportunities that can give developers a better chance of enforcing reasonable restrictions on the content they load into their origins' context.
The current proposal is the Simplest Thing That Could Possibly Work™: extend the existing integrity attribute to allow the specification of an Ed25519 public key, and introduce a header (perhaps Content-Digest, despite the syntactical differences (and the fact that signatures aren't really digests)) to allow servers to deliver a signature over the resource's content that can be validated client side. That is:
and
More detail is available in the Explainer. (Even more detail in the Monkey-patch Spec.)
Feedback
I welcome feedback in this thread, but encourage you to file bugs against the Explainer's repo