WICG / proposals

A home for well-formed proposed incubations for the web platform. All proposals welcome.
https://wicg.io/

Signature-based SRI. #175

Closed mikewest closed 1 month ago

mikewest commented 1 month ago

Introduction

SRI as currently defined is difficult to deploy in some common scenarios. Of particular interest are resources whose content is dynamic, or at least unknown to the developer at the time they write HTML including them. Introducing a mechanism that allows developers to choose to validate a resource's provenance rather than content can unlock some additional deployment opportunities that can give developers a better chance of enforcing reasonable restrictions on the content they load into their origins' context.

The current proposal is the Simplest Thing That Could Possibly Work™: extend the existing integrity attribute to allow the specification of an Ed25519 public key, and introduce a header (perhaps Content-Digest, despite the syntactical differences (and the fact that signatures aren't really digests)) to allow servers to deliver a signature over the resource's content that can be validated client side.

That is:

<script src="https://my.cdn/script.js"
        crossorigin="anonymous"
        integrity="ed25519-[base64-encoded-public-key]"></script>

and

HTTP/1.1 200 OK
Accept-Ranges: none
Vary: Accept-Encoding
Content-Type: text/javascript; charset=UTF-8
Access-Control-Allow-Origin: *
Integrity: ed25519-[base64-encoded result of Ed25519(`console.log("Hello, world!");`)]

console.log("Hello, world!");
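As a rough illustration of the two pieces above, the following Python sketch extracts `ed25519-` tokens from an `integrity` attribute or an `Integrity` header value and checks their decoded length. It assumes standard base64 and whitespace-separated tokens; the function name `parse_ed25519_tokens` and the demo bytes are hypothetical, and the actual Ed25519 verification step is left out because Python's standard library does not provide it.

```python
import base64
import binascii

ED25519_PUBLIC_KEY_LEN = 32  # bytes, per RFC 8032
ED25519_SIGNATURE_LEN = 64   # bytes, per RFC 8032


def parse_ed25519_tokens(value: str, expected_len: int) -> list:
    """Return the decoded payload of every well-formed `ed25519-<base64>` token."""
    payloads = []
    for token in value.split():
        prefix, _, encoded = token.partition("-")
        if prefix != "ed25519":
            continue  # e.g. a sha256-... hash, which classic SRI would handle
        try:
            raw = base64.b64decode(encoded, validate=True)
        except (binascii.Error, ValueError):
            continue  # malformed base64: ignore the token
        if len(raw) == expected_len:
            payloads.append(raw)
    return payloads


# Placeholder bytes for illustration only; this is not a real key.
demo_key = bytes(range(32))
attr = "sha256-AAAA ed25519-" + base64.b64encode(demo_key).decode("ascii")
keys = parse_ed25519_tokens(attr, ED25519_PUBLIC_KEY_LEN)
```

A client following the proposal would then check the 64-byte signature from the response against each key listed in the markup, and block the resource if none validates.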

More detail is available in the Explainer, and even more in the Monkey-patch Spec.

Feedback

I welcome feedback in this thread, but encourage you to file bugs against the Explainer's repo.

igrigorik commented 1 month ago

Thank you for putting this together! 🙇🏻

Very interested to have & test this capability for securing checkout at Shopify. As outlined in the draft, SRI is a critical tool to help with compliance requirements, but the current hash-based implementation is impractical for dynamic scripts.

Reading through the current draft, a couple of quick points:

  1. Content-Digest seems appropriate. @LPardue do you see any issues?
  2. "Do we need a mechanism (another header?) allowing the server to specify the public key used to sign the resource"
    • I would really like to see this. Could we park this under .well-known or something similar? I would love to avoid having to negotiate this problem vendor by vendor and have a well defined mechanism.

mikewest commented 1 month ago

Thanks for your thoughts, @igrigorik!

Both of these seem like good things to hammer out in issues against the proposal's repository if it's accepted as something we can incubate through WICG!

ddworken commented 1 month ago

re: .well-known: +1 to Mike's comment that I don't think a .well-known file would work well here since I expect we'll want to support per-library keys. For example, Google Analytics and Google Ads are two different JS libraries with different build processes, and I expect we may want to sign them with different keys.

I'm wondering: If we did something similar to Mike's suggestion of supporting an Integrity-Public-Key: [base64 encoded pub key] response header, does that actually need any browser-level support? It seems like that would essentially just be a response header that would be used to self-document that signature-based SRI is supported so that developers can manually add it to their scripts. At which point, I'm not sure the browser needs to do anything here other than maybe just having the spec recommend this as a best practice for developer UX. WDYT?
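To make the "self-documenting" idea concrete, here is a minimal Python sketch of a developer-side helper that reads the suggested `Integrity-Public-Key` response header and prints the matching script tag. The header is only a suggestion from this thread, not something browsers or servers implement, and the helper name `script_tag_from_headers` and the sample key value are hypothetical.

```python
import html
from typing import Dict, Optional


def script_tag_from_headers(url: str, headers: Dict[str, str]) -> Optional[str]:
    """Build a <script> tag from a (hypothetical) Integrity-Public-Key header."""
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    key = lowered.get("integrity-public-key")
    if not key:
        return None  # the server does not advertise a signing key
    return (
        f'<script src="{html.escape(url, quote=True)}" crossorigin="anonymous" '
        f'integrity="ed25519-{html.escape(key, quote=True)}"></script>'
    )


# "AAAA" is a stand-in value, not a real public key.
tag = script_tag_from_headers(
    "https://my.cdn/script.js", {"Integrity-Public-Key": "AAAA"}
)
print(tag)
```

Nothing here needs browser support, which is the point of the question above: the header would only inform the developer writing the markup.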

LPardue commented 1 month ago

I don't think Content-Digest is the right fit here. It's specifically about the content in a single HTTP request or response message. For example, if a UA fetched https://my.cdn/script.js using two range requests (one for the first half, one for the second half), it would receive two response messages with two different contents, and therefore two different Content-Digest values.

Repr-Digest is independent of message content. For example, the representation is the same, even if you download parts of it separately. However, it's tied to content-encoding and other things. You don't want HTML to have to worry about which CDN is experimenting with the latest compression technology, for example.

What is perhaps closer to the requirement for signature-based SRI is the proposed Identity-Digest field (a name some people hate and would probably change if taken further). In short, the hash is calculated over the representation without encoding. This is more similar to how vanilla SRI works. This idea came up during the standardization of Content-Digest and Repr-Digest, but because RFC 9530 was an update to RFC 3230, Identity-Digest was deemed a new thing to punt on. Hence it lives only as an I-D, looking for some implementer interest before considering standards adoption. I'd be open to that if this work is a motivating use case.
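The difference between the three fields can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a normative implementation: SHA-256 and gzip are chosen only for the example, and the helper name `sf_digest` is hypothetical.

```python
import base64
import gzip
import hashlib


def sf_digest(data: bytes) -> str:
    """Format a SHA-256 digest as a digest-field member, e.g. sha-256=:...:"""
    encoded = base64.b64encode(hashlib.sha256(data).digest()).decode("ascii")
    return f"sha-256=:{encoded}:"


body = b'console.log("Hello, world!");'

# Content-Digest covers the bytes of one message, so the two halves of a
# ranged download carry two different values.
half = len(body) // 2
content_digests = [sf_digest(body[:half]), sf_digest(body[half:])]

# Repr-Digest covers the whole representation, but with its content coding
# applied, so a gzip-encoded response hashes differently from an identity one.
repr_identity = sf_digest(body)
repr_gzip = sf_digest(gzip.compress(body))

# Identity-Digest (still an Internet-Draft) covers the representation with
# content codings removed, so it matches regardless of compression.
identity_from_gzip = sf_digest(gzip.decompress(gzip.compress(body)))

print(content_digests[0] != content_digests[1])  # True
print(repr_identity != repr_gzip)                # True
print(identity_from_gzip == repr_identity)       # True
```

Only the last value is stable across both range requests and CDN compression choices, which is why it is the closest match to what the hash in a classic SRI attribute covers.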

WRT HTTP message signatures, I think it might work here. For example, the HTML can indicate the key as already suggested:

<script src="https://my.cdn/script.js"
        crossorigin="anonymous"
        integrity="ed25519-[base64-encoded-public-key]"></script>

then the HTTP would be something like

GET /script.js HTTP/1.1
Host: my.cdn

HTTP/1.1 200 OK
Identity-Digest: sha-256=[Structured Field Byte Sequence aka :[base64 of computed digest]:]
Signature-Input: sig1=("identity-digest");created=1618884475;keyid="ed25519-[base64-encoded-public-key]"
Signature: sig1=[Structured Field Byte Sequence aka :[base64 of computed signature value based on signature-input]:]

What this is saying is that the response has an Identity-Digest. The Signature-Input field declares that the Identity-Digest field is an input into the signature generation, which uses the key identified by keyid. The value of keyid is the same as that indicated in the HTML. The Signature-Input field also creates a label, sig1, that identifies the signature value it refers to. This is useful if there are multiple signatures (imagine you wanted signing agility: you could use different signature keys to sign the same input fields).
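For readers unfamiliar with RFC 9421, the bytes that actually get signed (the "signature base") for the exchange above can be sketched in Python as follows. The function names are hypothetical, the keyid is the placeholder from the example rather than a real key, and the Ed25519 signing step itself is omitted because Python's standard library does not include it.

```python
import base64
import hashlib


def serialize_signature_params(components, created, keyid):
    """Serialize the inner list and parameters carried in Signature-Input."""
    inner = " ".join(f'"{name}"' for name in components)
    return f'({inner});created={created};keyid="{keyid}"'


def build_signature_base(headers, components, params):
    """One line per covered component, then the @signature-params line."""
    fields = {name.lower(): value.strip() for name, value in headers.items()}
    lines = [f'"{name}": {fields[name]}' for name in components]
    lines.append(f'"@signature-params": {params}')
    return "\n".join(lines)  # no trailing newline


body = b'console.log("Hello, world!");'
digest = base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")
response_headers = {"Identity-Digest": f"sha-256=:{digest}:"}

params = serialize_signature_params(
    ["identity-digest"],
    created=1618884475,
    keyid="ed25519-[base64-encoded-public-key]",  # placeholder from the example
)
base = build_signature_base(response_headers, ["identity-digest"], params)
print(base)
```

The server would sign `base` with its Ed25519 private key and send the result in `Signature: sig1=:...:`; a client would rebuild the same string from the response it received and verify it against the key named in the markup.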

Having a standard for signing things in HTTP is a huge milestone. There have been many other ways to do it in the past, and HTTP Message Signatures has been able to gain consensus for being compatible with HTTP's weird and wonderful corners. I'd be sad if there was another mechanism invented just for SRI - I think the requirements can be satisfied already.

igrigorik commented 1 month ago

@LPardue TIL! I missed that https://www.rfc-editor.org/rfc/rfc9421.html is a thing. As you pointed out, it seems appropriate and solves the problem at hand. The missing part would be for the browser to validate the signature based on the key provided in the markup. @mikewest @ddworken wdyt?

@LPardue do you know if and who has adopted this? Do any proxies or CDNs leverage it already?

mikewest commented 1 month ago

@LPardue Thanks for the additional context. I think your suggestion is a good one and I'm happy to adopt it. I still have some questions about the details, but I moved those out to https://github.com/mikewest/signature-based-sri/issues/16 to keep them wrapped up with the document rather than this WICG adoption request.

yoavweiss commented 1 month ago

Given that @igrigorik has put his Shopify hat on, I'll put my WICG hat on instead :)

I think this represents enough industry support for this (and I also suspect this is just the tip of the iceberg, given upcoming PCI requirements).

Feel free to transfer the repo to me, and I'll move it to the WICG org.

mikewest commented 1 month ago

Feel free to transfer the repo to me, and I'll move it to the WICG org.

@yoavweiss: Done, thanks.

yoavweiss commented 1 month ago

The repo now lives at https://github.com/WICG/signature-based-sri

Happy incubation!! 🎉