Why is reporting hashes needed?

estark37 commented 3 weeks ago

Could you expand a bit more on why the hash reporting functionality is useful? I can see the value in getting reports of all the script URLs loaded on your page, but I'm not totally convinced as to why you wouldn't just compute the hashes of those resources offline and then include the offline-computed hashes in integrity tags. The explainer says that some resources might change dynamically, but in those cases you probably need signature-based SRI and collecting the hashes via reporting isn't very helpful because it won't be complete. Is the idea that by reporting hashes you can determine which of the subresources change dynamically and therefore can't use hash-based SRI?
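
For reference, the offline approach mentioned above could look roughly like the following. This is a minimal sketch, not part of the explainer: the helper names `sri_integrity` and `script_tag` are hypothetical, and it assumes you already have the exact bytes the server will deliver.

```python
import base64
import hashlib


def sri_integrity(content: bytes, algorithm: str = "sha384") -> str:
    """Return an SRI integrity value such as 'sha384-<base64 digest>'."""
    # SRI only defines these three SHA-2 variants.
    if algorithm not in ("sha256", "sha384", "sha512"):
        raise ValueError(f"unsupported SRI algorithm: {algorithm}")
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def script_tag(url: str, content: bytes) -> str:
    """Build a script tag that pins the resource to its current bytes."""
    return (
        f'<script src="{url}" integrity="{sri_integrity(content)}" '
        'crossorigin="anonymous"></script>'
    )
```

The pinned value only matches while the served bytes stay identical, which is why this approach breaks down for scripts that their provider updates without notice.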

yoavweiss commented 3 weeks ago

> Could you expand a bit more on why the hash reporting functionality is useful? I can see the value in getting reports of all the script URLs loaded on your page, but I'm not totally convinced as to why you wouldn't just compute the hashes of those resources offline and then include the offline-computed hashes in integrity tags.

Sure.

The threat model here is similar to SRI's: we want to be able to detect resources that were tampered with, either at rest or during delivery. But unlike SRI, we settle for reporting the resources that executed, rather than preventing them from executing in the first place. (That's for practical deployment reasons. We still want to deploy SRI wherever feasible.)

> The explainer says that some resources might change dynamically, but in those cases you probably need signature-based SRI

Signature-based SRI would definitely help here in ensuring provenance. But the hash would (retroactively) give us the ability to identify the actual resource that ran.

> collecting the hashes via reporting isn't very helpful because it won't be complete

In what way?

estark37 commented 3 weeks ago

What I don't understand is that it seems that the motivating use case is subresources whose hash changes often, maybe even on ~every load ("dynamic, ever-green scripts that can be updated by their provider at any moment") -- for such resources, it seems that you're never going to collect a complete set of hashes, and you instead would have to use signature SRI or some other mechanism. Hash-based SRI just seems fundamentally incompatible with a subresource that changes often and unpredictably.

yoavweiss commented 3 weeks ago

The reason I want to collect the hashes is so that I can retroactively compare them with a list that the scripts' providers publish out of band. That would enable me to detect tampering after the fact.
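
To make the comparison concrete, here is a minimal sketch. The `HashReport` shape and the `published` mapping are assumptions made for illustration; they are not the report format of this proposal or any provider's actual manifest format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HashReport:
    """One reported script execution (hypothetical shape)."""
    url: str
    integrity: str  # e.g. "sha256-<base64 digest>"


def find_unrecognized(reports, published):
    """Return reports whose hash the provider did not list for that URL.

    `published` maps a script URL to the set of hashes its provider
    published out of band (assumed format).
    """
    return [r for r in reports if r.integrity not in published.get(r.url, set())]
```

Any report that comes back from `find_unrecognized` is a version that ran on users' pages without the provider vouching for it, which is the signal to investigate.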

I agree that signature-based SRI would enable enforcement of provenance (even if not of contents). Having both would be ideal.

estark37 commented 3 weeks ago

Ahh, ok, I misunderstood your explanation in https://github.com/yoavweiss/subresource-reporting/issues/1#issuecomment-2455169791. I was thinking that the point of this was to aid deployment of SRI, not as an actual security mechanism in itself.

Though I'm still a bit confused as to what you envision doing with the reported hashes. If you have a list of hashes from the resource provider, wouldn't you just enforce SRI with that list of hashes? And if the provider can't provide a list of hashes, then there's no way to know if a reported hash is good or tampered-with.

yoavweiss commented 3 weeks ago

The provider can provide a list of hashes after the scripts are deployed, but not before. That can help us know that something went wrong (even if only retroactively) and, e.g., pull the plug on the relevant scripts while we figure it out.
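
Because the provider's list trails deployment, a freshly reported hash may simply not be listed yet. One possible way to handle that, sketched under assumptions: the grace window, the `first_seen` bookkeeping, and the function name `scripts_to_disable` are all hypothetical and not something the proposal specifies.

```python
def scripts_to_disable(first_seen, published, now, grace_seconds=3600):
    """Return URLs that ran an unlisted hash for longer than the grace window.

    `first_seen` maps (url, hash) to the Unix time the hash was first reported.
    `published` maps url to the set of hashes its provider has listed so far.
    """
    suspect = set()
    for (url, integrity), seen_at in first_seen.items():
        if integrity in published.get(url, set()):
            continue  # the provider has since confirmed this version
        if now - seen_at > grace_seconds:
            suspect.add(url)  # still unlisted after the grace window
    return sorted(suspect)
```

A hash that is unlisted but still inside the window is left alone and rechecked on the next run, once the provider has had time to publish.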

yoavweiss commented 3 days ago

Closing as I think the question was answered. Feel free to let me know if that's not the case.
