Explain the purpose of the Digest header

edent commented 9 months ago

As per this question - https://bsky.app/profile/mcphail.uk/post/3kmflwnkhgk2d

What's the purpose of the separate Digest header? The recipient is going to have to hash the body anyway, so wouldn't that suffice to be used in the signature check?

If an attacker can change the contents of the message, they can probably change the contents of the header too, as described in https://www.rfc-editor.org/rfc/rfc9530.html#section-6.1. So there is little value in a server comparing its calculated hash to the one provided in the message.

I understand that the Digest header also contains the name of the hashing algorithm. That's useful as there may be many different hashing algorithms available.

So I think it is worth explaining that the provided hash should be ignored, the algorithm should be used to generate a new hash, and that new hash is what should be provided to the signature verification algorithm.

evanp commented 9 months ago

Really? This seems like a cheap way to sign the body of the request as well as the headers. Digest is a function of the body, and Signature is a function of the headers, including Digest. So, any jiggery-pokery with the body and/or Digest header will invalidate the Signature.
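The chain described above (the body feeds the Digest, and the Digest feeds the Signature) can be sketched as follows. This is only an illustration: HMAC-SHA256 and the `hypothetical-shared-secret` key stand in for the public-key signatures normally used, and the helper names `digest_header` and `sign` are made up for this example.

```python
import base64
import hashlib
import hmac

def digest_header(body: bytes) -> str:
    """Return a Digest header value in the SHA-256=<base64> form."""
    return "SHA-256=" + base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")

def sign(signing_string: str, key: bytes) -> str:
    # Assumption: HMAC-SHA256 is a stand-in; real deployments typically use RSA keys.
    mac = hmac.new(key, signing_string.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(mac).decode("ascii")

key = b"hypothetical-shared-secret"
body = b'{"type": "Create"}'
digest = digest_header(body)
signing_string = f"(request-target): post /inbox\nhost: example.com\ndigest: {digest}"
signature = sign(signing_string, key)

# Changing the body changes the digest, which changes the signing string,
# so the original signature no longer matches.
tampered_digest = digest_header(b'{"type": "Delete"}')
tampered_string = f"(request-target): post /inbox\nhost: example.com\ndigest: {tampered_digest}"
assert sign(tampered_string, key) != signature
```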

edent commented 9 months ago

I understand. But if someone can change the body, they can probably change the headers, right?

So validating the digest alone doesn't actually prove anything.

I realise it is part of the spec. But I think users should be warned not to rely on it and, instead, just do signature validation.

omz13 commented 8 months ago

Validating the body (via a hash of its content) ensures that somebody in the pipeline (like a reverse proxy) hasn't fouled that part. This is quick and cheap to perform and should be the first thing checked. If it is valid, then proceed to validate the Signature (which is far more involved) to check the request's bona fides.

Also note the header to check:

The "Digest" header is defined in RFC 3230 and has an expansive set of hash possibilities.

The "Content-Digest" header is defined in RFC 9530, which also obsoletes RFC 3230, and its set of hash possibilities is limited to sha-256 and sha-512.

Signing HTTP Messages (cavage-02 or higher) uses "Digest" from RFC 3230 §4.3.2.

HTTP Message Signatures (RFC 9421) uses "Content-Digest".

It should probably say somewhere that, if using RFC 9421, then @signature-params needs to include "content-digest".
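As a rough illustration of the difference between the two headers, here is a sketch that builds both values for the same body. The function names, the sample body, and the `hypothetical-key` key id are assumptions made for this example; the `Signature-Input` line only shows where "content-digest" would appear among the covered components.

```python
import base64
import hashlib

def legacy_digest(body: bytes) -> str:
    """Digest value in the form used with the cavage drafts: SHA-256=<base64>."""
    return "SHA-256=" + base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")

def content_digest(body: bytes) -> str:
    """Content-Digest value: a structured-field byte sequence, wrapped in colons."""
    return "sha-256=:" + base64.b64encode(hashlib.sha256(body).digest()).decode("ascii") + ":"

body = b'{"hello": "world"}'
print("Digest:", legacy_digest(body))
print("Content-Digest:", content_digest(body))

# Illustrative Signature-Input value; the label, timestamp and keyid are hypothetical.
signature_input = (
    'sig1=("@method" "@target-uri" "content-digest");'
    'created=1708858102;keyid="hypothetical-key"'
)
print("Signature-Input:", signature_input)
```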

edent commented 8 months ago

> Validating the body (via hash of its content) is to ensure that somebody in the pipeline (like a reverse proxy) hasn't fouled that part.

But if they have fouled that part, they could also have fouled the header as well.

So, it seems to me, checking the digest is functionally useless. It doesn't tell you anything because you're comparing your calculated digest against a digest which might have been compromised.

Therefore, does it make sense to skip it and just go straight to signature verification? That way you can check if the text has actually become corrupted.

nightpool commented 8 months ago

If I understand correctly, the question here is why we use a header like Digest instead of a pseudo-header like (body-digest) or similar in the signature params, correct?

The answer to this is simply composability/reuse: it's easier and simpler for the spec to reuse the Digest/Content-Digest header from another specification rather than invent an entirely new, one-off set of "how to create the digest of a body" rules specifically for the HTTP Signatures spec. Conceptually, the fundamental building block of the spec is the list of headers to be signed. Using Digest means that we can sign the body through the same paradigm without inventing any new special-cased logic.

edent commented 8 months ago

Not quite.

Here's what I'm trying to say.

There is no point in the server comparing their self-calculated digest with the digest in the header.
If the message has been tampered with en route, the headers may also have been compromised.
Comparing the server-calculated digest with the provided digest tells you nothing.

So, the server should skip comparing the digest and move straight to signature verification.

Let's say the signature header contains:

headers="(request-target) host date digest content-type",

That says, I need to concatenate several pieces of information and verify the signature.

In this case:

(request-target): post /inbox
host: example.com
date: Sun, 25 Feb 2024 10:48:22 GMT
digest: SHA-256=Hqu/6MR2imi8DTzbNp5PNEAFSyk0poN7+x5F+Z4vZMg=
content-type: application/activity+json

But where do I get that digest from? Again, the digest provided by the header may have been compromised en route. This means there's no point using the server-provided digest in signature verification.

So our server should calculate its own digest of the message and use that for signature verification.

This means at no point has the provided digest been used for anything.

I think my preferred solution would be to say something like "The message may come with a Digest header. This should be ignored and not used for comparison or signature verification. Instead, servers should calculate the digest of the message they've received and use that for signature verification."
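The proposal above (ignore the received Digest header and put a freshly computed digest into the signing string) could look roughly like this. It is a sketch under assumptions: HMAC-SHA256 with a made-up shared secret stands in for the sender's real key pair, and all function names are hypothetical.

```python
import base64
import hashlib
import hmac

def sha256_digest(body: bytes) -> str:
    """Digest header value in the SHA-256=<base64> form."""
    return "SHA-256=" + base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")

def build_signing_string(signed_headers, values):
    """Join 'name: value' lines in the order given by the headers= list."""
    return "\n".join(f"{name}: {values[name]}" for name in signed_headers)

def hmac_sign(signing_string: str, key: bytes) -> bytes:
    # Assumption: HMAC-SHA256 stands in for the sender's real key pair.
    return hmac.new(key, signing_string.encode("utf-8"), hashlib.sha256).digest()

def verify_with_recomputed_digest(body, values, signed_headers, signature, key):
    # Ignore the received Digest header; substitute one computed from the received body.
    recomputed = dict(values, digest=sha256_digest(body))
    expected = hmac_sign(build_signing_string(signed_headers, recomputed), key)
    return hmac.compare_digest(expected, signature)

key = b"hypothetical-shared-secret"
signed_headers = ["(request-target)", "host", "date", "digest", "content-type"]
body = b'{"type": "Create"}'
values = {
    "(request-target)": "post /inbox",
    "host": "example.com",
    "date": "Sun, 25 Feb 2024 10:48:22 GMT",
    "digest": sha256_digest(body),
    "content-type": "application/activity+json",
}
signature = hmac_sign(build_signing_string(signed_headers, values), key)

ok = verify_with_recomputed_digest(body, values, signed_headers, signature, key)

# An attacker rewrites both the body and the Digest header so that they match each other.
evil_body = b'{"type": "Delete"}'
evil_values = dict(values, digest=sha256_digest(evil_body))
tampered_ok = verify_with_recomputed_digest(evil_body, evil_values, signed_headers, signature, key)

print(ok, tampered_ok)  # prints: True False
```

In this sketch the tampered request fails because the signature was made over the original digest, which the attacker cannot reproduce without the signing key.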

tesaguri commented 8 months ago

You might not have meant it this way, but nightpool's point is relevant to real-world implementation design, not only to the elegance of the spec, and it justifies comparing against the digest header value (the very topic of this issue).

By decomposing the digest into a separate header, implementations can split the signature generation/verification logic into one layer that handles the digest and another layer that handles the signature, which can contribute to a clean design. Guidance against comparing the digest header value from the request would limit the possible implementation strategies for these layers.

That said, ignoring the provided digest as you suggested is still a possible approach, so my preferred wording would be "the provided hash ~~should~~ may be ignored".

omz13 commented 8 months ago

> So, the server should skip comparing the digest and move straight to signature verification.

> another possible approach to ignore the provided digest as you suggested, so my preferred wording would be the provided hash ~~should~~ may be ignored.

If a content hash is provided, and you ignore it, then you open the door for a man-in-the-middle attack (the signature will validate; changes to the body will go unchecked; nefarious actor wins).

a digest-esque header MUST be provided

the signature MUST include the digest-esque header

the digest-esque header is digest per RFC 3230 (using a SHA-256 hash)

the only thing to note is that digest is technically obsoleted by content-digest (because RFC 9530 obsoleted RFC 3230)... but I doubt anybody cares, because cavage-xx is technically replaced by RFC 9421 and everybody seems happy to continue using cavage-xx.

(Just because there is a newer, but not necessarily better, standard does not mean people will use it.)
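The man-in-the-middle concern raised in this comment arises when a verifier feeds the received Digest header into the signature check but never hashes the body itself. The sketch below contrasts that with a verifier that enforces the requirements listed above. HMAC-SHA256, the shared secret, and the names `naive_verify` and `strict_verify` are assumptions made for illustration.

```python
import base64
import hashlib
import hmac

def sha256_digest(body: bytes) -> str:
    return "SHA-256=" + base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")

def signature_ok(values, signed_headers, signature, key):
    # Assumption: HMAC-SHA256 stands in for real public-key verification.
    signing_string = "\n".join(f"{h}: {values[h]}" for h in signed_headers)
    expected = hmac.new(key, signing_string.encode("utf-8"), hashlib.sha256).digest()
    return hmac.compare_digest(expected, signature)

def naive_verify(body, values, signed_headers, signature, key):
    # Uses the received Digest header in the signing string but never hashes the body.
    return signature_ok(values, signed_headers, signature, key)

def strict_verify(body, values, signed_headers, signature, key):
    if "digest" not in signed_headers:                # the signature must cover the digest
        return False
    if values.get("digest") != sha256_digest(body):   # the digest must match the body
        return False
    return signature_ok(values, signed_headers, signature, key)

key = b"hypothetical-shared-secret"
signed_headers = ["(request-target)", "host", "digest"]
body = b'{"type": "Create"}'
values = {
    "(request-target)": "post /inbox",
    "host": "example.com",
    "digest": sha256_digest(body),
}
signing_string = "\n".join(f"{h}: {values[h]}" for h in signed_headers)
signature = hmac.new(key, signing_string.encode("utf-8"), hashlib.sha256).digest()

# The body is swapped in transit while the headers are left untouched.
evil_body = b'{"type": "Delete"}'
assert naive_verify(evil_body, values, signed_headers, signature, key)       # accepted
assert not strict_verify(evil_body, values, signed_headers, signature, key)  # rejected
assert strict_verify(body, values, signed_headers, signature, key)           # genuine request passes
```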

edent commented 8 months ago

> If a content hash is provided, and you ignore it, then you open the door for a man-in-the-middle attack (the signature will validate; changes to the body will go unchecked; nefarious actor wins).

I think this is the bit I'm confused about and would appreciate some help on.

If an attacker can change the content of the message, they can also change the content of the Digest header. Is that correct?

If so, what does the comparison show? Or is it just a defence in case they've altered one but not the other?

omz13 commented 8 months ago

> If an attacker can change the content of the message, they can also change the content of the Digest header. Is that correct?

Not necessarily: it depends on the architecture that generates the response and on the pipeline between the participants. A forward proxy, reverse proxy, HTTP proxy, etc. can each foul things in different ways (whether through malice or poor implementation). For instances using a service-oriented architecture, the more actors there are in that pipeline, the more "opportunities" there are for something to "happen".

> If so, what does the comparison show? Or is it just a defence in case they've altered one but not the other?

a digest of the body ensures that the body is valid.

the signature ensures that the header(s) specified have not been tampered with, ensuring that the digest header is valid.

yes, it is a layered (onion) approach in that a nefarious actor will have to foul both the digest header and the signature.

validate the digest first; then validate the signature.

proceed as applicable.

nightpool commented 8 months ago

@edent

> If so, what does the comparison show? Or is it just a defence in case they've altered one but not the other?

In this case, I think really it just increases debuggability and DX—if you compare the signatures directly, then when you get an error you have no way of telling where the error came from. If you compare the digests first, and then compare the signatures, you can give a more informative error message.
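One way to get the more informative errors mentioned here is to check the digest first and the signature second, raising a different exception for each failure. This is a minimal sketch: the exception classes and `verify_request` are hypothetical names, and HMAC-SHA256 again stands in for real public-key verification.

```python
import base64
import hashlib
import hmac

class DigestMismatch(Exception):
    """The body does not hash to the value in the Digest header."""

class SignatureMismatch(Exception):
    """The signature does not match the signed headers."""

def verify_request(body, values, signed_headers, signature, key):
    # Step 1: cheap check that the body matches the Digest header.
    computed = "SHA-256=" + base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")
    if values.get("digest") != computed:
        raise DigestMismatch("body does not match the Digest header")

    # Step 2: signature check over the listed headers, Digest included.
    # Assumption: HMAC-SHA256 stands in for real public-key verification.
    signing_string = "\n".join(f"{h}: {values[h]}" for h in signed_headers)
    expected = hmac.new(key, signing_string.encode("utf-8"), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, signature):
        raise SignatureMismatch("signature does not match the signed headers")
```

A caller can then log or report which of the two checks failed instead of a single generic verification error.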

swicg / activitypub-http-signature

Explain the purpose of the Digest header #28