Closed martinthomson closed 6 years ago
Do you mean:
Digest: MI-SHA256=base64==
to allow forward-compatibility with future hash functions?
There's some mismatch between the semantics of the top-level mi-sha256 hash and the meaning of the Digest header. Specifically,
The digest is computed on the entire instance associated with the message. The instance is a snapshot of the resource prior to the application of any instance manipulation or transfer-coding (see section 3).
The entity that would be returned in a status-200 response to a GET request, at the current time, for the selected variant of the specified resource, with the application of zero or more content-codings, but without the application of any instance manipulations or transfer-codings.
Since mi-sha256 is a content-coding, which can appear anywhere in the list of content-codings, the MICE spec would have to be careful to say how this new kind of digest is checked.
Side note: I've been thinking of mi-sha256 as a content encoding that's parameterised by the top-level hash: only the message matching that hash successfully decodes. The question in this issue is then, how do we best communicate the parameter of a content encoding, given that https://tools.ietf.org/html/draft-ietf-httpbis-semantics-02#section-6.1.2 doesn't allow them to take parameters directly.
@mnot, if you know someone from the CDN world who might have opinions on this re-use of the Digest header, could you point them over here?
There's some mismatch between the semantics of the top-level mi-sha256 hash and the meaning of the Digest header....
As mi-sha256 depends on the block size, we should probably add the block size, e.g.
MI-SHA256/1024=base64==
or
MI-SHA256=1024,base64==
In that way, mi-sha256 will be content-encoding independent.
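A minimal sketch of how a receiver might split the first proposed form (MI-SHA256/1024=base64==) into its parts. The syntax is only the proposal from this comment, not anything registered, and `MiDigest` and `parse_mi_digest` are hypothetical names used for illustration.

```python
import base64
from typing import NamedTuple, Optional


class MiDigest(NamedTuple):
    algorithm: str             # lower-cased algorithm token, e.g. "mi-sha256"
    block_size: Optional[int]  # None when no "/<size>" suffix is present
    value: bytes               # decoded digest bytes


def parse_mi_digest(field_value: str) -> MiDigest:
    """Parse the *proposed* "MI-SHA256/1024=<base64>" form (assumption)."""
    # Split at the first "=" only, because base64 padding also uses "=".
    name, _, encoded = field_value.strip().partition("=")
    algorithm, _, size = name.partition("/")
    block_size = int(size) if size else None
    return MiDigest(algorithm.lower(), block_size, base64.b64decode(encoded))
```

Leaving `block_size` as `None` when the suffix is absent keeps the plain `MI-SHA256=base64==` form parseable by the same function.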
Upload a file with MICE
PUT /files/image.png
Digest: MI-SHA256/1024=base64==
Log the mi-sha256/1024 header on the server
Verify the integrity after the fact against the logged mi-sha256/1024 value
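The upload, log, and verify steps above could be sketched roughly as follows. The proof chain follows my reading of draft-thomson-http-mice (the last record is hashed with a trailing 0x00, each earlier record with the following proof and a trailing 0x01), so treat it as an assumption to check against the draft. The `MI-SHA256/<rs>=` field format is the proposal from this comment, and all function names are hypothetical.

```python
import base64
import hashlib


def mi_sha256_top_level(payload: bytes, rs: int) -> bytes:
    """Top-level integrity proof over `payload` split into rs-byte records."""
    if rs <= 0:
        raise ValueError("record size must be positive")
    # An empty payload is treated as a single empty record (assumption).
    records = [payload[i:i + rs] for i in range(0, len(payload), rs)] or [b""]
    # Work backwards: the final record's proof has no successor.
    proof = hashlib.sha256(records[-1] + b"\x00").digest()
    for record in reversed(records[:-1]):
        proof = hashlib.sha256(record + proof + b"\x01").digest()
    return proof


def digest_field(payload: bytes, rs: int) -> str:
    """Value the client would send and the server would log."""
    value = base64.b64encode(mi_sha256_top_level(payload, rs)).decode("ascii")
    return f"MI-SHA256/{rs}={value}"


def verify_later(stored_payload: bytes, logged_field: str) -> bool:
    """Recompute the proof from the stored file and the logged field value."""
    name, _, _ = logged_field.partition("=")
    rs = int(name.partition("/")[2])
    return digest_field(stored_payload, rs) == logged_field
```

Because the record size travels inside the logged value, the later check needs only the stored bytes and the log entry.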
--
how do we best communicate the parameter of a content encoding, given that https://tools.ietf.org/html/draft-ietf-httpbis-semantics-02#section-6.1.2 doesn't allow them to take parameters directly.
@jyasskin (iiuc or forgive me and ignore) in #12 I suggest:
MI header, adding the mi-sha256 to the payload
Digest form to represent an actual resource digest
Which are the drawbacks of this approach?
Thanks to you all for your time!
The identification doesn't need to know the block size: if the block size that the digest assumes is wrong, the hash will simply not match.
@jyasskin is right: Digest is an entity hash while MI without the rs is not.
To use Digest we need to re-add rs to the header, like in https://tools.ietf.org/html/draft-thomson-http-mice-01#section-6.3.2 before #2
Pros of re-adding rs:
we can directly use the mi-sha256 as a content hash
MICE will provide both integrity on the wire and for entities
Cons:
rs information (both in the header and in the payload)
@ioggstream Moving the record size to the header is not sufficient to make the top-level MI digest a digest of the "instance" that the Digest header says it holds digests of. MI assumes an intermediate hash for every record-size bytes, and those intermediate hashes have to be transmitted somewhere. If those intermediate hashes are transmitted in the header, then no content encoding is needed, and Digest directly applies, but see FAQ 3. If the intermediate hashes are transmitted in the body, especially if a second content encoding is applied after mi-sha256, then the top-level hash is no longer a digest of the post-content-encoding instance.
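To make the mismatch concrete, here is a rough sketch that computes three different values for the same content: a plain SHA-256 of the unencoded bytes, the MI top-level proof, and a SHA-256 of the mi-sha256-encoded body. The encoding layout (an 8-byte big-endian record size, then records interleaved with the proof of what follows) is my reading of draft-thomson-http-mice and should be treated as an assumption. `mi_sha256_encode` and `compare_digests` are hypothetical names.

```python
import hashlib
import struct
from typing import Dict, Tuple


def mi_sha256_encode(payload: bytes, rs: int) -> Tuple[bytes, bytes]:
    """Return (top_level_proof, encoded_body) for rs-byte records."""
    records = [payload[i:i + rs] for i in range(0, len(payload), rs)] or [b""]
    proof = hashlib.sha256(records[-1] + b"\x00").digest()
    pieces = [records[-1]]  # built back to front, reversed at the end
    for record in reversed(records[:-1]):
        pieces.append(proof)   # proof covering everything after `record`
        pieces.append(record)
        proof = hashlib.sha256(record + proof + b"\x01").digest()
    body = struct.pack(">Q", rs) + b"".join(reversed(pieces))
    return proof, body


def compare_digests(instance: bytes, rs: int) -> Dict[str, bytes]:
    """The three candidate 'digests' discussed in this thread."""
    top_level, body = mi_sha256_encode(instance, rs)
    return {
        "sha256_of_unencoded": hashlib.sha256(instance).digest(),
        "mi_top_level": top_level,
        "sha256_of_encoded_body": hashlib.sha256(body).digest(),
    }
```

The same sketch also shows the record-size dependence: calling `mi_sha256_encode` on identical bytes with a different `rs` yields a different top-level proof, so a wrong assumed block size simply fails to match.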
@jyasskin
1- let the function m-hash: (payload, rs) -> (rs, top-level MI digest)
2- m-hash is based on all intermediate hashes
3- are you saying you need to transmit all intermediate hashes because 2 is too weak a condition for integrity?
In case of multiple encodings you're right. Just found a clarifying thread on Digest and instances here: https://lists.w3.org/Archives/Public/ietf-http-wg/2018AprJun/0197.html
@martinthomson and I chatted, and we're going to make this change. I can't promise to get it done before I go on leave in a couple weeks, so someone else is welcome to pick it up, or I'll start on it in November.
Hi @jyasskin @martinthomson is there a summary of the discussion?
It was just that my concern (which was only a concern, not an objection) didn't bother Martin, and he has lots more experience dealing with HTTP headers.
The terminology that Jeff proposed in 3230 was never adopted in HTTP, so that spec probably needs to be revised. The closest thing to "instance" in current HTTP is selected representation. If that's the semantics you're looking for (i.e., you can send a Digest header w/MICE for the "whole" response on a 304 or a 206 and it still makes sense), you should be fine.
N.B. Content-Encoding is a property of the representation.
If we re-do Digest to cover the "selected representation", I think my worry above goes away. It's just an additional yak to shave. 😜
I can ask this on the list once I'm back from leave, but do you know offhand if anyone else is interested in updating RFC3230, vs if we're only doing it to support MICE?
Well, Content-MD5 is deprecated, and Digest is kind of the go-to now. It probably could use a refresh.
The only issue (besides finding time) is untangling it from delta encoding; I'm not sure how hard it would be to make it compatible with both modern HTTP and delta.
Until then, I don't think it does harm to use it.
@mnot:
Digest using the 723* terminology?
It's not widely used. Revising it is a matter of specification work.
I don't think it's a big deal if you go ahead and use Digest without revising it*; we'll get to it eventually.
@mnot @martinthomson @jyasskin @LPardue and I are trying to refresh Digest, referencing RFC 7231, and make a cleaner RFC with some examples. Algos may reference mi-sha256 too.
Here's a gdoc. Feedback welcome! https://docs.google.com/document/d/1p8KBR_dQKfh7PgLTOYXg_htMAME8nnEblKRC0DSH5_o/edit?ts=5cb5aff0
The encoding would be different, but I can't see a reason not to use this:
Thanks to @ioggstream.