martinthomson / http-mice

A progressive integrity content encoding for HTTP

Allow to specify `rs` as an optional Digest parameter #16

Open ioggstream opened 5 years ago

ioggstream commented 5 years ago

I expect

Instead

Proposal

Digest: mi-sha256=...;rs=1024

Note

Questions

1) If you sign the mi-sha256 value but not the record size, isn't it easier for an intermediary to find a collision by varying the record size?
2) If 1) is infeasible, can we use mi-sha256(rs, unencoded-body) instead of sha256(unencoded-body) as an integrity proof?

jyasskin commented 5 years ago

I think it would be plausible to move the record size into the headers, but I wouldn't want to copy it because of the risk that the two copies might wind up with different values.

https://tools.ietf.org/html/rfc3230#section-4.2 doesn't allow instance-digests to have parameters, so I think we'd need to stash the record size either at the front of the <encoded digest output> or in the name of the content encoding, either of which seems fine to me.

I'm still not seeing the use case, though. If you want to verify an HTTP response, it seems reasonable to keep the original response around. If you want to verify the result of processing an HTTP response, it seems reasonable to stash some extra metadata in the application, which could include the original record size.

Regarding the questions, if the attacker can find a collision at all, I believe SHA-256 would be considered broken, whether or not they needed to adjust the record size to do it.

ioggstream commented 5 years ago

Hi @jyasskin, thanks for your reply!

About the ability to pass parameters to instance digest algorithms, can you help me to clarify the meaning of https://tools.ietf.org/html/rfc3230#section-4.1.1 ? This would be useful for https://github.com/ioggstream/draft-polli-resource-digests-http/issues/34

Digest algorithm values are used to indicate a specific digest computation. For some algorithms, one or more parameters may be supplied.

  digest-algorithm = token

The BNF for "parameter" is as is used in RFC 2616 [4]. All digest-algorithm values are case-insensitive.

martinthomson commented 5 years ago

This is an interesting question. I assume that this is the result of having saved or acquired some bits, and then being presented with an M-I hash without context. You can't use one set of bits to validate the other without knowing where the cut points are.
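To illustrate why the cut points matter, here is a minimal sketch of the proof computation as I understand it from the MICE draft: the last record is hashed with a trailing 0x00 octet, and every earlier record is hashed together with the proof of the following record and a trailing 0x01 octet. The function name `mi_sha256_proof` and the handling of an empty body are assumptions made for this example, not something taken from the draft.

```python
import base64
import hashlib


def mi_sha256_proof(body: bytes, rs: int) -> bytes:
    """Return the top-level proof for `body` split into records of `rs` octets.

    Hypothetical helper for illustration only.
    """
    if rs <= 0:
        raise ValueError("record size must be positive")
    # Split the body into records; treating an empty body as a single
    # empty record is an assumption of this sketch.
    records = [body[i:i + rs] for i in range(0, len(body), rs)] or [b""]
    # Final record: SHA-256(record || 0x00).
    proof = hashlib.sha256(records[-1] + b"\x00").digest()
    # Earlier records, walking backwards: SHA-256(record || next_proof || 0x01).
    for record in reversed(records[:-1]):
        proof = hashlib.sha256(record + proof + b"\x01").digest()
    return proof


if __name__ == "__main__":
    body = b"x" * 100
    for rs in (16, 64):
        # The same bytes produce a different proof for each record size.
        print(rs, base64.b64encode(mi_sha256_proof(body, rs)).decode())
```

Because the record boundaries feed into every intermediate hash, a verifier holding only the decoded bytes and the proof cannot recompute the value without also learning `rs`.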

The move in #2 was intentional. We agreed that it was important to be able to remove content codings without providing context. Though you might not be able to use the M-I without being provided with a hash, including the record size in a header allows the content to be used as long as you know that the content was M-I coded. I don't want to roll that back.

It seems to me that the answer we might need here is different than what you have both contemplated.

Rather than parameterize the key, it might pay to encode the record size in the proof as well. That way, you can validate content no matter how it was obtained. That introduces a cost in processing if the content is not M-I coded or it is coded with a different record size.

ioggstream commented 5 years ago

Rather than parameterize the key, it might pay to encode the record size in the proof as well

If you mean 1024/dcRDgR2GM35DluAV13PzgnG6+pvQwPywfFvAu1UeFrs= it could be ok.
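A rough sketch of how a recipient might handle such a value, assuming the `<rs>/<base64 proof>` layout suggested above. The layout itself is only a proposal in this thread, and the names `parse_prefixed_proof` and `verify_body` are hypothetical. The proof computation follows my reading of the MICE draft (last record hashed with 0x00, earlier records hashed with the next proof and 0x01).

```python
import base64
import hashlib
import hmac
from typing import Tuple


def mi_sha256_proof(body: bytes, rs: int) -> bytes:
    """Top-level proof over `body` split into `rs`-octet records (sketch)."""
    records = [body[i:i + rs] for i in range(0, len(body), rs)] or [b""]
    proof = hashlib.sha256(records[-1] + b"\x00").digest()
    for record in reversed(records[:-1]):
        proof = hashlib.sha256(record + proof + b"\x01").digest()
    return proof


def parse_prefixed_proof(value: str) -> Tuple[int, bytes]:
    """Split a hypothetical '<rs>/<base64 proof>' value into its two parts."""
    # Split on the first '/', since base64 output may itself contain '/'.
    rs_text, sep, encoded = value.partition("/")
    if not sep or not (rs_text.isascii() and rs_text.isdigit()):
        raise ValueError("expected '<record size>/<base64 proof>'")
    rs = int(rs_text)
    if rs == 0:
        raise ValueError("record size must be positive")
    return rs, base64.b64decode(encoded, validate=True)


def verify_body(value: str, body: bytes) -> bool:
    """Recompute the proof over the decoded body using the embedded record size."""
    rs, expected = parse_prefixed_proof(value)
    return hmac.compare_digest(mi_sha256_proof(body, rs), expected)
```

With this layout the decoded body can be checked on its own, at the cost of re-splitting it into records of the stated size.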

That introduces a cost in processing if the content is not M-I coded or it is coded with a different record size.

Yes, but only if you need to validate the checksum after having decoded the file.

martinthomson commented 5 years ago

If you mean 1024/dcRDgR2GM35Dl...

Something like that, yes.

jyasskin commented 5 years ago

If we encode the record size in the proof (which is what I meant by "stash the record size either at the front of the <encoded digest output>"), I think the web platform would want to reject the proof if the two record sizes didn't match or if the content wasn't mi-encoded, to avoid needing to store the entire response in those cases. But then we'd have situations where the two ways of checking the proof give different answers, which I'm not super comfortable with.

ioggstream commented 4 years ago

the web platform would want to reject the proof if the two record sizes didn't match

If that happens, it means somebody altered the header. In the http-signature use case, this means the signature validation fails well before the encoded content is validated, right?

or if the content wasn't mi-encoded

If the content is not mi-encoded, the Digest spec allows the recipient to skip validation: it doesn't have to reject.

the two ways of checking the proof give different answers

Can you provide some more context on that case?