mlswg / mls-protocol

MLS protocol
https://messaginglayersecurity.rocks
Other
233 stars 60 forks source link

Minor problem in transcript hash message formats #881

Closed TWal closed 1 year ago

TWal commented 1 year ago

tl;dr: I noticed something a bit inconvenient while working on security proofs, it does not require a change because we can get around the problem and we are so late in the standardization process, but I figured out I would not keep my discovery for myself in case it bothers other people.

The problem

While working on my MLS security proofs, I realized that the transcript hash message formats were a bit problematic.

confirmed_transcript_hash_[0] = ""; /* zero-length octet string */
interim_transcript_hash_[0] = ""; /* zero-length octet string */

confirmed_transcript_hash_[epoch] =
    Hash(interim_transcript_hash_[epoch - 1] ||
        ConfirmedTranscriptHashInput_[epoch]);

interim_transcript_hash_[epoch] =
    Hash(confirmed_transcript_hash_[epoch] ||
        InterimTranscriptHashInput_[epoch]);

Let's look at the problem on interim_transcript_hash. To reduce transcript hash collision to a hash collision, we need non-ambiguity of the hash input message format: for each bytestring given as input to the hash (i.e. confirmed_transcript_hash_[epoch] || InterimTranscriptHashInput_[epoch]), it corresponds to a unique pair (confirmed_transcript_hash, InterimTranscriptHashInput). It would be fine if confirmed_transcript_hash always had the same length, the hash input would be non-ambiguous. However, it is not the case, with the special case confirmed_transcript_hash_[0] = "", whose length is then differing. The same problem happens with confirmed_transcript_hash whose hash input is ambiguous.

To make this more concrete, if InterimTranscriptHashInput_[0] = confirmed_transcript_hash_[epoch] || InterimTranscriptHashInput_[epoch], then interim_transcript_hash_[0] = interim_transcript_hash_[epoch], which is problematic.

It's not that bad

We can note that interim_transcript_hash_[0] is never used. Indeed, the group initialization says:

So we only need to look at interim_transcript_hash hash input, which is defined as:

struct {
    /* same as opaque confirmation_tag<V>; */
    MAC confirmation_tag;
} InterimTranscriptHashInput;

In practice, MAC output a fixed-length tag, so the actual InterimTranscriptHashInput that will be used will have fixed-length, which allows to prove non-ambiguity of interim_transcript_hash's hash input message format.

How to do it better

I propose the following solutions, from the ones I prefer to the ones I like less:

bifurcation commented 1 year ago

RFC 9420 has been published, so we can't address this in the base MLS protocol. If we do a next version / revision, we could take this up.