tl;dr: I noticed something a bit inconvenient while working on security proofs. It does not require a change, because we can work around the problem and we are so late in the standardization process, but I figured I would not keep my discovery to myself in case it bothers other people.
The problem
While working on my MLS security proofs, I realized that the transcript hash message formats were a bit problematic.
Let's look at the problem on interim_transcript_hash. To reduce a transcript hash collision to a hash collision, we need the hash input message format to be non-ambiguous: each bytestring given as input to the hash (i.e. confirmed_transcript_hash_[epoch] || InterimTranscriptHashInput_[epoch]) must correspond to a unique pair (confirmed_transcript_hash, InterimTranscriptHashInput).
If confirmed_transcript_hash always had the same length, this would be fine: the hash input would be non-ambiguous.
However, this is not the case: in the special case confirmed_transcript_hash_[0] = "", the length differs.
The same problem happens with confirmed_transcript_hash, whose hash input is ambiguous too.
To make this more concrete, if InterimTranscriptHashInput_[0] = confirmed_transcript_hash_[epoch] || InterimTranscriptHashInput_[epoch], then interim_transcript_hash_[0] = interim_transcript_hash_[epoch]: both hash inputs are the same bytestring, so the transcript hashes collide without any collision in the underlying hash function.
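Here is a minimal sketch of that collision. It assumes SHA-256 as the hash and treats InterimTranscriptHashInput as an opaque bytestring; the function name and the placeholder values are mine, not from the spec.

```python
import hashlib


def interim_transcript_hash(confirmed_transcript_hash: bytes, interim_input: bytes) -> bytes:
    # Hash(confirmed_transcript_hash || InterimTranscriptHashInput), concatenated by hand.
    # SHA-256 is an assumption made for this sketch.
    return hashlib.sha256(confirmed_transcript_hash + interim_input).digest()


# Some later epoch: a 32-byte confirmed hash and an arbitrary serialized input
# (both are placeholder values).
confirmed_epoch = hashlib.sha256(b"some earlier transcript").digest()
input_epoch = b"\x20" + bytes(32)

# Epoch 0: the confirmed hash is the empty string, and the input is chosen to be
# the whole hash input of the later epoch.
confirmed_0 = b""
input_0 = confirmed_epoch + input_epoch

hash_epoch = interim_transcript_hash(confirmed_epoch, input_epoch)
hash_0 = interim_transcript_hash(confirmed_0, input_0)

# Two different (confirmed_transcript_hash, InterimTranscriptHashInput) pairs,
# one single hash input, hence equal transcript hashes.
print(hash_0 == hash_epoch)  # True
```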
It's not that bad
We can note that interim_transcript_hash_[0] is never used. Indeed, the group initialization says:
Derive the confirmation_key for the epoch as described in Section 8.
Compute a confirmation_tag over the empty confirmed_transcript_hash using the confirmation_key as described in Section 6.1.
Compute the updated interim_transcript_hash from the confirmed_transcript_hash and the confirmation_tag as described in Section 8.2
So we only need to look at the hash input of interim_transcript_hash, where InterimTranscriptHashInput is defined as:
struct {
    /* same as opaque confirmation_tag<V>; */
    MAC confirmation_tag;
} InterimTranscriptHashInput;
In practice, the MAC outputs a fixed-length tag, so the InterimTranscriptHashInput that is actually used has a fixed length, which makes it possible to prove non-ambiguity of the hash input message format of interim_transcript_hash.
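The fixed-length argument can be sketched as a parser that recovers the pair from the total length of the hash input alone. This assumes HMAC-SHA256 tags (32 bytes) and a one-byte length prefix for the tag; the constants and function names are hypothetical.

```python
import hashlib
import hmac
from typing import Tuple

HASH_LEN = hashlib.sha256().digest_size  # 32 bytes, assuming SHA-256
TAG_LEN = HASH_LEN                       # assuming an HMAC-SHA256 confirmation tag
INPUT_LEN = 1 + TAG_LEN                  # assumed one-byte length prefix, then the tag


def serialize_interim_input(confirmation_tag: bytes) -> bytes:
    # Serialize InterimTranscriptHashInput for a tag of the expected fixed length.
    if len(confirmation_tag) != TAG_LEN:
        raise ValueError("unexpected tag length")
    return bytes([TAG_LEN]) + confirmation_tag


def parse_hash_input(data: bytes) -> Tuple[bytes, bytes]:
    # Recover (confirmed_transcript_hash, InterimTranscriptHashInput).
    # The two legal total lengths differ, so the split is unique.
    if len(data) == INPUT_LEN:
        return b"", data  # epoch 0: empty confirmed hash
    if len(data) == HASH_LEN + INPUT_LEN:
        return data[:HASH_LEN], data[HASH_LEN:]
    raise ValueError("not a well-formed hash input")


tag = hmac.new(b"k" * 32, b"placeholder", hashlib.sha256).digest()
interim_input = serialize_interim_input(tag)
confirmed = hashlib.sha256(b"some transcript").digest()

first_epoch = parse_hash_input(b"" + interim_input)
later_epoch = parse_hash_input(confirmed + interim_input)
```

Since every accepted bytestring parses to exactly one pair, the crafted input from the earlier example is rejected: its InterimTranscriptHashInput would not have the expected length.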
How to do it better
I propose the following solutions, from the one I prefer most to the one I like least:
instead of doing the concatenation by hand, let's do it in the TLS presentation language: put opaque confirmed_transcript_hash<V>; in InterimTranscriptHashInput (same for ConfirmedTranscriptHashInput)
reverse the concatenation: since both InterimTranscriptHashInput and ConfirmedTranscriptHashInput have internal length tags, it is fine to concatenate arbitrary data after them, as is done with padding in PrivateMessageContent
initialize the first transcript hashes with values that have the same length as the output of the hash function
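To illustrate the first proposal, here is a rough sketch in which both fields are length-prefixed before hashing. It assumes SHA-256 and uses a 4-byte big-endian length prefix as a stand-in for opaque<V> (this is not the actual MLS variable-length encoding); all names are hypothetical.

```python
import hashlib
import struct
from typing import Tuple


def encode_opaque(data: bytes) -> bytes:
    # Stand-in for opaque<V>: 4-byte big-endian length, then the bytes (assumption).
    return struct.pack(">I", len(data)) + data


def encode_hash_input(confirmed_transcript_hash: bytes, confirmation_tag: bytes) -> bytes:
    # Both fields carry their own length, so no manual concatenation is needed.
    return encode_opaque(confirmed_transcript_hash) + encode_opaque(confirmation_tag)


def decode_hash_input(data: bytes) -> Tuple[bytes, bytes]:
    # Inverse of encode_hash_input; rejects truncated or trailing bytes.
    fields = []
    offset = 0
    for _ in range(2):
        if offset + 4 > len(data):
            raise ValueError("truncated length prefix")
        (length,) = struct.unpack_from(">I", data, offset)
        offset += 4
        if offset + length > len(data):
            raise ValueError("truncated field")
        fields.append(data[offset:offset + length])
        offset += length
    if offset != len(data):
        raise ValueError("trailing bytes")
    return fields[0], fields[1]


# Replay the earlier collision attempt with placeholder values.
confirmed_epoch = hashlib.sha256(b"some earlier transcript").digest()
tag_epoch = bytes(32)

later_input = encode_hash_input(confirmed_epoch, tag_epoch)
crafted_input = encode_hash_input(b"", confirmed_epoch + tag_epoch)

same_hash = hashlib.sha256(later_input).digest() == hashlib.sha256(crafted_input).digest()
```

Because the encoding can be decoded back to a unique pair, the two hash inputs are now different bytestrings, and making the hashes equal would require an actual hash collision.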