Clarify relationship between Ratchet Tree Evolution, Group Creation, and Key Schedule

knightcode commented 3 years ago

The terminology used in these sections is different, so that it's unclear if they're descriptions of the same operation with different levels of detail or entirely different operations performed at different times. I come in having read most of the original white paper, so that my intuition is that it's one operation performed against data stored in the tree structure. As such, I'm unclear how the init_secret->epoch_secret relates to the leaf_secret via the "number of secrets [that] are derived from the epoch secret for different purposes".

Also, The Group Creation section makes no reference to any of these secrets by name. I would kind of expect something called, init_secret, to be involved during group creation, but it doesn't seem like that's necessary.

is authenticated_data defined anywhere? Is it application level data to be included as desired?

knightcode commented 3 years ago

Can you add the definition of commit_secret definition to Ratchet Tree Evolution?

knightcode commented 3 years ago

In Encryption Keys, in the diagram copied below, the label, "secret", is passed to DeriveTreeSecret(), where in the preceding section, it was either "application" or "handshake". This is intentional?

ratchet_secret_[N]_[j]
      |
      +--> DeriveTreeSecret(., "nonce", N, j, AEAD.Nn)
      |    = ratchet_nonce_[N]_[j]
      |
      +--> DeriveTreeSecret(., "key", N, j, AEAD.Nk)
      |    = ratchet_key_[N]_[j]
      |
      V
DeriveTreeSecret(., "secret", N, j, KDF.Nh)
= ratchet_secret_[N]_[j+1]

knightcode commented 3 years ago

Could you specifically call out where the authentication tag created by AES-256-GCM is stored? Some implementations will need to find it and pass it in as a param to open().

knightcode commented 3 years ago

The group creation and external init parts of the doc each have discussions in two parts. It'd be easier to follow if they were consolidated. I would suggest moving the group creation section up to the other relevant parts and maybe introduce the other concepts in the context of group creation... that's how implementations would start.

knightcode commented 3 years ago

I'm not sure why there needs to be two ratchets per leaf node. It seems like they never get ratcheted at the same time, and they both get reset during the same event(s).

knightcode commented 3 years ago

It was easier for me to understand the resolution concept as just a kind of depth first search through the sibling's tree, favoring the left child, for a public or private key. If building an update path follows the addition of empty nodes, then for each step up the tree, the sibling's tree is not going to have filled any new key pairs yet, so that conducting a search at that moment is only going to resolve a public key from the previous epoch.

knightcode commented 3 years ago

What does this mean? Is part of the sentence missing?

Truncate the tree by reducing the size of tree until the rightmost non-blank leaf node

knightcode commented 3 years ago

Leaving it to applications to determine how many resumption secrets to maintain creates a problem with ExternalInit's. The external init is predicated entirely on public group state, so that the new member(s) will not have any resumption secrets from prior epochs as determined by the application. If one of those resumption secrets is employed, the group fractures.

Moreover, I'm not seeing how one resumption_secret alone is enough for a recovery operation... if that's one of its intended uses, which seems to be implied. There's not much to go on in the doc, but it seems to suggest the resumption secret is included in the key derivation schedule as a PSK. But the schedule also requires an init_secret and a commit_secret as inputs, which a recovering node cannot rely on to be in sync with the group.

knightcode commented 3 years ago

For members, A, C, and D of a 4 member group, where A performed the additions of B,C, and D and then broadcast a Welcome message to all the others, their respective trees would look like the following after each adds their own private keys locally:

         p[3]                  p[3]                   p[3]
        /     \              /      \               /     \
     p[1]       _          p[1]       _          p[1]       _
     /  \      / \         /  \      / \         /  \      / \
    p[A] B    C   D       A    B  p[C]  D       A    B    C p[D]

p[x] denotes that a public/private key pair resides at node x

If node performs an another tree update, e.g. adds another member, when computing the updatePath for node 3, it must select a public key from the subtree rooted at node 5 (the underscore). No key pair has been computed for node 5 yet, so that it would select the public key for node C and proceed. This, however, prevents node D from decrypting the path_secret for node 3 because D does not know the private key for C. If A is to encrypt the path_secret multiple times---each with a different key---what's the protocol? How many copies are enough? Does the number of copies have to grow with the size of the tree?

kkohbrok commented 3 years ago

Hi @knightcode,

thanks for raising all these issues. There's an effort under way to give the draft an overhaul from an editorial standpoint. That should help with some of the issues around definitions and structuring/sequence of sections.

Regarding DeriveTreeSecret: DeriveTreeSecret is just a function that's used in multiple places and the label changes depending on what kind of secret we're deriving.

Regarding storage of authentication tags: I'm not sure what you mean by "store" here? This sounds like an implementation specific question that the draft won't be able to answer.

Regarding two ratchets per leaf node: I assume you mean that from a leaf secret we derive a node secret and a path secret. This is to ensure that keys are cryptographically separate, such that they can be used by different cryptographic primitives. Cryptographic assumptions on primitives such as KDFs, authenticated encryption, etc usually assume that the key is only used with that primitive. As a consequence, proving the security of the protocol would likely require non-standard assumptions if one were not to separate keys properly.

Regarding truncation: The sentence is meant to express that, starting from the rightmost leaf, leaves are removed until the first non-blank leaf is reached.

Regarding resumptions and ExternalInits: Good catch! That should be added to the section on ExternalInit.

Recovery through resumption secrets: When creating a group, it can be linked to an existing group through the use of resumption secrets. Linking a new group to an existing group means that the creator follows the normal group creation process, i.e. creating a group and adding other members, sending Welcome messages, etc. However, when adding the new members, the group creator also injects a PSK derived from the resumption secret (from the existing group) into the key schedule. Thus, as per the "normal" group creation process, no other shared secrets are needed, excpet the PSK derived from the resumption secret for the step where the members are added.

Regarding you most recent example: If I understand the scenario correctly, I believe you are not computing the resolution properly. The node sending the update would encrypt the path secret of node 3 to the resolution of the blank node. The resolution of a blank node is computed by taking the resolution of its two children, which in this case are both C and D.

I hope that answered some of your questions. If you have questions regarding the draft in general, the mailing list mls@ietf.org is usually a better place to get answers. Also, if you're working on an implementation, there's a Wire group for discussions regarding implementation issues and interop between the individual implementations. See here for the links. We're always happy to see new implementations!

knightcode commented 3 years ago

By "two ratchets", I was referring to the secrets derived at the leaves of the secret tree... i.e. the final values computed by DeriveTreeSecret. There's one using a label of "application" and one using a label of "handshake". The former is ratcheted for application messages, and the latter for handshake messages. In the diagram I referenced above, the label switches to "secret"... probably from an earlier iteration of the doc. In any case, I don't see a reason to have two secrets here. It's more to maintain for little benefit.

Simply shifting leaves and then truncating the right side of the tree is sufficient to properly remove any arbitrary leaf from the tree? I guess maybe that works with the resolution computation you described, but that could like unmerge half the leaves between the departing index and the end.

For the AES-256-GCM authentication tag, for the crypto library in Node.js, the tag is retrieved with a separate call after the cipher is finalized (cipher.getAuthTag()), so that the ciphertext and the tag are two pieces of information that need to get formatted into a packet and transmitted. I understand that this is just one algorithm and others may only have the ciphertext. But for the ciphersuites that employ AES-256-GCM, two implementations might format the packets differently and be inoperable with each other. ..or there's already an agreed upon formatting that I don't know about. Moreover, the length of the tag is a configurable param for the algorithm, which should be specified somewhere.

I get it now that recovery is just group creation using the same group ID with a resumption secret thrown in. I kind of like it being an External Init with the resumption secret thrown in. One message instead of N messages.

Mailing lists are terrible at organizing information (not that I'm doing any better putting all my topics in one Issue).

bifurcation commented 2 years ago

Hi @knightcode , finally getting around to addressing these comments. Sorry it took a bit!

Notes from a first read-through (before looking at the exchange with @kkohbrok):

Terminology between Ratchet Tree Evolution, Group Creation, Key Schedule - I think this has been sorted out by some more recent changes.
Definition of commit_secret - This doesn't go in {{ratchet-tree-evolution}}, it is defined in the commit generation / handling sections
"secret" / "handshake" / "application" - Yes, this is intentional. The first derivation splits the application and handshake key chains, then both evolve via "secret" derivation
GCM authentication tag - We assume the RFC 5116 AEAD interface, which has a unitary ciphertext, not a separate ciphertext+tag. Added a note to clarify.
Group creation and external init parts each have discussions in two parts - Not immediately clear how to address this; will consider in a broader review.
Two ratchets per leaf node - Assuming you're talking about the path_secret vs. node_secret distinction, this is to assure that each secret is only used with one algortihm. I have added a note to clarify.
Easier to understand resolution as depth-first search - Added this to the description of resolution.
"Truncate the tree..." - Yep, missing part of the sentence. Reworded.
Resumption secret & ExternalInit - There's no connection between ExternalInit and resumption secrets (just external_pub), and in any case, the only epoch for which an ExternalInit can be constructed is the current one. Otherwise, you're forking history. You're correct that the resumption secret is not sufficient to join a group, and it's not meant to be. It is only proof of prior involvement with the group.
For members A, C, D, ... - Your representation of the secret knowledge here isn't quite right. C and D wouldn't know p[1], since it's not on their direct path. In your hypothetical, if A or B did an operation, they would encrypt to the resolution of node 5, that is to both C and D. This diagram seems useful for building intuition, though, so I've added something like it to the ratchet tree section.

========

Notes on responses to @kkohbrok:

The separate application and handshake ratchet to ensure that application and handshake messages don't fate-share -- if the application keys are compromised, that doesn't compromise the handshake keys. While this might not matter much in practice, I believe it helps with the security analysis.
Truncation is not how members are removed from the tree. That's done by blanking a member's leaf and parents. Truncation is just garbage collection. If you have a 1000-member group where leaf 1 is blank (starting at zero), and the member at leaf 2 removes members at leaves 3-1000, then you'll end up with a tree with three leaves (member A, blank, member B), and truncation will clear out the now-blank leaves 3-1000 and all their parents.
Yeah, there's unfortunate divergence in AES-GCM interfaces. The good news is that even if an implementation produces a different layout, as long as you can find the tag, you can rearrange things to match the desired layout here (ciphertext || tag).

mlswg / mls-protocol

Clarify relationship between Ratchet Tree Evolution, Group Creation, and Key Schedule #468