Need info about multiple recipients

dhh1128 commented 3 years ago

This section of the spec:

https://identity.foundation/didcomm-messaging/spec/#key-ids-kid-and-skid-headers-references-in-the-did-document

...contains an example and discussion only for a case where Bob is the single recipient of Alice's message, and Bob only has one key in the keyAgreement section of his DID doc. We need an example that shows Bob having multiple keys, and also a (probably separate) example where Bob and Carol are both recipients of the same message from Alice.

dhh1128 commented 3 years ago

tagging @vimmerru

baha-ai commented 3 years ago

Whether Alice (as a sender) has one or multiple keys in keyAgreement does not change the fact that the one used as skid (also apu, base64-encoded) is identified by its specific keyAgreement ID. I don't think the spec should dictate which one to use if there are multiple keyAgreement keys; the receiving agent should loop through the sender's DID doc to find a matching keyAgreement ID, or return an error if none is found.
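A minimal sketch of that lookup, assuming the DID doc is already resolved into a plain dict and that skid is an absolute DID URL (relative `#fragment` ids are not handled). The function name is hypothetical, not taken from any DIDComm library.

```python
from typing import Any


def find_key_agreement(did_doc: dict[str, Any], skid: str) -> dict[str, Any]:
    """Return the sender's keyAgreement entry whose id equals skid."""
    methods = {vm["id"]: vm for vm in did_doc.get("verificationMethod", [])}
    for entry in did_doc.get("keyAgreement", []):
        # An entry is either an embedded verification method or a reference (string id).
        vm = methods.get(entry) if isinstance(entry, str) else entry
        if vm is not None and vm.get("id") == skid:
            return vm
    # No match: the receiving agent reports an error instead of guessing a key.
    raise KeyError(f"no keyAgreement entry matches skid {skid!r}")
```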

baha-ai commented 3 years ago

I believe the spec should mention however that no more than 1 key must be used per recipient when building the message.

dhh1128 commented 3 years ago

no more than 1 key must be used per recipient when building the message

Thanks for your comments, @Baha-sk .

Just checking my understanding here.

It is a core requirement of DIDComm that if Bob (recipient) has 5 devices under his control, and Alice sends him a message, the message is decryptable by all 5 of Bob's devices, where each device has its own key in the keyAgreement section of Bob's DID doc. It is also a core requirement of DIDComm that Alice can send the same encrypted message to both Bob and Carol at the same time, using the same encryption envelope. (Yes, they might be wrapped differently, but the recipients need to see the innermost encrypted message at the end as exactly the same bytes.) So when you say "no more than 1 key must be used per recipient", are you saying that "no more than 1 SENDER" key must be used per recipient? And is it your intention to allow the sender to use a different key for Bob than for Carol, if Bob and Carol are both recipients of the same message? Or did you mean to say "no more than 1 SENDER key must be used per audience of a single plaintext message"?

baha-ai commented 3 years ago

It is a core requirement of DIDComm that if Bob (recipient) has 5 devices under his control, and Alice sends him a message, the message is decryptable by all 5 of Bob's devices, where each device has its own key in the keyAgreement section of Bob's DID doc.

if Bob has 5 devices, wouldn't he be using the same key on all 5 devices? Or should each device have its own key? If these devices communicate with a remote KMS that manages Bob's keys, then he can use the same key pair to identify all of Bob's devices. If on the other hand, each device uses a local kms where keys are only available locally, then your case makes sense @dhh1128. This also means Bob's DID document must be registered with all 5 KeyAgreement entries (1 for each device). Is this how we intend to manage keys for an agent?

My thinking was more about identifying 1 key for a recipient in the DID doc. This will simplify the interaction with the sender agent (Alice only needs to know 1 recipient key when contacting Bob, not 5) and reduce DID doc updates as well.

If we meant to add unique device keys in the DID doc, then clarifications around the lifecycle of the DID doc are needed:

Does agent Alice really need to know that agent Bob owns 5 devices?
What if Bob changes, throws away or adds a new device to his existing list, does Alice's agent need to be notified?
What happens to the original DID doc exchanged with Alice in this case? Wouldn't this require a new invitation?

So when you say "no more than 1 key must be used per recipient", are you saying that "no more than 1 SENDER" key must be used per recipient? And is it your intention to allow the sender to use a different key for Bob than for Carol, if Bob and Carol are both recipients of the same message? Or did you mean to say "no more than 1 SENDER key must be used per audience of a single plaintext message"?

I was referring to the recipient key to be a single key identifying the recipient in the list of recipients of the JWE message.

As for the sender key, a message can only have 1 key for the sender, shared/exposed to all recipients. Recipients' KDF and KW operations use the same sender key.

If a sender has multiple KeyAgreement entries in the DID doc, then a recipient must pull the one with ID found in skid (or APU b64 decoded). We only have 1 skid header in a message, so all recipients of a message use the same sender KeyAgreement entry for decryption. So even though a sender can have multiple keyAgreement entries in the DID doc, only one is used in a message.
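As a rough illustration of "skid (or APU b64 decoded)", the sketch below recovers the sender key ID from a JWE protected header. It assumes apu carries the base64url-encoded skid, as described in this comment; the helper names are made up for the example.

```python
import base64
import json


def b64url_decode(value: str) -> bytes:
    # JOSE strips the trailing "=" padding, so restore it before decoding.
    return base64.urlsafe_b64decode(value + "=" * (-len(value) % 4))


def sender_kid(protected_b64: str) -> str:
    """Return the single sender key ID that every recipient of the message uses."""
    header = json.loads(b64url_decode(protected_b64))
    if "skid" in header:
        return header["skid"]
    # Fallback assumed here: apu is the base64url encoding of the skid.
    return b64url_decode(header["apu"]).decode("utf-8")
```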

If we follow the same concept of multi devices support for the sender as you suggested above for the recipient, we need to answer the same questions as well:

Does a sender (Alice) agent need to identify the device on which she's sending her message to the recipients (Bob and Carol)? ie do we need to use a device (local) key or a main agent (remote) key identifying Alice?
What happens if Alice updates her list of devices? do we need a new DID doc/DID invitation?
Do the recipients need to be aware of all of Alice's device keys?

I guess you get the idea where I'm going with this .. these points need to be ironed out in the DIDcomm docs.

baha-ai commented 3 years ago

Another main issue with multi keys for a recipient is message size: if say Bob, Carol, Joe, etc. recipients all have 5+ devices, then the list of JWE recipients will be n recipients x m+ devices. The list and the JWE message will grow quickly. Whereas if we use only 1 key per recipient, the list of recipients is always n.

Let me know what you think @dhh1128.

dhh1128 commented 3 years ago

if Bob has 5 devices, wouldn't he be using the same key on all 5 devices? Or should each device have its own key?

No, Bob will not use the same key on all 5 devices. Each device has its own key. (There is nothing in DIDComm that prevents Bob from copying keys from one device to another, but key sharing is bad practice and should be discouraged; DIDComm MUST be designed on the assumption that a key never leaves the device where it was created.)

This is not controversial; it's been an oft-stated assumption for the past 3 years.

dhh1128 commented 3 years ago

Does agent Alice really need to know that agent Bob owns 5 devices?

Again, this has been debated and re-debated. Answer: Yes, sort of. Alice doesn't actually know that Bob has 5 devices; she knows that Bob has 5 keys. He could manage 3 of those keys on a single device for all that Alice knows. And Bob could have shared one of the keys with a dozen devices; Alice doesn't know that either. But what she does know is that Bob wants her to make every message that she sends him decryptable by 5 keys. That, she must know.

What if Bob changes, throws away or adds a new device to his existing list, does Alice's agent need to be notified?

Yes, absolutely -- if and when Bob wants that new device to be able to decrypt his messages uniquely. Until then, Bob has the option of copying a key from an existing device onto the new device and not telling Alice (a key management antipattern that I don't recommend -- but none of Alice's business), or of having that new device be incapable of participating in the relationship with Alice (a choice that enterprises will often make, perhaps, since they may have hundreds of agents but only use a handful in the Alice relationship), or of having a different agent decrypt the message and then send it to his new device without Alice having to worry about it.

What happens to the original DID doc exchanged with Alice in this case? Wouldn't this require a new invitation?

No. Bob isn't changing his DID; he's changing his DID doc. Alice should be re-resolving the DID regularly, and a new resolution should pick up the changed DID doc. This same question arises (and has the same answer) if Bob just rotated a single key.

dhh1128 commented 3 years ago

I was referring to the recipient key to be a single key identifying the recipient in the list of recipients of the JWE message.

I'm strongly opposed to this idea.

dhh1128 commented 3 years ago

If a sender has multiple KeyAgreement entries in the DID doc, then a recipient must pull the one with ID found in skid (or APU b64 decoded). We only have 1 skid header in a message, so all recipients of a message use the same sender KeyAgreement entry for decryption. So even though a sender can have multiple keyAgreement entries in the DID doc, only one is used in a message.

Got it. Agreed. Thank you for the clarification.

dhh1128 commented 3 years ago

Does a sender (Alice) agent need to identify the device on which she's sending her message to the recipients (Bob and Carol)? ie do we need to use a device (local) key or a main agent (remote) key identifying Alice?

Device keys aren't different from agent keys. Alice looks to Bob like she has 5 keys. She sends with one of them. No further special communication/semantics are needed.

What happens if Alice updates her list of devices? do we need a new DID doc/DID invitation?

This is no different than adding a key to her DID doc. We need a new DID doc, but not a new invitation/connection.

Do the recipients need to be aware of all of Alice's device keys?

Only the ones she's going to use to send them messages, and the ones she's going to use to decrypt messages.

dhh1128 commented 3 years ago

Another main issue with multi keys for a recipient is message size: if say Bob, Carol, Joe, etc. recipients all have 5+ devices, then the list of JWE recipients will be n recipients x m+ devices. The list and the JWE message will grow quickly. Whereas if we use only 1 key per recipient, the list of recipients is always n.

Doesn't matter. This feature is absolutely required for safe multidevice support. We accept n x m. (This is why we encrypt the message body once, with an ephemeral symmetric key, and then encrypt the ephemeral symmetric key once per recipient key. If the body of the message is a 40 GB payload, we should get a single encryption of the 40 GB payload, and then n x m copies of the tiny encrypted symmetric key. That feature, called "multiplex encryption", was always a feature of DIDComm. Have we lost it, such that we have to encrypt the 40 GB m times?)
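To put rough numbers on the n x m cost, here is a back-of-the-envelope sketch. The 72-byte default is an assumption: a 64-byte CEK (as for A256CBC-HS512) wrapped with AES Key Wrap, which adds 8 bytes. It ignores kid strings and base64url expansion.

```python
def envelope_overhead(keys_per_recipient: list[int], wrapped_key_bytes: int = 72) -> tuple[int, int]:
    """Return (number of JWE recipient entries, total bytes of wrapped CEKs)."""
    entries = sum(keys_per_recipient)  # equals n x m when every recipient has m keys
    return entries, entries * wrapped_key_bytes


# Bob, Carol and Joe with 5 device keys each.
entries, wrapped_bytes = envelope_overhead([5, 5, 5])
print(entries, wrapped_bytes)
```

Under these assumptions the recipients list grows by roughly a kilobyte for 15 keys, while the payload itself is still encrypted only once.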

baha-ai commented 3 years ago

Thanks for the clarifications @dhh1128, multiple keys per recipient it is then.

This means we should be using multiple KeyAgreement entries in the did Doc to represent unique keys, one for each device.

ashcherbakov commented 3 years ago

It seems we have multiple items here which are not reflected in the DIDComm spec consistently now:

1) Send to one recipient with multiple keys (devices)

Case: Alice sends a message to Bob, and Bob has M keys as M keyAgreement entries in his DID DOC. In this case Alice wants to encrypt the payload just once with a CEK, and then encrypt the CEK for each of Bob's keys ("multiplex encryption" as Daniel said).

Inconsistency in DID Comm spec:

It looks like the current version of DIDComm doesn't support this scenario (https://identity.foundation/didcomm-messaging/spec/#key-ids-kid-and-skid-headers-references-in-the-did-document says that There should be only 1 entry in the recipients of the envelope, representing Bob).
It doesn't specify how JWE message and headers should look in this case.
It's not clear what to use as kid and apv in this case.
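For reference, one possible shape for case 1 is the JWE General JSON Serialization from RFC 7516, with one recipients entry per keyAgreement key. The sketch below only assembles that structure from values computed elsewhere; putting the keyAgreement ID in each per-recipient kid is an assumption, and apv is deliberately left out because its value is the open question above.

```python
def jwe_general(protected: str, wrapped: dict[str, str], iv: str, ciphertext: str, tag: str) -> dict:
    """Assemble a multi-recipient JWE.

    `wrapped` maps each recipient kid to that key's base64url-encoded encrypted CEK.
    """
    return {
        "protected": protected,
        "recipients": [
            {"header": {"kid": kid}, "encrypted_key": encrypted_key}
            for kid, encrypted_key in wrapped.items()
        ],
        "iv": iv,
        "ciphertext": ciphertext,  # the payload appears once, whatever the number of keys
        "tag": tag,
    }


# Placeholder strings stand in for real base64url values.
envelope = jwe_general(
    protected="<protected-header>",
    wrapped={
        "did:example:bob#key-1": "<cek-wrapped-for-key-1>",
        "did:example:bob#key-2": "<cek-wrapped-for-key-2>",
        "did:example:bob#key-3": "<cek-wrapped-for-key-3>",
    },
    iv="<iv>",
    ciphertext="<ciphertext>",
    tag="<tag>",
)
```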

2) Send to multiple recipients with one key (device) Case: Alice sends a message to Bob and Carol. Bob and Carol have 1 key as 1 keyAgreement entry in their DID DOCs. Alice may consider two options here:

2A: send the same message to Bob and Carol separately: generate separate CEKs; mention only intended recipient in the message and headers. Nothing specific here.
2B: send the same message to Bob and Carol so that they both know that the message has been sent to both of them: generate the same CEK and encrypt the payload just once; make both Bob and Carol part of the message headers (authenticated by Alice). This scenario is described in https://datatracker.ietf.org/doc/html/draft-madden-jose-ecdh-1pu-04#appendix-B.

Inconsistency in DID Comm spec:

the DID Comm spec doesn't mention option 2A and doesn't provide recommendations when to use 2A and when 2B.
DID Comm spec does contain information how to implement 2B (https://identity.foundation/didcomm-messaging/spec/#ecdh-1pu-key-wrapping-and-common-protected-headers), but only for authcrypt (ECDH-1PU) case. It doesn't say anything about multiple recipients in anoncrypt case.

3) Send to multiple recipients with multiple keys (device) Once we figure out 1) and 2), this case will be trivial. Currently the same issues as in 1:

Inconsistency in DID Comm spec:

It doesn't specify how JWE message and headers should look in this case.
It's not clear what to use as kid and apv in this case.

baha-ai commented 3 years ago

Thank you @ashcherbakov for the feedback.

It looks like the current version of DIDComm doesn't support this scenario (https://identity.foundation/didcomm-messaging/spec/#key-ids-kid-and-skid-headers-references-in-the-did-document says that There should be only 1 entry in the recipients of the envelope, representing Bob).

Indeed, the spec needs to be updated. The recipient agent should not be limited to one key. I'll rectify the wording.

It doesn't specify how JWE message and headers should look in this case.

I have yet to create examples in the DIDComm project for this. We currently have (outdated) examples in Aries-RFC334, but I need to revise them with the new ECDH1-PU draft4 update. I have updated the readme example of that RFC though.

It's not clear what to use as kid and apv in this case.

I'm currently updating our AFGO codebase with these kid/skid values. The initial discussion we had was to use the KeyAgreement's verificationMethod.ID as the kid/skid. But I believe since Aries already has an RFC on how to represent a key in the DIDComm service block, we should probably be compatible and use the same format, i.e. set kid/skid with the did:key representation of the KeyAgreement's key value for the same reasons mentioned in the RFC. This will simplify the use of keys and avoid fetching the full DID doc and looping through the list of keyAgreements to find whether one of the IDs matches a kid.

I will update the docs to clarify kid/skid/apu/apv values.

3) Send to multiple recipients with multiple keys (device)

As per @dhh1128's earlier comments above, a recipient can have as many keys as needed in the recipients list; the sender should simply pick all of the recipient's keyAgreement entries in the DID doc (they must be set in the DIDComm service block as well). Should we update the docs to set this in stone?

kdenhartog commented 3 years ago

@ashcherbakov comment just spurred a thought in my head that I hadn't even considered. We may want to send messages to our "own" devices when also sending it to external domain recipients to architect a "sync" feature within multiple devices of a domain. Effectively, our other devices are just additional recipients in this design architecture.

This isn't the only way to achieve this (could do syncing after decryption), but it definitely is one we should be considering on this.

ashcherbakov commented 3 years ago

Another (quite obvious) comment regarding the case with multiple keys (devices): the keyAgreement entries may have different types, so not all of them can be compatible with the sender's one. In this case we may need to add a comment to the spec, that the sender should do encryption for all keys (keyAgreement entries) that have the same type as the sender's one.
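A sketch of that filtering rule, assuming each keyAgreement entry is a dict with a `type` field and that equal `type` is a good enough proxy for compatibility (a real check would likely compare curves as well). The function name is hypothetical.

```python
def compatible_keys(sender_key: dict, recipient_keys: list[dict]) -> list[dict]:
    """Keep only the recipient keyAgreement entries that match the sender key's type."""
    return [key for key in recipient_keys if key["type"] == sender_key["type"]]
```

Entries that are filtered out simply get no wrapped CEK, so the devices holding those keys cannot decrypt the message.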

kdenhartog commented 3 years ago

Another (quite obvious) comment regarding the case with multiple keys (devices): the keyAgreement entries may have different types, so not all of them can be compatible with the sender's one. In this case we may need to add a comment to the spec, that the sender should do encryption for all keys (keyAgreement entries) that have the same type as the sender's one.

+1 This should be explicitly stated that this scenario is not acceptable. Due to the cryptography during the KDF functions it won't be possible unless the sender manages multiple skid which is just too much complexity.

ashcherbakov commented 3 years ago

@kdenhartog it looks like this issue should have the PR Needed label, as the current spec lacks important details about multi-key and multi-recipient cases. Do you agree?

kdenhartog commented 3 years ago

+1 I think there's enough detail here and a decision made to make a PR at this point. Updated to reflect that. Thanks @ashcherbakov for your inputs on this.

ashcherbakov commented 3 years ago

@dhh1128 @Baha-sk @kdenhartog We discussed it internally with @vimmerru and got some recommendations you can consider:

It looks like there is a potential privacy disclosure and correlation issue if we use multiplex encryption for multiple DIDs (there is no issue to encrypt to multiple keys for one DID). Please find the issue description below, and the options how we can deal with this. I think that we will need to decide if we either

[Option1, Option2] support only authcrypt_anon (see below) where skid is always protected/encrypted, and the message can be sent to one DID only (but still be encrypted for all the keys related to the DID).
[Option3] support authcrypt_anon as well as authcrypt_open where multiplex encryption for multiple DIDs and unprotected skids are possible. In this case DID Comm spec must provide very clear instructions what are cons and pros of using every method; in what cases only authcrypt_anon should be used, in what cases authcrypt_open should or can be used.

Issue

DIDComm supports protection of the sender ID (see #219 for details; it can be achieved by encrypting the skid or by calling anon+auth crypt). But if we want to send a message to multiple recipients (N-wise DID for example) and use multiplex encryption for multiple DIDs, then all target DIDs are open and not protected. So, a mediator node immediately knows what DIDs have a connection.

Example:

Alice wants to send a message to Bob and Carol via multiplex encryption
Alice builds and encrypts the message, and sends it to Bob's mediator and Carol's mediator.
Bob's and Carol's mediator immediately know that Bob's DID and Carol's DID have a connection. This is a correlation.
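The correlation needs no cryptography at all. As a sketch, assuming each per-recipient kid is a DID URL of the form `did#fragment` (an assumption; a did:key-style kid would need different parsing), a mediator could list the DIDs addressed by one envelope like this:

```python
def visible_recipient_dids(jwe: dict) -> set[str]:
    """DIDs that anyone holding the envelope can read from the unencrypted kid headers."""
    return {entry["header"]["kid"].split("#", 1)[0] for entry in jwe["recipients"]}
```

Two or more DIDs in the result is exactly the "these DIDs have a connection" signal described above; several keys of a single DID collapse to one entry.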

Options how to deal with the issue

The issue above makes sense only when Alice wants to keep her skid secret. Otherwise the correlation can be done easily in any case. So, the proposal here is to consider two calls for authcrypt:

authcrypt_anon
- skid is always protected/encrypted
- the message can be sent to one DID only (but still be encrypted for all the keys related to the DID)
authcrypt_open
- skid is not protected/encrypted
- the message can be sent to multiple DIDs

Option 1: support authcrypt_anon only

DID Comm spec supports authcrypt only with protected skid, and hence doesn't allow multiplex encryption for multiple DIDs.
A sender needs to encrypt a message separately for every recipient (DID).

Pros:

no obvious correlation issues (I don't say that there are no correlation issues at all, that needs to be analysed; but at least correlation is not as easy as in other options)
less freedom for the user - less chances that user does something insecure or incorrect for his use case
any encryption algorithm can be used for authcrypt_anon including XC20P. This is because we use a unique CEK for every recipient, and hence the issue raised in Draft4 of ECDH-1PU (last paragraph in link) does not apply.

Cons:

have to encrypt the payload for every recipient DID. It can be quite bad if the message size is huge, however
- symmetric encryption operation is very fast
- it will not be very common to send huge messages. Huge messages are usually processed via stream-like API, but the current DID Comm spec and JOSE doesn't specify how to do this.
N-wise mentions Multicast case, which will not be possible as-is.

Option 2: support authcrypt_anon only + shared CEK

DID Comm spec supports authcrypt only with protected skid, and hence doesn't allow multiplex encryption for multiple DIDs.
It's possible to use the same CEK (and hence do encryption of the payload just once) when need to send a message to multiple DIDs. In this case the message is encrypted just once with the same CEK for all the recipients, but then CEK is encrypted for every target DID separately, and the header has information about just 1 target DID. As a result, the same ciphertext is sent to every recipient (DID) separately.

Pros:

less chances for correlation issues
less freedom for the user - less chances that user does something insecure or incorrect for his use case
payload can be encrypted just once in case of multiple recipients

Cons:

As the ciphertext is the same for all recipients, the correlation is still possible
N-wise mentions Multicast case, which will not be possible as-is.
as CEK is the same for all recipients now, we can use only A256CBC-HS512 for content encryption, as Draft 4 of ECDH-1PU says.
Need to exclude the recipients from the aad (we have all recipients as part of to JWM header in any case).

Option 3: support both authcrypt_anon and authcrypt_open

Both methods are supported by DID Comm spec
DID Comm spec provides very clear instructions what are cons and pros of using every method; in what cases only authcrypt_anon should be used, in what cases authcrypt_open should or can be used.

Pros:

If the user wants to keep privacy and avoid correlation, it's possible to do so by using authcrypt_anon (see corresponding pros and cons above)
If the user wants to do a multicast or do encryption for the payload just once for all recipients, it's possible to use authcrypt_open

Cons:

more freedom for the user, so more chances that the user does something insecure or incorrect for his use case
not possible to do multicast or do encryption for the payload just once for all recipients and avoid correlation problems.

vimmerru commented 3 years ago

@ashcherbakov There is one practical argument against multiple recipients. Different recipients may just have incompatible KeyAgreement entries in their DID Docs. For example, keys on different curves.

ashcherbakov commented 3 years ago

I would like to clarify a bit the comment from @vimmerru above. This is probably not an argument against multi-recipient in general, but a possible argument against multiplex encryption for multiple DIDs.

So, it can be considered as another advantage of Option 1 related to the problem with multiple keys of multiple types as described in #238.

Let's consider the following situation:

Alice has two keys: typeA and typeB
Bob has keys of typeA only
Carol has keys of typeB only
If Alice wants to send a message to Bob and Carol (N-wise), she can not do it via multiplex encryption for multiple DIDs, as there is no common key type (curve) that both Bob and Carol can use.
However, Alice can send the messages to Bob and Carol separately as defined in Option 1 just by using typeA key for Bob and typeB key for Carol
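The typeA/typeB situation can be sketched as follows. Keys are reduced to dicts with a `type` field, and both helper names are hypothetical; this only illustrates the selection logic, not any agreed spec behaviour.

```python
def pick_sender_key(sender_keys: list[dict], recipient_keys: list[dict]) -> dict:
    """Option 1: choose a sender key per recipient DID, or fail if none is compatible."""
    recipient_types = {key["type"] for key in recipient_keys}
    for key in sender_keys:
        if key["type"] in recipient_types:
            return key
    raise ValueError("no sender key shares a type with the recipient's keys")


def shared_sender_keys(sender_keys: list[dict], recipients: list[list[dict]]) -> list[dict]:
    """Multiplex across DIDs: sender keys usable for every recipient DID at once."""
    return [
        key for key in sender_keys
        if all(any(r["type"] == key["type"] for r in keys) for keys in recipients)
    ]


alice = [{"id": "did:example:alice#a", "type": "typeA"}, {"id": "did:example:alice#b", "type": "typeB"}]
bob = [{"id": "did:example:bob#1", "type": "typeA"}]
carol = [{"id": "did:example:carol#1", "type": "typeB"}]

key_for_bob = pick_sender_key(alice, bob)        # Alice's typeA key
key_for_carol = pick_sender_key(alice, carol)    # Alice's typeB key
multiplex_candidates = shared_sender_keys(alice, [bob, carol])  # empty: no common type
```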

The example above actually means that a real N-wise communication is not really possible, as Carol can not communicate with Bob due to incompatible keys. So, we should probably address this in the scope of #238. On the other hand, in some cases it can be OK and expected that only Alice sends messages to Bob and Carol, while Bob and Carol don't communicate with each other. Moreover, Alice needs to use the same DID for both of them (N-wise).

dhh1128 commented 3 years ago

I acknowledge the correctness of the issue description, and also the good thinking about alternatives.

I would check one important nuance, however. I believe that all of the privacy issues that we're discussing with an unprotected skid are only privacy issues with respect to a mediator that is immediately in front of Bob's final recipient. That's because layers of wrapping hide the final skid from everybody else in the route, correct? So imagine the following delivery routes with an unprotected skid:

1. Alice -> Bob: no privacy problem (Alice already knows skid)
2. Alice -> Bob's agency -> Bob: Bob's agency learns the skid
3. Alice -> Bob's agency -> Bob's cloud agent -> Bob: only Bob's cloud agent learns the skid

Is this analysis accurate?

In general, I am okay with Option 1 (supporting only authcrypt_anon, and thus encrypting separately for every recipient DID). The fact that this makes multiplex work for all of the keys used by one recipient DID, but NOT for multiple recipient DIDs does not cause me a lot of dismay. We already have to wrap separately for every recipient DID, so encrypting separately for every recipient DID is not going to make the code a lot different. (I am prepared to give up my concern about the technicality I was previously arguing, which was about whether all the recipients could confirm that other recipients were able to decrypt a message. I think it's okay if the sender just asserts in the to header that the message has been sent to other parties besides the immediate recipient.)

My one hangup is that I am not okay with re-encrypting external attachments multiple times. For small attachments, I think it's fine to re-encrypt them. But if an attachment is large (my 40 GB example above), I want to be able to incorporate it by reference in a DIDComm message (by setting its data.link and data.hash properties instead of its data.base64 property).

dhh1128 commented 3 years ago

Maybe we create a function in our impls that's something like prepare_external_attachment(40gb_data) -> symmetric_key, encrypted and then we add an extra field to attachments, data.symkey. Now I can encrypt big stuff once and put encrypted somewhere external (e.g., a CDN), but all recipients of the message get the same symmetric key to decrypt it on download. I think this is basically the way the shared CEK mechanism works, but avoids some of the downsides of Option 2?
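A sketch of what that could look like. Everything here is an assumption for illustration: the AEAD cipher is injected as a callable because the Python standard library has none, the hash is a plain hex SHA-256 of the encrypted bytes, and the field names (data.link, data.hash, and the proposed data.symkey) are copied from the comments above and would need checking against the spec's attachment format.

```python
import base64
import hashlib
import secrets
from typing import Callable


def prepare_external_attachment(
    data: bytes, aead_encrypt: Callable[[bytes, bytes], bytes]
) -> tuple[bytes, bytes]:
    """Encrypt a large payload once; return (symmetric_key, encrypted)."""
    symmetric_key = secrets.token_bytes(32)
    return symmetric_key, aead_encrypt(symmetric_key, data)


def attachment_by_reference(encrypted: bytes, link: str, symmetric_key: bytes) -> dict:
    """Describe the externally hosted ciphertext inside the (encrypted) DIDComm message."""
    return {
        "data": {
            "link": link,
            "hash": hashlib.sha256(encrypted).hexdigest(),
            # Proposed field: safe only because the message carrying it is itself encrypted.
            "symkey": base64.urlsafe_b64encode(symmetric_key).rstrip(b"=").decode(),
        }
    }
```

Each recipient DID still gets its own small encrypted message, but all of them point at the same uploaded ciphertext.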

vimmerru commented 3 years ago

My one hangup is that I am not okay with re-encrypting external attachments multiple times. For small attachments, I think it's fine to re-encrypt them. But if an attachment is large (my 40 GB example above)

I think for large attachments we can find easy solution:

Just make AEAD with new key.
Share key inside of body with all parties

AnomalRoil commented 3 years ago

Different recipients may just have incompatible KeyAgreement in DID Docs. For example, keys on different curves.

Well, that's annoying in the case of ECDH-1PU, but for ECDH-ES, it's not an issue. Maybe the spec might include something about what to do when a recipient isn't having a key compatible to do ECDH-1PU... I feel like using a nested signature isn't really an issue in that case, is it?

vimmerru commented 3 years ago

I feel like using a nested signature isn't really an issue in that case, is it?

The spec explicitly prohibits this for now. The reason is that you are losing the repudiation property of the protocol.

But there is a different solution. If Alice doesn't have key compatible with Bob she can just generate such key, but it will work only for 1 to 1 communication, not for multicasting.

AnomalRoil commented 3 years ago

The spec seems to allow it:

§ 14. Message Signing A DIDComm message can be signed, either in conjunction with encryption or independently if the message will remain unencrypted.

If a message is signed and encrypted to add non-repudiation, it must be signed prior to encryption. This is known as a nested JWM.

This doesn't seem like it's forbidden. But it should be noted it can hurt repudiation. (BTW the repudiation section might need some more data... @dhh1128 )

but it will work only for 1 to 1 communication, not for multicasting.

Can you elaborate, I'm not sure I'm understanding your point here?

vimmerru commented 3 years ago

Specification here https://identity.foundation/didcomm-messaging/spec/#anonymous-encryption says

Use of Anonymous Encryption SHOULD NOT be paired with a method of message authentication other than Authenticated Encryption as defined in this specification. Further discussion of message authentication can be found in the Implementation Guide.

See issue https://github.com/decentralized-identity/didcomm-messaging/issues/242

Can you elaborate, I'm not sure I'm understanding your point here?

Let's say Alice wants to send a message to Bob.

Alice has an ed25519 based key agreement key; Bob has a P-384 key only
Alice can create a new P-384 key and share it via her DID method
After this, Alice will be able to send the message

But if Alice wants to send a message to both Bob (P-384) and Carol (X25519), it will be impossible, as Bob and Carol don't have common crypto.

AnomalRoil commented 3 years ago

I thought DIDComm for multiple recipients would be based on key wrapping. Please see ECDH-1PU Appendix B: https://datatracker.ietf.org/doc/html/draft-madden-jose-ecdh-1pu-04#appendix-B

The idea is the classical hybrid KEK/DEK mechanism:

you have a so-called Content Encryption Key (CEK) you're generating at random.
you encrypt your message/payload, the "JWE plaintext" using that CEK to get your ciphertext.
you then compute your "agreed upon" keys with each recipient key, these are "key encryption keys" (KEK). This needs to be done carefully to avoid possible impersonations, see cctag in ECDH-1PU section 2.3.
you encrypt the CEK once with each KEK.
you bundle the ciphertext, plus all the ciphertexts for the CEK and you get your encrypted JWE meant for multiple recipients.
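The order of those steps can be sketched as below. This is a skeleton, not a working cipher suite: the three primitives (content encryption, KEK derivation, key wrapping) are injected as callables because the standard library provides none of them, and all names are hypothetical.

```python
import secrets
from typing import Callable, NamedTuple


class Sealed(NamedTuple):
    iv: bytes
    ciphertext: bytes
    tag: bytes


def multiplex_encrypt(
    plaintext: bytes,
    recipient_kids: list[str],
    encrypt_content: Callable[[bytes, bytes], Sealed],
    derive_kek: Callable[[str, bytes], bytes],
    wrap_key: Callable[[bytes, bytes], bytes],
    cek_len: int = 64,
) -> tuple[Sealed, dict[str, bytes]]:
    cek = secrets.token_bytes(cek_len)          # 1. random content encryption key
    sealed = encrypt_content(cek, plaintext)    # 2. encrypt the payload exactly once
    encrypted_keys = {}
    for kid in recipient_kids:
        kek = derive_kek(kid, sealed.tag)       # 3. per-key KEK; the tag is a KDF input
        encrypted_keys[kid] = wrap_key(kek, cek)  # 4. wrap the CEK once per KEK
    return sealed, encrypted_keys               # 5. bundle ciphertext plus wrapped CEKs
```

Content encryption has to come first because the authentication tag it produces feeds into every KEK derivation.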

So my point was that in step 3 you don't really need to have the same curve for all recipients.

baha-ai commented 3 years ago

So my point was that in step 3 you don't really need to have the same curve for all recipients.

this is true for anoncrypt where each recipient has a distinct epk that must be on the same curve as the recipient's key.

for authcrypt, all recipients keys and the epk must be on the same curve as the sender key since it's used in the kdf.

AnomalRoil commented 3 years ago

for authcrypt, all recipients keys and the epk must be on the same curve as the sender key since it's used in the kdf.

What is authcrypt for you? To me it meant "using ECDH-1PU".

And that's not clear to me that authcrypt has such a constraint if using ECDH-1PU as per its current draft 4. Especially if doing key wrapping.

As per the ECDH-1PU spec, section 2.1:

In Key Agreement with Key Wrapping mode, the JWE Authentication Tag is included in the input to the Key Derivation Function as described in section Section 2.3. This ensures that the content of the JWE was produced by the original sender and not by another recipient, as described in section Section 4.

So that's really the JWE Authentication Tag which is the output of the HMAC_SHA_512 MAC algorithm that is meant to be fed to the KDF as far as I understand and not the epk.

baha-ai commented 3 years ago

What is authcrypt for you? To me it meant "using ECDH-1PU".

that's right, for DIDComm we use 1PU with key wrapping mode only (not direct mode) to take advantage of message authentication for recipients (if a recipient doesn't have the correct key material for decryption, they can't process the message). 1PU KDF uses the sender key as input, once with an epk and once with the recipient key. You can only KDF using keys on the same curve, see third paragraph of section 4 Security Considerations.

So that's really the JWE Authentication Tag which is the output of the HMAC_SHA_512 MAC algorithm that is meant to be fed to the KDF as far as I understand and not the epk.

Correct, the authentication tag is just an input argument to the kdf in ECDH-1PU. This is why content encryption is done prior to recipient key wrapping.
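To make the KDF inputs concrete, here is a sketch of the Concat KDF (SHA-256, laid out as in RFC 7518 section 4.6.2) applied to Z = Ze || Zs. Appending the length-prefixed tag after keydatalen is my reading of section 2.3 of the ECDH-1PU draft and should be checked against it. The ECDH outputs Ze and Zs are taken as given byte strings, and the alg name and 256-bit key length are assumptions.

```python
import hashlib
import struct


def _length_prefixed(data: bytes) -> bytes:
    return struct.pack(">I", len(data)) + data


def concat_kdf(z: bytes, alg: bytes, apu: bytes, apv: bytes,
               keydatalen_bits: int, cctag: bytes = b"") -> bytes:
    supp_pub = struct.pack(">I", keydatalen_bits)
    if cctag:
        supp_pub += _length_prefixed(cctag)  # assumed placement of the JWE tag
    other_info = _length_prefixed(alg) + _length_prefixed(apu) + _length_prefixed(apv) + supp_pub
    output, counter = b"", 1
    while len(output) * 8 < keydatalen_bits:
        output += hashlib.sha256(struct.pack(">I", counter) + z + other_info).digest()
        counter += 1
    return output[: keydatalen_bits // 8]


def kek_1pu(ze: bytes, zs: bytes, apu: bytes, apv: bytes, tag: bytes) -> bytes:
    # Ze: ECDH(ephemeral key, recipient key); Zs: ECDH(sender key, recipient key).
    return concat_kdf(ze + zs, b"ECDH-1PU+A256KW", apu, apv, 256, tag)
```

Both Ze and Zs are computed against the recipient key, which is why the epk and the sender key need to be on the recipient key's curve.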

baha-ai commented 3 years ago

anoncrypt on the other hand, is ECDH-ES+KW mode. In this mode, each recipient has its own epk on the same curve as the recipient key.

Since each recipient has their own epk in ECDH-ES, anoncrypt messages can have recipients keys with different curves/types.

This is not possible in authcrypt since we use the sender key in the kdf as I mentioned earlier which constrains the recipients keys curve/type to be the same as the sender key.
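The rule described in this comment could be expressed as a small pre-flight check, sketched below in Python. This is only an illustration of the constraint as stated here (mixed curves allowed for anoncrypt, a single curve for authcrypt); the function name, the mode strings and the example key IDs are hypothetical.

```python
def incompatible_recipients(mode, recipient_curves, sender_curve=None):
    """Return the recipient kids whose curve breaks the rule stated above.

    recipient_curves maps a recipient kid to the curve of that key.
    """
    if mode == "anoncrypt":
        # ECDH-ES: every recipient gets its own epk on its own curve,
        # so mixed curves are acceptable.
        return []
    if mode == "authcrypt":
        if sender_curve is None:
            raise ValueError("authcrypt needs the curve of the sender key")
        # ECDH-1PU: the sender key enters the key agreement, so every
        # recipient key has to sit on the sender's curve.
        return [kid for kid, curve in recipient_curves.items()
                if curve != sender_curve]
    raise ValueError(f"unknown mode: {mode}")


recipients = {
    "did:example:bob#key-1": "X25519",
    "did:example:carol#key-1": "P-256",
}
anon_problems = incompatible_recipients("anoncrypt", recipients)
auth_problems = incompatible_recipients("authcrypt", recipients, sender_curve="X25519")
```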

AnomalRoil commented 3 years ago

Yes, but then it's fine: as per its Appendix B, multiple recipients are handled by encrypting with a randomly generated CEK, and it is that key which is then encrypted using ECDH key agreement where the sender and receiver keys are part of the KDF.

So:

In "direct communication with a single party", authcrypt requires feeding the sender key into the KDF, which generates the key used to encrypt the payload.
In a "multiple receiver message", key wrapping is used: we encrypt the payload using a random key, which in turn gets encrypted as above for each recipient.

This works with different curves for different recipients, as the "main payload" is encrypted using a random CEK. There is absolutely no overhead in supporting different curves for each recipient, as long as the kid contains the data about which of their keys was used.

See B.11 for a telling example:

The ciphertext is the same for all recipients; it's the encrypted payload.
Each recipient has a different "encrypted key" field holding the encrypted CEK, which lets them decrypt the ciphertext once they have decrypted the CEK, using the tag in the KDF as per section 2.1.
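For readers who have not looked at that example, the shape being described is the JWE General JSON Serialization from RFC 7516: one shared `ciphertext`, `iv` and `tag`, plus one `encrypted_key` per entry in `recipients`. A minimal Python sketch that only assembles the structure follows; the helper name, the DIDs and the angle-bracket strings are placeholders, not real values.

```python
import json


def general_jwe(protected_b64, iv_b64, ciphertext_b64, tag_b64, wrapped_keys):
    """Assemble a JWE in General JSON Serialization.

    wrapped_keys maps each recipient kid to its base64url-encoded wrapped CEK.
    """
    return {
        "protected": protected_b64,
        "recipients": [
            {"header": {"kid": kid}, "encrypted_key": encrypted_key}
            for kid, encrypted_key in wrapped_keys.items()
        ],
        "iv": iv_b64,
        "ciphertext": ciphertext_b64,  # shared by every recipient
        "tag": tag_b64,
    }


jwe = general_jwe(
    "<protected header>", "<iv>", "<ciphertext>", "<tag>",
    {
        "did:example:bob#key-1": "<CEK wrapped for Bob>",
        "did:example:carol#key-1": "<CEK wrapped for Carol>",
    },
)
print(json.dumps(jwe, indent=2))
```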

TelegramSam commented 3 years ago

At the risk of wading into the middle of a crypto discussion in which I am fairly unqualified to contribute much...

I think the right answer is to state in the spec that multi-party interactions must all have the same key type for it to work. Negotiation of such key type alignment is out of scope for the spec. Attempts to do this should return an error to the user about incompatible encryption key types.

(While Alice may be able to send to Bob and Carol, Bob not being able to converse directly with Carol can also be an issue that I'm not sure we want to solve in this version of the spec.)

AnomalRoil commented 3 years ago

I think the right answer is to state in the spec that multi-party interactions must all have the same key type for it to work.

That seems unnecessary to me, as it's not an actual requirement on the crypto side. I think we just need to agree what's the "multiple recipients" encryption method like and detail it in the spec some more. If it's as per ECDH-1PU Appendix B example, we don't have any issue with mismatched keys.

It would also fail to address the overarching issue: the spec is not clear about what should be done when we have multiple recipients, which is really the core issue here.

TelegramSam commented 3 years ago

That seems unnecessary to me, as it's not an actual requirement on the crypto side.

Uh, how is Bob and Carol not having compatible keys not a requirement?

It would also fail to address the overarching issue: the spec is not clear about what should be done when we have multiple recipients, which is really the core issue here.

I should have been more clear: The right answer to the "multiple recipients without matching keys" issue is as I described. We also need better text regarding multiple recipients to serve the main point of the ticket.

AnomalRoil commented 3 years ago

Uh, how is Bob and Carol not having compatible keys not a requirement?

The current spec mandates that both ECDH-1PU and ECDH-ES should be supported; it also mandates that the curves P-384, P-256 and X25519 must be supported.

So all users of DIDComm should logically have a key on one of these curves. If we are using key wrapping as the method for multiple-recipient encryption, which is roughly "many public keys are used to encrypt a single random CEK that is used to encrypt a single content", then for each recipient we have the guarantee that we can, in the worst case, generate a new key (ephemeral for ECDH-ES and long-term for ECDH-1PU) that is compatible with a key from that recipient.

In your example:

Alice wants to send a message to Bob and Carol, who have mismatched keys.
Alice generates a CEK at random.
She encrypts her message with it, and gets her authentication tag from the authenticated encryption algorithm.
She uses her authentication tag along with her own key as input to the KDF to generate an encryption key KEK_B out of Bob's public key. (Nothing to do with Carol's key in that step.)
She encrypts the CEK using KEK_B.
She adds the KEK_B-encrypted CEK to her message.
She uses her auth tag along with her own key as input to the KDF to generate an encryption key KEK_C out of Carol's public key. (Nothing to do with Bob here.)
She encrypts the CEK again, but using KEK_C this time.
She adds the KEK_C-encrypted CEK to her message.

Et voilà, Bob and Carol can both decrypt the CEK using their very own unrelated keys, each possibly using a different encryption algorithm. They can then both decrypt the same content using that same CEK. They both also get sender authentication, since the CEK they decrypted is sure to have come from Alice, and the auth tag authenticates the content.
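The steps above can be walked through in a toy Python sketch. Everything cryptographic here is a stand-in chosen so the example runs with the standard library only: modular-exponentiation Diffie-Hellman in two unrelated toy groups plays the role of two different curves, an XOR keystream plus HMAC-SHA-256 plays the role of the content encryption, XOR with the KEK plays the role of key wrapping, and a bare SHA-256 plays the role of the KDF. None of it is secure or spec-conformant, and all names are hypothetical. Note that the sketch gives Alice one static key per group, an assumption the steps leave implicit.

```python
import hashlib
import hmac
import secrets

# Two unrelated toy groups standing in for two different curves: (prime, base).
GROUPS = {"toy-A": (2**127 - 1, 3), "toy-B": (2**89 - 1, 3)}


def keypair(group):
    prime, base = GROUPS[group]
    private = secrets.randbelow(prime - 3) + 2
    return private, pow(base, private, prime)


def dh(group, private, peer_public):
    prime, _ = GROUPS[group]
    return pow(peer_public, private, prime).to_bytes(16, "big")


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def keystream(key, length):
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def derive_kek(ze, zs, tag):
    # Toy KDF: the real one is a Concat KDF with more inputs.
    return hashlib.sha256(ze + zs + tag).digest()


def encrypt_for(plaintext, sender_privs, recipients):
    """sender_privs: group -> private key; recipients: name -> (group, public key)."""
    cek = secrets.token_bytes(32)                           # random CEK
    ciphertext = xor(plaintext, keystream(cek, len(plaintext)))
    tag = hmac.new(cek, ciphertext, hashlib.sha256).digest()
    entries = {}
    for name, (group, public) in recipients.items():
        epk_priv, epk_pub = keypair(group)                  # epk in the recipient's group
        ze = dh(group, epk_priv, public)
        zs = dh(group, sender_privs[group], public)         # KeyError: no compatible sender key
        entries[name] = {"epk": epk_pub,
                         "encrypted_key": xor(cek, derive_kek(ze, zs, tag))}
    return {"ciphertext": ciphertext, "tag": tag, "recipients": entries}


def decrypt_as(message, name, group, private, sender_public):
    entry = message["recipients"][name]
    ze = dh(group, private, entry["epk"])
    zs = dh(group, private, sender_public)
    cek = xor(entry["encrypted_key"], derive_kek(ze, zs, message["tag"]))
    expected = hmac.new(cek, message["ciphertext"], hashlib.sha256).digest()
    if not hmac.compare_digest(expected, message["tag"]):
        raise ValueError("authentication failed")
    return xor(message["ciphertext"], keystream(cek, len(message["ciphertext"])))


alice = {group: keypair(group) for group in GROUPS}         # one static key per group
bob_priv, bob_pub = keypair("toy-A")
carol_priv, carol_pub = keypair("toy-B")
message = encrypt_for(
    b"hello Bob and Carol",
    {group: pair[0] for group, pair in alice.items()},
    {"bob": ("toy-A", bob_pub), "carol": ("toy-B", carol_pub)},
)
bob_view = decrypt_as(message, "bob", "toy-A", bob_priv, alice["toy-A"][1])
carol_view = decrypt_as(message, "carol", "toy-B", carol_priv, alice["toy-B"][1])
```

Both recipients recover the same plaintext from the same ciphertext, while each wrapped CEK depends only on the key material shared between Alice and that one recipient.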

Problems could arise if DIDComm is used to send messages to people who have a DID document without any key relying on a curve from the DID spec, such as secp256k1 for example.

TelegramSam commented 3 years ago

I'm sorry I was not clear and motivated a clear explanation. I fully understand that both Bob and Carol are capable of decrypting the message from Alice.

The issue I'm concerned with is Bob encrypting a message to Carol. Even if Alice can send to both Bob and Carol, there are many situations where Bob and Carol will need to send messages in response to Alice and each other.

I concede that there are some scenarios where that limitation is ok. I posit that there are many scenarios where that limitation is NOT ok. Placing the limitation I proposed in the spec will place those out of bounds for the time being, and allow us to avoid the complexities in the near term and solve these in a future version.

AnomalRoil commented 3 years ago

If Bob wants to send a message to Carol, he needs to look up her DID and might discover that they have a common algorithm. Otherwise, Bob can either:

use anoncrypt with an ephemeral key compatible with one of Carol's keys, adding a nested signature if auth is needed (possibly at the cost of deniability)
generate a new long term key for himself compatible with Carol's, hence solving the problem

The second option seems like the best to me.

This problem is exactly the same as when someone wants to send a message to a single party while not having any compatible key. I don't think this should be mixed with the multiple recipients encryption.

If Alice had no key compatible with Bob or Carol in my example above, she would have to behave the same way as if she were corresponding with a single party, just as Bob, who might want to talk to Carol after Alice's message, would have to go through the same process whether he's talking only to Carol or to both Alice and Carol.

How are you envisioning this in the 1:1 case when Alice wants to talk to Bob and they have no common algorithm? The solution shouldn't be "they cannot talk" IMO.

TelegramSam commented 3 years ago

The solution shouldn't be "they cannot talk" IMO.

I never suggested that they should not talk. I suggested that the process of negotiating keys is out of scope for this spec.

The solution (which I propose is out of scope for this spec) is a process very similar to what you describe, but carried out prior to engagement, lest Bob and Carol be unable to negotiate a common key.

Opinions? Should we be addressing the multiple-party non-common key type situation in the spec at this point? @dhh1128 @kdenhartog @awoie @Baha-sk @troyronda et al?

baha-ai commented 3 years ago

That seems unnecessary to me, as it's not an actual requirement on the crypto side. I think we just need to agree what's the "multiple recipients" encryption method like and detail it in the spec some more. If it's as per ECDH-1PU Appendix B example, we don't have any issue with mismatched keys.

The Appendix B example uses X25519 keys for Alice, Bob and Charlie. It wouldn't work if participants had different key types/curves due to kdf.

Our example implementation testing the decryption of this example is found in Aries Framework Go.

AnomalRoil commented 3 years ago

It wouldn't work if participants had different key types/curves due to kdf.

I don't see how this is the case.

Appendix B even contains language showing that recipients aren't all required to use the same key type:

Because Bob and Charlie are using the same curve (X25519), Alice reuses the same ephemeral key-pair for both recipients and includes it in the JWE Protected Header. If this was not the case, Alice should generate a separate ephemeral key-pair for each recipient and include it in each per-recipient header

Emphasis mine.

baha-ai commented 3 years ago

If this was not the case, Alice should generate a separate ephemeral key-pair for each recipient and include it in each per-recipient header

i.e. each unique epk has the same type/curve as Alice's key. And since the sender key is included in the KDF, all recipients must have the same key type, as per the condition in Section 4:

When performing an ECDH key agreement between a static private key and any untrusted public key, care should be taken to ensure that the public key is a valid point on the same curve as the private key.

So when deriving the key on the sender side, the two key agreement outputs used to build the combined Z are:

Ze = ECDH(epk privKey, recipient pubKey)
Zs = ECDH(sender privKey, recipient pubKey)

And the same on the recipient side when decrypting.

As per the constraint above, this means the recipient public key must be on the same curve as the private key (both sender and epk's private keys). Am I missing something from the text above?

AnomalRoil commented 3 years ago

No, you're right: the sender and the recipient must have the same key type for ECDH to work.

My point is that in the multiple recipients case, you only need to have a common key type between the sender and each recipient, not across all of them.

If you have 2 recipients, Alice needs to have a key type in common with Bob and a key type in common with Carol, but these can be two different keys on Alice side.

There's no need for all of them to be using X25519. Actually your Go code seems to agree with me, it's treating each recipient key individually.

baha-ai commented 3 years ago

My point is that in the multiple recipients case, you only need to have a common key type between the sender and each recipient, not across all of them.

This assumes using multiple sender keys in the same envelope. We only support 1 sender key for packing though. Technically you can call Pack() as many times as you'd like with a different sender key, but the recipients' keys must match the curve/type.

There's no need for all of them to be using X25519. Actually your Go code seems to agree with me, it's treating each recipient key individually.

The JWEEncrypt/JWEDecrypter is for general JWE building. So it does support each recipient individually (for anoncrypt since the sender key is not involved in the kdf, so each recipient can have their key with a distinct curve/type in this case) or 1 common epk for all recipients (for authcrypt since kdf involves the sender key, so all recipients must have the same curve/type in this case).

AnomalRoil commented 3 years ago

The KDF for the KEK involves the sender key. The CEK isn't coming from a KDF in the multiple recipients case.

ashcherbakov commented 3 years ago

@AnomalRoil I believe the main source of the long discussion above is that, as @Baha-sk mentioned, the DIDComm spec assumes that

We only support 1 sender key for packing though.

So, if Alice has key1 of type1 that she can use for Bob, and key2 of type2 that she can use for Carol, but there is no common key type between Alice, Bob and Carol, she cannot encrypt everything for both Bob and Carol in just one message (multiplex encryption), as Alice needs to choose either key1 or key2 as her static key (the skid field is not a list). So she has to use the same static key for DF with Bob and with Carol.

But even if Alice could specify multiple sender keys and do multiplex encryption for both Bob and Carol, we think that she should not do it, see https://github.com/decentralized-identity/didcomm-messaging/issues/218#issuecomment-889145630. It looks safer to encrypt the message to Bob and to Carol separately.
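One way to picture the consequence of a single skid per envelope is a planning step that groups recipients by the sender key they are compatible with, so that each group becomes its own authcrypt message. The Python sketch below only illustrates that grouping; the function name, the key IDs and the "first sender key per curve wins" choice are assumptions, not anything the spec prescribes.

```python
def plan_envelopes(sender_keys, recipient_keys):
    """Group recipient kids by the sender key (skid) that can serve them.

    sender_keys maps skid -> curve; recipient_keys maps kid -> curve.
    Each entry of the result corresponds to one separately packed message.
    """
    skid_for_curve = {}
    for skid, curve in sender_keys.items():
        skid_for_curve.setdefault(curve, skid)   # assumption: first key per curve wins
    plan = {}
    for kid, curve in recipient_keys.items():
        if curve not in skid_for_curve:
            raise ValueError(f"no sender key on curve {curve} for {kid}")
        plan.setdefault(skid_for_curve[curve], []).append(kid)
    return plan


plan = plan_envelopes(
    {"did:example:alice#key-x25519": "X25519", "did:example:alice#key-p256": "P-256"},
    {"did:example:bob#key-1": "X25519", "did:example:carol#key-1": "P-256"},
)
```

With these inputs the plan contains two envelopes, one per sender key. If Bob and Carol were on the same curve as one of Alice's keys, it would collapse to a single envelope.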

Options how to deal with the issue