decentralized-identity / didcomm-messaging

https://identity.foundation/didcomm-messaging/spec/
Apache License 2.0

Lack of session and expensive key agreements steps could hurt adoption #243

Open AnomalRoil opened 3 years ago

AnomalRoil commented 3 years ago

It appears that DIDComm is sessionless on purpose, see §1.3.2:

In a partially disconnected world where a communication channel is not assumed to support duplex request-response, and where the security can’t be ignored as a transport problem, traditional TLS, login, and expiring sessions are impractical. Furthermore, centralized servers and certificate authorities perpetuate a power and UX imbalance between servers and clients that doesn’t fit with the peer-oriented DIDComm.

DIDComm uses public key cryptography, not certificates from some parties and passwords from others. Its security guarantees are independent of the transport over which it flows. It is sessionless (though sessions can easily be built atop it). When authentication is required, all parties do it the same way.

At the same time, it appears that DIDComm is trying to make building synchronous RPC systems easy:

Because of this, the fundamental paradigm for DIDComm is message-based, asynchronous, and simplex. Agent X sends a message over channel A. Sometime later, it may receive a response from Agent Y over channel B. This is much closer to an email paradigm than a web paradigm.

On top of this foundation, it is possible to build elegant, synchronous request-response interactions. All of us have interacted with a friend who’s emailing or texting us in near-realtime. However, interoperability begins with a least-common-denominator assumption that’s simpler.

But currently we are requiring parties to do for each message:

(Note that these are built on the original ECDH protocol, which was meant to generate "shared session keys", and repurpose it into a mechanism for public-key encryption of messages without necessarily having any notion of a session.)

Being sessionless is nice because it allows stateless communication: we don't need to store the ephemeral key or the CEK we're using to communicate with someone, and we don't have to worry about long-lived keys, key rotation, IV reuse, and so on.

But using the above algorithms can also hurt us on the performance side and cause scaling issues: as pointed out by @vimmerru, ECDH-1PU uses 2 ECDH per participant key... which is worsened if we want to hide the sender (see #219) as it could require layering anoncrypt on top of authcrypt, thus introducing an extra ECDH step.

This is going to scale poorly with multiple recipients as is being discussed in #218. In particular instant group messaging is more or less doomed if using DIDComm, which is a bit counter-intuitive since this repo is named "DIDComm-messaging".

This is made even worse by the fact that we have relatively "high" security requirements: while X25519 performance might be acceptable, ECDH with P-384 is slow.

See for example this or that benchmark using crypto-bench:

test agreement::p256::generate_key_pair                       ... bench:       9,665 ns/iter (+/- 1,045)
test agreement::p256::generate_key_pair_and_agree_ephemeral   ... bench:      51,820 ns/iter (+/- 4,724)
test agreement::p384::generate_key_pair                       ... bench:     362,917 ns/iter (+/- 19,955)
test agreement::p384::generate_key_pair_and_agree_ephemeral   ... bench:     708,025 ns/iter (+/- 56,641)
test agreement::x25519::generate_key_pair                     ... bench:      19,816 ns/iter (+/- 1,746)
test agreement::x25519::generate_key_pair_and_agree_ephemeral ... bench:      73,051 ns/iter (+/- 7,357)

And that's on a "modern CPU" with AVX2 instructions, using a Rust implementation... It could surely handle a chat with 10 people having 4 devices each, as that's only 10*4*3=120 handshakes per message in a worst-case scenario. But that won't scale well with 100 people or more. And it might hurt in JS or on embedded devices well before that.

But notice that on more constrained devices, think IOT devices and such, one ECDHE key agreement can easily cost a few seconds, so sessionless DIDComm messaging would be totally impractical for such devices with our current design.
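As a rough illustration of the arithmetic above, here is a minimal cost model. It assumes 3 ECDH operations per recipient key (2 for ECDH-1PU plus 1 for an outer anoncrypt layer, which is one reading of the 10*4*3 figure) and reuses the crypto-bench timings quoted above. Those timings include ephemeral key generation, so they overestimate a bare ECDH; the function names are hypothetical.

```python
# Per-operation cost in nanoseconds, copied from the crypto-bench output above
# (generate_key_pair_and_agree_ephemeral, so key generation is included).
AGREE_NS = {
    "p256": 51_820,
    "p384": 708_025,
    "x25519": 73_051,
}

# Assumption: 2 ECDH for ECDH-1PU + 1 ECDH for an outer anoncrypt layer.
ECDH_PER_RECIPIENT_KEY = 3


def handshakes_per_message(people: int, devices_each: int) -> int:
    """Worst-case number of ECDH operations to encrypt one message."""
    return people * devices_each * ECDH_PER_RECIPIENT_KEY


def estimated_ms(people: int, devices_each: int, curve: str) -> float:
    """Estimated sender-side key agreement time for one message, in ms."""
    return handshakes_per_message(people, devices_each) * AGREE_NS[curve] / 1e6


for people in (10, 100):
    count = handshakes_per_message(people, 4)
    costs = {curve: round(estimated_ms(people, 4, curve), 1) for curve in AGREE_NS}
    print(f"{people} people x 4 devices: {count} ECDH, ms per message: {costs}")
```

Under these assumptions the cost grows linearly with the number of recipient keys, so the choice of curve only changes the constant factor, not the scaling.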

What are your thoughts?

@vimmerru proposed a few options such as:

  1. Cache ECDH result for permanent keys
  2. For anoncrypt(authcrypt()) mode, reuse the ephemeral key for both operations
  3. Use a KDF that allows long-lived keys with random nonces (for example, XC20P)
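Option 1 could be sketched roughly as follows. This is only an illustration of the caching idea, not an implementation of ECDH-1PU: `fake_agree` is a hash-based stand-in for a real static-static ECDH call, and the class and key identifiers are hypothetical. Note that in ECDH-1PU only the static-static agreement could be cached this way; the ephemeral-static agreement still has to be computed for every message.

```python
import hashlib
from typing import Callable, Dict, Tuple

# Hypothetical signature: (sender static key id, recipient static key id) -> shared secret.
AgreeFn = Callable[[str, str], bytes]


def fake_agree(sender_kid: str, recipient_kid: str) -> bytes:
    """Stand-in for an expensive static-static ECDH. NOT real key agreement."""
    return hashlib.sha256(f"{sender_kid}|{recipient_kid}".encode()).digest()


class StaticEcdhCache:
    """Memoizes the agreement result between two long-lived keys."""

    def __init__(self, agree: AgreeFn) -> None:
        self._agree = agree
        self._cache: Dict[Tuple[str, str], bytes] = {}
        self.misses = 0  # counts how often the expensive call actually ran

    def shared_secret(self, sender_kid: str, recipient_kid: str) -> bytes:
        key = (sender_kid, recipient_kid)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._agree(sender_kid, recipient_kid)
        return self._cache[key]

    def forget(self, kid: str) -> None:
        """Drop every entry involving a rotated or revoked key."""
        self._cache = {k: v for k, v in self._cache.items() if kid not in k}


cache = StaticEcdhCache(fake_agree)
first = cache.shared_secret("did:example:alice#key-1", "did:example:bob#phone")
second = cache.shared_secret("did:example:alice#key-1", "did:example:bob#phone")
```

The trade-off is that the cache is state the sender has to protect and invalidate on key rotation, which is exactly what a stateless design avoids.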

On my side I can clearly see why we might want to remain stateless, thus making options 1 and 2 inapplicable... but this doesn't prevent us from having sessions based on some kind of deterministic KDF steps. One option we might want to consider is a deterministic KDF triggered by data we keep within the message, which would allow us to (cheaply) re-derive the same keys and thus keep the ability to have session-like communication, but that complicates things. (Also note that with elliptic curves it is easy to derive child keys from a parent key.)

So, are we aiming at being stateless or not? If not, the above proposals seem like the best options to me, but they require us to introduce the notion of a session in the specification, in my opinion.

kdenhartog commented 3 years ago

This is a great writeup highlighting the tradeoffs we've accepted by way of the crypto we've selected. I think this is something we should at least mention in the text, and potentially include implementation notes about, in order to make it clearer how it's expected to be handled.

TelegramSam commented 3 years ago

Great roundup on the topic. Thanks for that writing and work.

But that won't scale well with 100 people or more.

This is a good conversation. This being a messaging protocol doesn't imply scaled group messaging. We have not discussed our targets for group messaging efficiency. I've been thinking of large group messaging (larger than 10 maybe?) in a different way.

I'll have this on the agenda for Monday's WG meeting.

dhh1128 commented 3 years ago

DIDComm's multi-recipient feature is intended for a small number of recipients (e.g., 2-5). It is NOT intended for a group of 100. For that, we expect to use a hub-and-spoke topology (send a message to a hub; the hub resends to all the other parties). That is how most group chat technologies work.

This is discussed in two places that are part of the background tribal knowledge of the DIDComm community -- but not anywhere in the spec. Maybe we should add some of this to the spec.

Here are those two "tribal knowledge" sources:

Regarding the session-vs-sessionless question: I don't think we are going to convert to session-based. There is an idea of a "connection" in DIDComm that has some session-like properties -- but it is NOT the cryptographic session that Yolan is talking about here. It was much stronger in DIDComm v1, but has been deliberately watered down by the zero-round-trip/sessionless ideas in DIDComm v2. So going session-based may be dead in the water. However, the argument that a single key negotiation could take seconds on cheap hardware does give me pause. @AnomalRoil , is there any way to get hard data on that (e.g., "On this common smart card, the algorithm as currently proposed will take X millisecs.")?

I am quite intrigued by @vimmerru's suggestion that when we do anoncrypt(authcrypt()), we reuse the ephemeral key. I'm also intrigued by @AnomalRoil 's suggestion "have a deterministic KDF being triggered by data we keep within the message and that allows us to re-derive (cheaply) the same keys and thus keep the ability to have session-like communication." We already have the concept of a thread in DIDComm -- and this thread has a unique identifier with reasonable entropy. Could we use it to re-derive the same keys?

baha-ai commented 3 years ago

Note: anoncrypt(authcrypt()) means authcrypt() with a KDF for the recipients and skid for the inner payload (i.e., the generated epk is for the authcrypt operation alone); the output of this encrypted message is then passed as the payload to an anoncrypt() call, which does a totally new KDF (i.e., generates a totally new, independent epk for each recipient).

Anoncrypt() and Authcrypt() are 2 independent calls with the former using ECDH-ES+KW and the latter ECDH-1PU+KW.

Also note that the anoncrypt() result may be sent to mediators who do not know nor do they care who the sender is while the authcrypt() is intended for end recipients who must authenticate the sender. So the list of recipients in each call may vary.

Let's not talk about re-using/recycling epks or key derivations between calls please.

vimmerru commented 3 years ago

Also note that the anoncrypt() result may be sent to mediators who do not know nor do they care who the sender is while the authcrypt() is intended for end recipients who must authenticate the sender. So the list of recipients in each call may vary.

@Baha-sk I disagree with this statement. anoncrypt(authcrypt()) is a valid container, for now used to hide sender info from mediators to get basic privacy, and the recipients will be the same for both layers. For now it is the only way to do this. I think protecting the skid header is much better for 90% of use cases, but that is a separate issue.

Here you are talking about Forward, but that is a completely different message, with the wrapped message inside its body.

vimmerru commented 3 years ago

@AnomalRoil Could the interactive handshake mode in ECDH-1PU be a solution? See https://tools.ietf.org/id/draft-madden-jose-ecdh-1pu-01.html#rfc.section.3

AnomalRoil commented 3 years ago

@vimmerru interactive handshakes require being stateful... But it also mentions that the resulting CEK can be reused, which seems not to be the goal with DIDComm (?).

It's still not clear to me whether DIDComm is meant to work as a stateless protocol or not in between messages.

Let's not talk about re-using/recycling epks or key derivations between calls please.

Is there already a consensus on the topic or is that a topic that was already discussed a lot?

dhh1128 commented 3 years ago

It's still not clear to me whether DIDComm is meant to work as a stateless protocol or not in between messages.

@AnomalRoil : The paradigm here is to communicate between Alice and Bob, even when Alice and Bob each have multiple devices. So when Alice sends a message to Bob, she encrypts it for the key on Bob's iPhone as well as the key on his tablet. Now, suppose Alice's first message is seen only on Bob's iPhone, and the second message is seen only on Bob's tablet. (Maybe Bob lost his phone in the couch cushions in between the two events.) How would it make sense to establish a session in such a case? Wouldn't that just make the protocol more brittle?

baha-ai commented 3 years ago

@Baha-sk I disagree with this statement. anoncrypt(authcrypt()) is a valid container, for now used to hide sender info from mediators to get basic privacy, and the recipients will be the same for both layers. For now it is the only way to do this. I think protecting the skid header is much better for 90% of use cases, but that is a separate issue.

Here you are talking about Forward, but that is a completely different message, with the wrapped message inside its body.

You're right @vimmerru, the Forward message is different; I did mix up Anoncrypt(Authcrypt()) with Forward messaging. I take this statement back.

Is there already a consensus on the topic or is that a topic that was already discussed a lot?

@AnomalRoil Anoncrypt and Authcrypt have existed since the onset of DIDComm. When the sender's identity must not be revealed, Anoncrypt should be used; otherwise use Authcrypt by default. The novelty here came from the need to protect the skid, a protected header introduced in ECDH-1PU, which is targeted for DIDComm V2 as the Authcrypt key derivation algorithm.

One proposed solution was to add a special encrypted_skid header, but this requires another cipher operation and is not standard. So to resolve the issue, the idea of doing Anoncrypt(Authcrypt()) was accepted, but there's no consensus on sharing key derivations between the calls, as they're 2 separate operations.

Also note that Anoncrypt (ECDH-ES) uses an epk per recipient, while Authcrypt (1PU) has only one epk shared by all recipients, since the cipher tag is used in the KDF to prevent an untrusted recipient from impersonating the sender. So sharing key derivations between the two calls doesn't necessarily work, and it deviates from standard, commonly known crypto algorithms.

AnomalRoil commented 3 years ago

Ah, I think there might be a misunderstanding there.

I'm not talking about reusing keys between calls for the same message, but more about "what do we do when messages are being sent back and forth". So, as soon as we have a "session" or a "thread" of messages, are we re-using the same CEK, or are we deriving new keys for each message? What is the goal?

For instance, ECDH-1PU has a notion of "interactive" handshakes as pointed out by @vimmerru which says:

After the initial message and a reply have been exchanged, the two parties may communicate using the derived key from the second message as the encryption key for any number of additional messages

But that's not necessarily in line with DIDComm goals as far as I can tell. So, these interactive handshakes are not meant to be supported in DIDComm, are they?

Maybe the answer to the question "what do we advise people to do when they want to establish a (somewhat synchronous) session" might just be "establish a shared secret using DIDComm and then establish a secure channel using something other than DIDComm"... But then I'm not sure why people doing that would use DIDComm in the first place, since most secure channel protocols already have some kind of key agreement protocol.


@dhh1128 regarding your question about threads, I guess using the thid or pthid to derive keys might work. It would certainly be possible to rely on these to derive ephemeral keys that can be regenerated on demand when a message addressed to that ephemeral key arrives, if we want to stay stateless.

That being said it could easily hurt forward secrecy depending on how it's done, and therefore is not necessarily a good idea.
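To make the idea concrete, here is a minimal sketch of deriving a per-thread key with HKDF-SHA256 (RFC 5869) from a secret both parties already share and the thid. The `thread_key` helper, the info label, and the choice of HKDF are assumptions for illustration only; nothing here comes from the DIDComm spec, and it assumes the thid is available to whichever layer performs the derivation.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract, then expand."""
    # Extract: an empty salt is replaced by HashLen zero bytes, per the RFC.
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    # Expand: T(n) = HMAC(PRK, T(n-1) || info || n)
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def thread_key(shared_secret: bytes, thid: str) -> bytes:
    """Deterministically re-derive a key for one thread (hypothetical labeling)."""
    return hkdf_sha256(shared_secret, salt=b"", info=b"example-thread-key|" + thid.encode())


# Placeholder secret for demonstration; in practice this would come from a key agreement.
secret = b"\x01" * 32
key_a = thread_key(secret, "thread-1234")
key_b = thread_key(secret, "thread-5678")
```

Either party can recompute `key_a` from the thid alone without storing it, which keeps the endpoint stateless. The forward secrecy concern is visible here too: anyone who later obtains `shared_secret` can re-derive the key of every past thread.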

AnomalRoil commented 3 years ago

Oh, @vimmerru interactive handshake was actually removed in draft 2 of the ECDH-1PU draft, so that's not really something we need to be concerned about I guess.

But still, my question about whether DIDComm communications are meant to be stateless or not remains. It seems the notion of a thread with thid and pthid would make them "optionally" stateful, but the rest of the spec seems to imply that the "default" behaviour is stateless.

baha-ai commented 3 years ago

thid and pthid are at higher level than the encryption layer.

They're part of the payload to be encrypted, the encryption layer building the envelope (JWE message) should not parse the payload.

AnomalRoil commented 3 years ago

To come back to my initial concern about the performance of the underlying crypto, and it being a bit rough if we don't have a notion of "session" or "shared secret" that can be re-used, here are a few links with data:

But CPU speed is always going to be a blocker on embedded devices with very low clock frequencies. For such devices, a "handshake" operation to establish a common key, followed by ongoing communication using that shared key with maybe a key rotation every now and then, is mandatory.

Regarding @dhh1128 question about smart cards, these are even more difficult to find data about... But here's something:

But keep in mind that IOT != Smart cards.

Smart card devices can have dedicated hardware acceleration for specific crypto schemes, since they are usually meant to do crypto securely... IOT devices are usually using generic low-power chips meant for embedded devices, which might have some crypto accelerator, but that's more the exception than the rule there.


Anyway, to come back to the initial topic I really have three concerns:

  1. we are requiring a lot of ECDH computations all the time, which seems wasteful.
  2. we don't have a pre-shared key setting or option allowing for ongoing sessions to re-use a past key at the cost of PFS from a message to another in that session.

These two points are going to mean that DIDComm is unfit for embedded devices and IOT devices running on low power chips.

  3. I find that P-384 is an odd choice, especially if we are to do so many ECDH operations all the time. P-256 made more sense with respect to performance. X25519 is fine, as it's one of the fastest schemes out there. From a security perspective I don't see any reason to have P-384: X25519 has a security level of ~127 bits against dlog attacks. So unless having P-384 was meant to increase the security level (which I didn't see mentioned anywhere), I don't really feel like it's "better" than P-256, especially since the latter has a lot of optimised implementations out there while P-384 doesn't.

baha-ai commented 3 years ago

Key agreement may be expensive, but it's needed to authenticate messages for recipients and avoids adding JWS to the payload.

TelegramSam commented 3 years ago

Another few general comments. This is pure personal opinion, and not something I claim is shared by the entire community.

DIDComm is not trivially a replacement for other transport protocols. DIDComm's foundation and application holds the potential for a dramatic adjustment of communication topology and the rebalance of power on the internet.

It is important to establish a strong and not overreaching foundation for this new technology. I fully expect success to drive improvements, including adaptations for IOT and improvements in efficiency. I'd love to have those problems.

The current overwhelming barrier to adoption is not having a finished spec.

Is encryption efficiency or IOT applicability an important issue? Yep. Those sound like perfect areas to focus on in the next version of DIDComm.

kdenhartog commented 3 years ago

Another few general comments. This is pure personal opinion, and not something I claim is shared by the entire community.

DIDComm is not trivially a replacement for other transport protocols. DIDComm's foundation and application holds the potential for a dramatic adjustment of communication topology and the rebalance of power on the internet.

It is important to establish a strong and not overreaching foundation for this new technology. I fully expect success to drive improvements, including adaptations for IOT and improvements in efficiency. I'd love to have those problems.

The current overwhelming barrier to adoption is not having a finished spec.

Is encryption efficiency or IOT applicability an important issue? Yep. Those sound like perfect areas to focus on in the next version of DIDComm.

That's essentially the takeaway I walked away with when I couldn't get consensus on this a few IIWs ago. This has always been an issue in the back of my mind, but I came to realize that, oh well, we can optimize as needed rather than try to fit everything in right away.

kdenhartog commented 3 years ago

DIDComm's multi-recipient feature is intended for a small number of recipients (e.g., 2-5). It is NOT intended for a group of 100. For that, we expect to use a hub-and-spoke topology (send a message to a hub; the hub resends to all the other parties). That is how most group chat technologies work.

It's worth noting this is why MLS (Messaging Layer Security) was so interesting to me. Its TreeKEM structure allows for large groups while maintaining perfect forward secrecy and post-compromise security. We could look to adapt these capabilities in future protocols as a DIDComm-Large-groups, where an RFC is defined to adapt the cryptographic primitives into the JWE structure, similar to what 1PU has done. Similar thinking has been considered for the Axolotl ratchet scheme from Signal. It's my opinion that it would be best for this group not to bring this into scope at this point, though, and to leave it for a later date.

andrewwhitehead commented 3 years ago

Agreed that secure communication among large groups should be out of scope for now. It requires additional protocols to perform key updates among the members, and is generally a complex problem with various performance trade-offs. If somebody needed to do this, then I think the best options would be to communicate using an existing secure group chat like Matrix (or something based on MLS), or to have a central authority distribute the messages to group members.