decentralized-identity / didcomm-messaging

https://identity.foundation/didcomm-messaging/spec/

Apache License 2.0


Sending Messages to DIDs vs to keys and endpoints contained in DID Documents #29

Closed TelegramSam closed 2 years ago

TelegramSam commented 4 years ago

My mental model of sending messages has been that they come from a key, and inbound messages are sent to a key (or several) at an endpoint. In a situation where I have an agent on my laptop, phone, and tablet, (each with different keys and possibly endpoints), how should messaging be framed? Do all messages get sent to all keys at all endpoints?

TelegramSam commented 4 years ago

@dhh1128 ^

dhh1128 commented 4 years ago

I think we need the nuance of network routing vs. cryptographic routing to answer this question wisely.

At the layer of the network, a message is sent from Alice to Bob. Where it originates inside Alice's domain is uninteresting; it is also uninteresting to the sender how it gets routed, network-wise, once it enters Bob's domain. This is analogous to our experience with physical mail. We put an address on the outer envelope and drop it in the mailbox. Eventually it arrives at Bob's mailbox. Whether Bob's roommate picks it up and drops it on Bob's bed, and whether it sits on his desk waiting to be sorted for a week after he gets it, and whether Bob actually opens it up while he's on a backpacking trip through the Rockies instead of opening it while he's sitting at the address we mailed it to, are uninteresting to us. And similarly, Bob has no interest whatsoever in where the letter was before Alice dropped it in the mail system.

That's the networking side of things.

Now, the crypto side of things is a bit different. Here, we have to make slightly different distinctions. There isn't a single key that represents Alice or Bob; instead, there are many keys for each of them. And these keys may have different permissions.

The sender of a message has to encrypt and/or sign with a specific key. This reveals which key is behind the emitted message, in a cryptographic sense, and thus what trust Bob should impute to it. However, it does NOT reveal which key actually sent the message physically on the network. For all we know, Alice's key 4 that encrypted the message actually did encryption only. Then Alice moved the message by thumb drive to the agent that uses key 5, and asked that agent to do the sending.

On the recipient's side, Alice can't know which device or agent in Bob's world is going to handle a message, physically. But she can plan for who's going to decrypt it. So, what's the most practical, useful assumption that Alice can make about the decryption part?

By default, Alice should assume that any agent in Bob's domain that has the plaintext privilege needs to be able to decrypt. This lets Bob handle one message from Alice on his iPhone, and the next message on his tablet. That's a Bob decision, not an Alice decision.

If you buy this, then the optimal behavior with respect to cryptographic (not network) routing is to multiplex encrypt for almost all of Bob's agents that Alice knows about.

Note the "almost." I am assuming that in Bob's DID doc, he has specified that his cloud agent doesn't have the plaintext privilege. This means that Alice doesn't encrypt in a way that Bob's cloud agent can decrypt. It is the lack of this privilege that causes Alice to wrap her multiplex-encrypted message to "Bob" inside a forward message prepared for Bob's cloud agent. If, on the other hand, Bob's cloud agent DOES have the plaintext privilege, then Alice encrypts it in such a way that the cloud agent can decrypt, and she no longer uses a forward message, because that agent is no longer a mediator from her perspective. If Bob wants to configure it to relay messages to his edges, he can do that, but Alice is uninterested. In other words, Alice doesn't really know that Bob has a "cloud agent" that is doing routing; what she knows is that Bob has declared a mediator that by definition lacks the plaintext privilege. All other knowledge about delivery is hidden from Alice, and configured when and how Bob likes. He can change his mind about that configuration even after Alice sends her message, and she doesn't need to care.
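A minimal sketch of the recipient-selection rule described above. The `Agent` type and its `plaintext` flag are hypothetical stand-ins for whatever a DID doc would use to express the plaintext privilege; nothing here is taken from a spec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    key_id: str
    plaintext: bool  # hypothetical flag: may this agent see decrypted content?


def plan_encryption(agents):
    """Split Bob's known agents into decryptors and forward-only mediators."""
    recipients = [a.key_id for a in agents if a.plaintext]
    mediators = [a.key_id for a in agents if not a.plaintext]
    return recipients, mediators


# Bob's cloud agent lacks the plaintext privilege, so Alice treats it as a mediator.
bob = [Agent("phone", True), Agent("tablet", True), Agent("cloud", False)]
recipients, mediators = plan_encryption(bob)
print(recipients)  # ['phone', 'tablet']
print(mediators)   # ['cloud']
```

If the cloud agent were granted the plaintext privilege, it would move into `recipients` and no forward wrapping would be needed from Alice's point of view.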

TelegramSam commented 4 years ago

> We put an address on the outer envelope and drop it in the mailbox. Eventually it arrives at Bob's mailbox.

What if Bob spends time at both his Mom's house and Dad's house? What if Bob sometimes prefers personal mail to be delivered to his office, instead of just his house? What if Bob is in the middle of moving from one house to another, and he needs deliveries to both, at least during the transition?

I think we need to discuss what (in practical terms) a domain is. If we require that a domain only has one endpoint (mailbox), then our discussion is quite different than a multi-endpoint domain.

On cryptographic routing: You propose that every message be encrypted to all 'trusted' keys (those with the plaintext privilege). This imposes a great deal of coordination between the agents within a domain on every inbound message. Some protocols may need this, but I suspect many will want to coordinate which agent is involved and leave the rest out until that flow is transferred to another agent.

Certainly, new communication (threads, if you will) might be best initiated by sending to all agents, and then only continuing with the one that responds.

Thoughts @dhh1128? Can we / should we accommodate both strategies?

dhh1128 commented 4 years ago

> What if Bob spends time at both his Mom's house and Dad's house? What if Bob sometimes prefers personal mail to be delivered to his office, instead of just his house? What if Bob is in the middle of moving from one house to another, and he needs deliveries to both, at least during the transition?

He can ask Alice to send (tee) to two endpoints, if we want to require that of senders. Or, more simply and more robustly, he can arrange for one of his endpoints to tee to the other, without involving Alice at all. That might be better. This is how multiple email addresses work today; we set up autoforward from one to the other. We also sometimes require the sender to email us at two locations, but that often causes problems due to the unpredictability of the sender's behavior.

> I think we need to discuss what (in practical terms) a domain is. If we require that a domain only has one endpoint (mailbox), then our discussion is quite different than a multi-endpoint domain.

Agreed that this is a good discussion. My own feeling is that a domain can have multiple endpoints that are distinguished by transport (here's how to send to me over bluetooth; here's how over http; here's how over smtp). And maybe a domain can also have multiple endpoints of the same transport type (send to A; if that fails, send to B). But I'm not sure how fancy we should get; there's a point of diminishing returns, which is quickly overwhelmed by the complexity we're imposing on senders.

> On cryptographic routing: You propose that every message be encrypted to all 'trusted' keys (those with the plaintext privilege). This imposes a great deal of coordination between the agents within a domain on every inbound message.

On the contrary; I think this imposes little burden at all. I suspect that although Bob has 5 devices, he only wants to interact on protocol X with one of those devices at a time. So that device can either A) be the only device that shows up and runs the pickup protocol to retrieve messages; or B) tell his router that it's currently active, and all the others are not. A is trivial and B is not much harder. Both of these choices can be made without telling Alice anything.
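Option B can be sketched roughly as follows. The `Router` class and its method names are hypothetical; the only assumption carried over from the discussion is that the message is already encrypted for every device, so the router can hand the same ciphertext to whichever device is active.

```python
class Router:
    """Hypothetical router inside Bob's domain; Alice never sees any of this."""

    def __init__(self, devices):
        self.queues = {name: [] for name in devices}
        self.active = None  # no device has claimed the flow yet

    def set_active(self, device):
        if device not in self.queues:
            raise KeyError(device)
        self.active = device

    def deliver(self, ciphertext):
        # With no active device, fan out to all; otherwise only to the active one.
        targets = [self.active] if self.active else list(self.queues)
        for name in targets:
            self.queues[name].append(ciphertext)
        return targets


router = Router(["phone", "tablet", "laptop"])
router.deliver("msg-1")      # nobody active yet: every queue gets a copy
router.set_active("tablet")
router.deliver("msg-2")      # only the tablet's queue receives this one
```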

TelegramSam commented 4 years ago

> Or, more simply and more robustly, he can arrange for one of his endpoints to tee to the other, without involving Alice at all. That might be better.

I agree this is a good pattern to follow, but it doesn't serve a few cases very well. If Bob is currently in the process of transitioning from one endpoint provider to another, he may have two for a time. It also doesn't serve Bob if he uses agents from different vendors, and each vendor provides its own endpoint.

> On the contrary; I think this imposes little burden at all. I suspect that although Bob has 5 devices, he only wants to interact on protocol X with one of those devices at a time. So that device can either A) be the only device that shows up and runs the pickup protocol to retrieve messages; or B) tell his router that it's currently active, and all the others are not. A is trivial and B is not much harder. Both of these choices can be made without telling Alice anything.

We have had no discussion yet about the pickup protocol being able to pick up messages from only one protocol (message types should still be encrypted), nor about informing a router which agents are active. All of those are great possibilities but are currently unexplored.

I suspect we are running into another hidden assumption: whether a domain is required to have a cloud agent, or whether a multi-agent domain can exist without one, simply receiving messages routed by mediators.

> ...the complexity we're imposing on senders.

This is certainly the issue we need to balance: What happens internal to a domain, and what happens outside the domain (impositions on senders) to support that.

TelegramSam commented 4 years ago

Daniel and I have been bantering here. I'm also curious about the thoughts of @swcurran, @tplooker, @OR13, and others.

OR13 commented 4 years ago

I'd prefer to: did:example:123#key-0...because a resolver will always return the did document for did:example:123... because fragments are ignored... the client is responsible for knowing how to get the key from the did document if the id is built with a fragment.
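A small sketch of the client-side dereferencing this implies, assuming a DID document shaped like DID Core's `verificationMethod` list. The `resolve` callable and the example document are hypothetical.

```python
def dereference(did_url, resolve):
    """Resolve the bare DID, then pick the key out of the document locally."""
    did, _, fragment = did_url.partition("#")
    doc = resolve(did)  # the resolver only ever sees the DID, not the fragment
    if not fragment:
        return doc
    for method in doc.get("verificationMethod", []):
        # Entries may carry an absolute id or a relative "#fragment" id.
        if method["id"] in (did_url, "#" + fragment):
            return method
    raise KeyError(did_url)


# Hypothetical in-memory "resolver" for illustration only.
DOCS = {
    "did:example:123": {
        "id": "did:example:123",
        "verificationMethod": [
            {"id": "did:example:123#key-0", "controller": "did:example:123"},
        ],
    }
}

key = dereference("did:example:123#key-0", DOCS.__getitem__)
print(key["id"])  # did:example:123#key-0
```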

swcurran commented 4 years ago

A couple of thoughts on reading this:

- Alice does what Bob tells her to do wrt where to send the message/multiplex, based on the DIDDoc she has from him. I think that largely handles the nuances in the above conversations (e.g. "Alice send to most of Bob's devices") -- Bob tells Alice where to send responses, period.
- If Bob provides multiple service elements, then I think one (or more?) must be marked as primary and Alice should always use that (those?) lacking any other information.
  - The service decorator, or its successor (based on last week's call), can override that default on a message-by-message basis. I think this handles the "which transport to use" case.
    - Likely we need a form of the service decorator that can be used to request the use of one of the non-primary service elements in the DIDDoc.
  - The service decorator can be used to provide an entirely new, temporary place to send a message.
  - Alternatively, for a permanent change, Bob can send a DIDDoc update to change the primary service element(s) or make other updates.
- If Bob wants multiple of his devices to receive the message, he puts multiple entries in the "recipientKeys" array, and the mediator immediately before them must know where to send the messages and sends the message to all the intended recipients.
  - In the special case of there being no routingKeys (e.g. no mediator), Alice has to send it to each of the recipientKeys, and she has to know which endpoint to send the message to for each of the entries in the array, which is tricky.
  - This feels really kludgy to me, but in theory, that (weird) use case could be handled by having multiple primary service elements, so that you wind up with one entry in the recipientKeys array for each, and each has its own endpoint. That's why I added the "one (or more?)" in the above about service elements.
  - **Better idea:** Another way to handle this is that Alice sends it to the one endpoint she knows, and whoever receives it is responsible for the delivery to the other recipients.

I think I like the "to" being the DID plus hash-mark fragment (did:peer:...#key1) vs. a naked key or a "did:key". If we fully implement "did:peer" (which I think we should), the use of "did:peer:....#key-1" is viable. Presumably this would also be the case for the routingKeys array, with the "did:peer..." referencing the DID between the mediator and the next hop in the routing.

Note that where we use naked keys in protocols -- e.g. if the service decorator survives in (more or less) its current form -- it would use a "did:key".

kdenhartog commented 4 years ago

To me this reads as a bit of a jumbled argument with little storytelling effect. Please excuse the inelegance of how I'm saying this and find the deeper meaning of what I'm saying.

I get the hunch that we're conflating cryptographic keys (e.g. numbers that go in mathematical functions) and identifiers, and this is what has made this discussion so hard to find a solution to. One of the traits that suggests we're doing this is that we speak about routing as if it's cryptographic, but routing doesn't need cryptography. With that in mind, I'm wondering what the specific benefits are of making an identifier (e.g. which node comes next in the path to traverse the graph) more constrained? In any case, the "keys" we're sending today really are just identifiers, because cryptography doesn't operate in base58. As such, the only difference between a base58-encoded key and did#key1 is the dereferencing process. In the base58 case I just have to decode locally, whereas in the DID identifier case I have to follow some specified method. In the case of did:key it is 4 extra steps: parse the first character of the method-specific identifier, parse the curve name, find the trailing bit, splice the rest off and decode it. In the case of many others it likely means making a network call.
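To make the comparison concrete, here is a rough sketch of local decoding in both cases. It assumes the did:key layout of a `z` multibase prefix (base58btc) followed by a multicodec prefix and the raw key, and it handles only the Ed25519 prefix bytes `0xed 0x01`. The helper names are made up for illustration, and the steps only loosely correspond to the four listed above.

```python
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
ED25519_PREFIX = b"\xed\x01"  # multicodec prefix for an Ed25519 public key


def b58encode(data: bytes) -> str:
    n = int.from_bytes(data, "big")
    out = ""
    while n:
        n, rem = divmod(n, 58)
        out = ALPHABET[rem] + out
    pad = len(data) - len(data.lstrip(b"\0"))  # leading zero bytes become '1's
    return "1" * pad + out


def b58decode(text: str) -> bytes:
    n = 0
    for ch in text:
        n = n * 58 + ALPHABET.index(ch)
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    pad = len(text) - len(text.lstrip("1"))
    return b"\0" * pad + body


def did_key_to_raw(did: str) -> bytes:
    """Decode a did:key locally: check multibase, decode, strip the codec prefix."""
    if not did.startswith("did:key:z"):
        raise ValueError("expected a base58btc ('z') did:key")
    method_id = did[len("did:key:"):].split("#")[0]
    decoded = b58decode(method_id[1:])  # drop the 'z' multibase character
    if not decoded.startswith(ED25519_PREFIX):
        raise ValueError("only Ed25519 keys are handled in this sketch")
    return decoded[len(ED25519_PREFIX):]


raw_key = bytes(range(32))  # placeholder bytes, not a real key
naked = b58encode(raw_key)  # the "base58 key" form
did = "did:key:z" + b58encode(ED25519_PREFIX + raw_key)
```

Both `b58decode(naked)` and `did_key_to_raw(did)` recover the same 32 bytes without a network call; the did:key path just adds the prefix checks.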

^ This is the point I forgot when we were on the call last week that Tobias and I discovered. @llorllale

swcurran commented 4 years ago

AFAIK, there are two uses for the key/identifier that we are using. If the agent is the recipient, it uses the identifier to get the private key to decrypt the message. The second is to route the message on, by looking somewhere to get the data needed to send the message. Either way, the use of the string is to find related data. The extra processing you mention is not really important vs. the time to look up, using the string, what you really need. Performance of that will then be about caching effectiveness.

TelegramSam commented 4 years ago

I've written up the following to organize my thoughts. Forgive the declarative nature of the writing: It is merely a proposal and of course open to feedback and criticism.

How to send a message

We explore Alice sending Bob a message. Bob has two listed DIDComm services:

S1

Endpoint: https://example.com/s1
routingkeys: [r1, r2]
recipientkeys: [k1, k2, k3]

S2

Endpoint: https://otherexample.com/s2
routingkeys: [r3]
recipientkeys: [k4, k5]

We ignore that Alice may have routing steps on her own that she wants to add prior to endpoint delivery. We also ignore any services not of the DIDComm type, whatever that ends up being.

Process to Determine Route

Is Alice replying to a message that contained a service decorator? -- Send to the info contained in that decorator
Is Alice replying to a message that indicated the reply-to key? -- Use that key to look up the service block with the indicated key, encrypt to that key, deliver to the endpoint using the routing keys
Else
- Deliver to all endpoints. For each endpoint, prepare a message encrypted to all recipients. The multi-encrypted message is delivered using the routing keys
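The decision process above could be sketched like this. The function name, its parameters, and the dictionary shape are assumptions made for illustration; the service values are the S1 and S2 examples from this comment.

```python
S1 = {"endpoint": "https://example.com/s1",
      "routingkeys": ["r1", "r2"], "recipientkeys": ["k1", "k2", "k3"]}
S2 = {"endpoint": "https://otherexample.com/s2",
      "routingkeys": ["r3"], "recipientkeys": ["k4", "k5"]}


def plan_route(services, reply_service=None, reply_to_key=None):
    """Return the service blocks Alice should deliver to, per the three rules."""
    # Rule 1: a service decorator on the message being answered wins outright.
    if reply_service is not None:
        return [reply_service]
    # Rule 2: a reply-to key narrows delivery to the block that lists that key.
    if reply_to_key is not None:
        for svc in services:
            if reply_to_key in svc["recipientkeys"]:
                return [dict(svc, recipientkeys=[reply_to_key])]
        raise LookupError(reply_to_key)
    # Rule 3: otherwise deliver to every endpoint, encrypted to all its recipients.
    return list(services)


print(plan_route([S1, S2], reply_to_key="k4"))
# [{'endpoint': 'https://otherexample.com/s2', 'routingkeys': ['r3'], 'recipientkeys': ['k4']}]
```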

Strategies this allows

Forwarding messages within a domain
Using a cloud agent for final transmission to endpoints
Asking Sender to send to a specific place
Expecting sender to send to all endpoints and keys
Only using a single mobile agent and mediator without cloudagent

Domain Examples

Mobile with mediator
Mobile and laptop with same mediator
mobile and laptop with different mediators
mobile cloudagent with mediator
mobile laptop cloudagent with mediator

dhh1128 commented 4 years ago

It seems like we are still conflating the cryptographic routing with the network routing. These are two separate problems. They do interrelate, but unless/until we pick them apart, this will continue to feel mysterious. Once we separate them, the principles allow lots of smart variation without complex rules.

The confusion

The cryptographic route determines who can see which messages in plaintext. The thing we're calling recipientKeys in the service decorator maps to this only in the clumsiest way, because it creates an artificial distinction between seeing forward messages in plaintext, and seeing other application-level messages in plaintext.

The network route determines who actually gets any form of payload (encrypted or plaintext, forward or otherwise). A lot of a network route is invisible, from any given perspective, and is dynamic -- but I see us wanting to mix it in and either spell out too much of it, or find magic assumptions that keep it rational when it's not explicit. I don't think we'll succeed.

The thing we're calling routingKeys in the service decorator sort of maps to the network routing construct, but the label encourages a faulty mental model, because it equates routingKeys with routing. In fact, much routing happens without any of these keys. (Even mediation could happen with keys not listed here, but it would be internal mediation, not visible to the sender. The only kind of mediation we need the sender to know about is external. Contrast scenarios 6 and 7 here.)

When Alice talks to Bob and external mediators are involved, she's actually talking 2 different protocols simultaneously (sort of like TCP and IP). One is the mediation protocol; the other is the higher-level application protocol that the mediated content carries. The recipient of the mediation protocol payloads is the mediator chain, not Bob's edge. The recipient of the higher-level protocol payloads is Bob's edge, not Bob's mediators.

The principles

The portion of either route that precedes the message's arrival at the recipient's first mediator is uninteresting. The sender and/or other parties arrange it per circumstance and preference.

A recipient can have any number of external mediators: 0, 1, or 2+. (Plus they can have internal mediators, but we can safely ignore them.) However many external mediators they have, they must declare the keys for these mediators so the sender encrypts+wraps-with-forward for each one. A name that correctly sets the mental model for these types of keys would be something like externalMediatorKeys. Calling them routingKeys is horribly confusing, because parties that possess keys and help with routing may not be listed, and parties that don't possess keys but that also help with routing are also not listed.

The innermost layer of the onion is the encrypted application-level message that's NOT forward-wrapped (the payload for the other, higher-level protocol); this should (almost) always be multiplex-encrypted for all of the recipient's agents that have the plaintext privilege and that are known in the relationship. It should be pretty rare to ever override this set of target keys in a stable relationship. In an ephemeral interaction, listing these keys is reasonable, and a name that sets the correct mental model for these keys would be finalKeys.

External mediator keys get forward messages. Final keys get the message encrypted for them, and are trusted to see what it says in plaintext.
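As a rough illustration of that onion, here is a sketch in which "encryption" is only a labelled dictionary (no real cryptography). It assumes the mediator keys are listed in the order the message reaches them, so the first key ends up as the outermost layer; that ordering and the field names are assumptions, not something settled in this thread.

```python
def pack(message, final_keys, external_mediator_keys):
    """Build the layered envelope: inner app message, then one forward per mediator."""
    # Innermost layer: multiplex-encrypted for every key with the plaintext privilege.
    envelope = {"encrypted_for": list(final_keys), "payload": message}
    # Wrap from the last mediator outward so the first mediator's layer is outermost.
    for key in reversed(external_mediator_keys):
        envelope = {
            "encrypted_for": [key],
            "payload": {"type": "forward", "msg": envelope},
        }
    return envelope


onion = pack("hello Bob", final_keys=["k1", "k2"], external_mediator_keys=["r1", "r2"])
print(onion["encrypted_for"])  # ['r1']
```

Each mediator can open only its own forward layer; only the final keys can read `"hello Bob"`.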

If this doesn't seem to hang together, let's discuss in an interactive call.

TelegramSam commented 4 years ago

I'm happy to rename routingKeys to externalMediatorKeys. I think recipientKeys is more descriptive than finalKeys, but I don't deeply care.

I think the fundamental point of disagreement is which keys are used in the multiplex-encryption: I propose that we allow the recipient to ask the sender to send to a specific 'key' (or keys), and it appears you wish to disallow that behavior. (That behavior is outlined in the middle bullet of my process comment above.)

swcurran commented 4 years ago

I'm lost on what we're confused about :-).

We seem to be all in deep agreement on the flow and purpose of the chunks of data. I'm not fussed on the names and would tend to keep the ones we have because nothing is going to be clear by just looking at the names.

Where are we on the question of the form of what is to go into the two *Keys arrays?

dhh1128 commented 4 years ago

> I'm lost on what we're confused about

I feel that what we're confused about is the reasonable duties of a sender, and the corresponding amount of knowledge we must expose to help the sender in those duties. When we use phrases like "the route" and "the message is sent" or "the message is delivered" without being clear about whether we're talking about a cryptographic or a network route, I think we're making things muddy.

> it appears you wish to disallow that behavior

Not quite that strong. I wish to strongly discourage it (except in the ephemeral case) because it normalizes two negative things: A) the sender knowing too much; and B) the sender having too much responsibility.

> nothing is going to be clear by just looking at the names

Good names won't solve every challenge with communicating the subtleties of the routing model. No pushback there. But bad names can definitely make things worse. I am not arguing strenuously in favor of the names I threw out; I am arguing strenuously against our current names, because they put people's heads in the wrong place.

dhh1128 commented 4 years ago

To be clear, the reason I'm so strong on this point--that we're asking the sender to know too much--is because it's directly related to the title of the issue. The question the issue asked was, "Do we send messages to DIDs or to keys?" And I am claiming that in an established relationship, the strongly preferred default ought to be to send to DIDs. We need the ability to send to keys in ephemeral mode, and maybe in some rare corner cases with an established connection -- but sending to keys for established connections feels like really bad design to me.

I brought up the confusion thing because I think people have come to the opposite conclusion (sending to keys is totally fine) due to a lack of clarity about whether a route is cryptographic, network, or both.

kdenhartog commented 4 years ago

I agree that it makes sense at a routing layer to "send to DIDs". For me, I conflate "send to keys" because the sender has to be aware of which keys belong to the did I'm sending to (e.g. the cryptographic layer), at which point if I want to do some level of exclusion on a per device basis (which are represented as keys), then "send to a DID" seems like a conflation because we actually are "sending to keys" at that point. However, consideration of this thinking (excluding certain devices/keys in a did) would probably be useful to exclude initially. I think this granularity will absolutely be necessary in due time though to build "principle of least authority" in did documents.

On a side note, this discussion is definitely getting a bit tricky over github comments. I'd be in favor of trying to tease this out over a call because this is one of the major TBD issues in my mind that I still can't wrap my head around.

TelegramSam commented 4 years ago

We discussed this issue on Monday, 18 Feb 2020 in the DIDComm WG Call. Revised notes to follow on this issue.

TelegramSam commented 4 years ago

How to send a message

We explore Alice sending Bob a message. Bob has two listed DIDComm services:

S1

Endpoint: https://example.com/s1
routingkeys: [r1, r2]
recipientkeys: [k1, k2, k3]

S2

Endpoint: https://otherexample.com/s2
routingkeys: [r3]
recipientkeys: [k4, k5]

We ignore that Alice may have routing steps on her own that she wants to add.

General Principles of Sender

Encrypt to all recipient keys
Indicate attention keys if requested by recipient
Transmit to endpoint

General Principles of Recipient

Coordinate between agents and/or mediator which agent should handle the message
Indicate which keys should be marked as attention keys if desired

Process to Determine Route

Is Alice replying to a message that contained a service decorator? -- Send to info in that decorator
Is Alice replying to a message that indicates attention keys? -- Encrypt to all, mark with attention info, transmit.
Else: -- Encrypt to all, deliver to all.
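A sketch of how the revised rules differ from the earlier proposal: the sender always encrypts to every recipient key and only marks attention keys instead of narrowing the recipients. The function name, parameters, and plan shape are hypothetical; S1 and S2 are the example services from this comment.

```python
S1 = {"endpoint": "https://example.com/s1",
      "routingkeys": ["r1", "r2"], "recipientkeys": ["k1", "k2", "k3"]}
S2 = {"endpoint": "https://otherexample.com/s2",
      "routingkeys": ["r3"], "recipientkeys": ["k4", "k5"]}


def plan_send(services, reply_service=None, attention_keys=()):
    """Return one delivery plan per endpoint under the revised principles."""
    # A service decorator on the message being answered replaces the listed services.
    if reply_service is not None:
        services = [reply_service]
    plans = []
    for svc in services:
        keys = list(svc["recipientkeys"])
        plans.append({
            "endpoint": svc["endpoint"],
            "encrypt_to": keys,  # always every recipient key
            "attention": [k for k in keys if k in attention_keys],  # a hint only
        })
    return plans


plans = plan_send([S1, S2], attention_keys=["k2"])
print(plans[0]["attention"])  # ['k2']
```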

Strategies this allows

Forwarding messages within a domain
Using a cloud agent for final transmission to endpoints
Asking Sender to send to a specific place
Expecting sender to send to all keys
Only using mediator without cloudagent

Domain Type Examples

Mobile with mediator
Mobile and laptop with same mediator
mobile and laptop with different mediators
mobile cloudagent with mediator
mobile laptop cloudagent with mediator

TelegramSam commented 2 years ago

I believe this issue has been resolved, but not recorded in this issue.

We are using DIDs, not keys, for all relevant parts of this problem. The keys contained within the referenced DID Document are used for the relevant cryptography.

Peer DIDs (method 2) may be used when a key and endpoint need to be communicated without recording them on a ledger.

I feel comfortable closing this issue.

TelegramSam commented 2 years ago

Closed after discussion in WG 20220110.