decentralized-identity / keri

Key Event Receipt Infrastructure - the spec and implementation of the KERI protocol
Apache License 2.0

Revised KSN key state notification message #130

Open SmithSamuelM opened 3 years ago

SmithSamuelM commented 3 years ago

While implementing the "ksn" key state notification message, I found that not having the "s" sequence number field at the top level was jarring and confusing, because every other message type does have it at the top level. When leveraging code for creating messages, or when reasoning about the code, it kept tripping me up not to find it there. Several fields at the top level of the key state message are also at the top level in other messages, but not the "s" field. This lack of consistency also bubbles up into the parametrization of utility functions, etc.

I don't think any other implementation has implemented the key state message, so I don't think the change would affect anything but the Python implementation. Fixing it is worth the improved consistency with other messages and the improved coherence that results when reasoning about the code. It also moves three fields that are essentially used to compare key state with key events in a log back to the top level instead of nesting them, which simplifies the comparison code.

I understand that we added a nested dict to hold fields from the latest current event in the key state. The reason for adding that dict was that the "t" field for the latest current event type ("icp" or "rot") would conflict with the "t" field whose value is "ksn" in the key state message. Rather than change the label we added a nested dict with the "e" label, but we also moved two other fields into "e" that didn't need to be moved. These are the "s" and "d" fields, which are unique at the top level and consistent with other messages that have them at the top level. The problem was that a nested block holding only one field, the current event "t" field, seemed out of place, so two other fields were moved into it instead of merely adding a new unique label. Given my experience implementing it, I now think that was not the best choice. After doing the implementation it became apparent to me that adding a new label would have been simpler, less confusing, and easier to implement and reason about.

The proposal is as follows:

Remove the "e" block; move the "s" and "d" fields from the "e" block to the top level; and add a new top-level field with the label "te" for the message type of the latest current event.

The proposed revised key state message looks like this.


{
  "v": "KERI10JSON00011c_",
  "i": "EaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM",
  "s": "2",
  "t": "ksn",
  "d": "EAoTNZH3ULvaU6JR2nmwyYAfSVPzhzZ-i0d8JZS6b5CM",
  "te": "rot",
  "kt": "1",
  "k": ["DaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM"],
  "n": "EZ-i0d8JZAoTNZH3ULvaU6JR2nmwyYAfSVPzhzS6b5CM",
  "wt": "1",
  "w": ["DnmwyYAfSVPzhzS6b5CMZ-i0d8JZAoTNZH3ULvaU6JR2"],
  "c": ["eo"],
  "ee":
    {
      "s":  "1",
      "d":  "EAoTNZH3ULvaU6JR2nmwyYAfSVPzhzZ-i0d8JZS6b5CM",
      "wr": ["Dd8JZAoTNZH3ULvaU6JR2nmwyYAfSVPzhzS6b5CMZ-i0"],
      "wa": ["DnmwyYAfSVPzhzS6b5CMZ-i0d8JZAoTNZH3ULvaU6JR2"]
    },
  "di": "EJZAoTNZH3ULvYAfSVPzhzS6b5CMaU6JR2nmwyZ-i0d8",
  "a":
    {
      "i":  "EJZAoTNZH3ULvYAfSVPzhzS6b5aU6JR2nmwyZ-i0d8CM",
      "s":  "1",
      "d":  "EULvaU6JR2nmwyAoTNZH3YAfSVPzhzZ-i0d8JZS6b5CM"
    }
}

Instead of the current:


{
  "v": "KERI10JSON00011c_",
  "i": "EaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM",
  "t": "ksn",
  "kt": "1",
  "k": ["DaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM"],
  "n": "EZ-i0d8JZAoTNZH3ULvaU6JR2nmwyYAfSVPzhzS6b5CM",
  "wt": "1",
  "w": ["DnmwyYAfSVPzhzS6b5CMZ-i0d8JZAoTNZH3ULvaU6JR2"],
  "c": ["eo"],
  "e":
    {
      "s": "2",
      "t": "rot",
      "d": "EAoTNZH3ULvaU6JR2nmwyYAfSVPzhzZ-i0d8JZS6b5CM"
    },
  "ee":
    {
      "s":  "1",
      "d":  "EAoTNZH3ULvaU6JR2nmwyYAfSVPzhzZ-i0d8JZS6b5CM",
      "wr": ["Dd8JZAoTNZH3ULvaU6JR2nmwyYAfSVPzhzS6b5CMZ-i0"],
      "wa": ["DnmwyYAfSVPzhzS6b5CMZ-i0d8JZAoTNZH3ULvaU6JR2"]
    },
  "di": "EJZAoTNZH3ULvYAfSVPzhzS6b5CMaU6JR2nmwyZ-i0d8",
  "a":
    {
      "i":  "EJZAoTNZH3ULvYAfSVPzhzS6b5aU6JR2nmwyZ-i0d8CM",
      "s":  "1",
      "d":  "EULvaU6JR2nmwyAoTNZH3YAfSVPzhzZ-i0d8JZS6b5CM"
    }
}

pfeairheller commented 3 years ago

In kerigo we implemented an internal representation of key state and used the existing Event struct as its representation, also adding a new field to it latestEvent that was itself an Event to capture the last event for the current state. We haven't gone as far as implementing key state as a message though.

Using that structure in Go made it easy to reason about but I can see the challenge of turning that into a concise message. For example, we were duplicating s and d at both the top level and in latestEvent.

I tend to agree that having those two fields at the top level of the ksn message is a "Good Thing" and for the sake of avoiding duplication we should remove them and then as a result flatten out e into the single value of the latest event type. My initial reaction was le for "latest event", but I never get too fussed over names and te works just as well.
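As a rough illustration of the flattening discussed above, the following Python sketch converts a ksn dict that still nests s, t and d under "e" into the proposed flat layout with a "te" label. The function name flatten_ksn is hypothetical and is not taken from any KERI library. The sketch also assumes the caller recomputes the size encoded in the "v" version string afterwards, which it does not attempt.

```python
def flatten_ksn(old: dict) -> dict:
    """Move "s" and "d" out of the nested "e" block and expose e["t"] as "te".

    Hypothetical helper for illustration only. Field order follows the
    proposed message: v, i, s, t, d, te, then the remaining fields.
    The "v" version string is copied unchanged (its size is not recomputed).
    """
    latest = old["e"]
    new = {}
    for label, value in old.items():
        if label == "e":
            continue  # the nested block disappears entirely
        if label == "t":
            new["s"] = latest["s"]   # sequence number of latest event
            new["t"] = value         # message type, still "ksn"
            new["d"] = latest["d"]   # digest of latest event
            new["te"] = latest["t"]  # type of latest event, e.g. "rot"
        else:
            new[label] = value
    return new
```

Because Python dicts preserve insertion order, serializing the result keeps "s", "t", "d" and "te" next to each other near the top of the message.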

SmithSamuelM commented 3 years ago

Eventually we will be able to further simplify the key state notification by removing the a field, which contains an event seal, when the endorser has a transferable identifier. This requires adding a new attachment group counter type for a complex group: a counter for one or more complex groups, where each complex group starts with a triple (i, s, d) that serves as the seal, followed by a counted group of indexed signatures. This is a ToDo item that was low priority for replay because of how trans receipts are escrowed, and for key state as well because I can leverage existing receipting code to implement key state notification as is. But given that key state is still new code, there is more motivation to implement the new complex attachment group sooner rather than later. This new group would also make replay more compact and eventually allow us to get rid of the VRC receipts (see the related issue #123).

SmithSamuelM commented 3 years ago

After thinking about the anticipated effort to support key state notification requests, I think it's worth the effort to add the new complex attachment group.

The new attachment group is a counter with code -F##, where ## is replaced by the two-character Base64 count. The hard size (stable part) of the code is -F.

TransIndexedSigGroups: str = '-F' # Composed Base64 Triple, pre+snu+dig+ControllerIdxSigs group.

Example attachment of one group (annotating comments, spaces, and line feeds inserted for clarity):

-FAB     # Trans Indexed Sig Groups counter code 1 following group
E_T2_p83_gRSuAYvGhqV3S0JzYEF2dIa-OCPLbIhBO7Y    # trans prefix of signer for sigs
-EAB0AAAAAAAAAAAAAAAAAAAAAAB    # sequence number of est event of signer's public keys for sigs
EwmQtlcszNoEIDfqD-Zih3N6o5B3humRKvBBln2juTEM      # digest of est event of signer's public keys for sigs
-AAD     # Controller Indexed Sigs counter code 3 following sigs
AA5267UlFg1jHee4Dauht77SzGl8WUC_0oimYG5If3SdIOSzWM8Qs9SFajAilQcozXJVnbkY5stG_K4NbKdNB4AQ         # sig 0
ABBgeqntZW3Gu4HL0h3odYz6LaZ_SMfmITL-Btoq_7OZFe3L16jmOe49Ur108wH7mnBaq2E_0U0N0c5vgrJtDpAQ    # sig 1
ACTD7NDX93ZGTkZBBuSeSGsAQ7u0hngpNTZTK_Um7rUZGnLRNJvo5oOnnC1J2iBQHuxoq8PyjdT3BHS2LiPrs2Cg  # sig 2

Given this attachment group we can simplify the ksn message by removing the seal for endorsers with transferable identifiers. The seal equivalent is provided in the pre+snu+dig or (i,s,d) triple in the front of the new complex attachment group.
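To make the layout of the group concrete, here is a minimal Python parsing sketch. It is not the keripy parser. The function names and the fixed field widths (44 characters for prefix and digest, 24 for the sequence number, 88 per indexed signature) are assumptions chosen for illustration, and the sketch works on a stream with the annotations and whitespace already stripped.

```python
B64_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"

# Assumed fixed widths in Base64 characters (illustrative, not normative).
PRE_SIZE, SNU_SIZE, DIG_SIZE, SIG_SIZE = 44, 24, 44, 88


def b64_to_int(text: str) -> int:
    """Decode a URL-safe Base64 count such as "AB" (1) or "AD" (3)."""
    value = 0
    for ch in text:
        value = value * 64 + B64_CHARS.index(ch)
    return value


def parse_trans_idx_sig_groups(stream: str) -> list:
    """Split a -F## attachment into its (i, s, d) triples and signatures."""
    if not stream.startswith("-F"):
        raise ValueError("expected -F counter code")
    group_count = b64_to_int(stream[2:4])
    pos = 4
    groups = []
    for _ in range(group_count):
        pre = stream[pos:pos + PRE_SIZE]
        pos += PRE_SIZE
        snu = stream[pos:pos + SNU_SIZE]
        pos += SNU_SIZE
        dig = stream[pos:pos + DIG_SIZE]
        pos += DIG_SIZE
        if stream[pos:pos + 2] != "-A":
            raise ValueError("expected -A controller indexed sigs counter")
        sig_count = b64_to_int(stream[pos + 2:pos + 4])
        pos += 4
        sigs = [stream[pos + n * SIG_SIZE:pos + (n + 1) * SIG_SIZE]
                for n in range(sig_count)]
        pos += sig_count * SIG_SIZE
        if pos > len(stream):
            raise ValueError("attachment stream is truncated")
        groups.append({"i": pre, "s": snu, "d": dig, "sigs": sigs})
    return groups
```

Each returned group carries the (i, s, d) triple that stands in for the seal, which is why the a field can be dropped from the message itself.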

The revised ksn appears as follows:

{
  "v": "KERI10JSON00011c_",
  "i": "EaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM",
  "s": "2",
  "t": "ksn",
  "d": "EAoTNZH3ULvaU6JR2nmwyYAfSVPzhzZ-i0d8JZS6b5CM",
  "te": "rot",
  "kt": "1",
  "k": ["DaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM"],
  "n": "EZ-i0d8JZAoTNZH3ULvaU6JR2nmwyYAfSVPzhzS6b5CM",
  "wt": "1",
  "w": ["DnmwyYAfSVPzhzS6b5CMZ-i0d8JZAoTNZH3ULvaU6JR2"],
  "c": ["eo"],
  "ee":
    {
      "s":  "1",
      "d":  "EAoTNZH3ULvaU6JR2nmwyYAfSVPzhzZ-i0d8JZS6b5CM",
      "wr": ["Dd8JZAoTNZH3ULvaU6JR2nmwyYAfSVPzhzS6b5CMZ-i0"],
      "wa": ["DnmwyYAfSVPzhzS6b5CMZ-i0d8JZAoTNZH3ULvaU6JR2"]
    },
  "di": "EJZAoTNZH3ULvYAfSVPzhzS6b5CMaU6JR2nmwyZ-i0d8"
}

pfeairheller commented 3 years ago

Adding the new complex attachment group now and removing the a field now seems like the right choice.

I assume the demise of the VRC receipt will wait for another time?

SmithSamuelM commented 3 years ago

@pfeairheller

"I assume the demise of the VRC receipt will wait for another time?"

Yes, it would break a lot of code and interoperability to kill VRC just now. It can wait until it's more timely. But because the ksn is all new code, it's timely to add the new attachment support for it now rather than later.

SmithSamuelM commented 3 years ago

Given what appears to be consensus I checked in the python code with the latest revised ksn and updated the code table.

If there are no objections I will update KID001 and KID003 to reflect.

chunningham commented 3 years ago

Strong agree with reorganizing the e block into the top level, it's quite intuitive (the e block already was "top-level" info about the KEL). The only weirdness is the te, but I think that's much, much better than a nested dict just for a t. Is there actually a need for the event type te to be included if the digest and sn are also included (any verifier can look up the event and find the t, and if they can't then what's the difference)? I like the complex attachment group as a replacement for a. It imposes a more complicated suffix parsing logic, but it was also kind of strange to have a included in the state message itself.

SmithSamuelM commented 3 years ago

I am not absolutely sure we need the te in the key state notice but I am not absolutely sure we don't either. The commitment an endorser makes to the d digest of the event seems to cover it in terms of security but maybe there is another reason to keep it. A hard requirement of looking up the event in the database to know what type of event it was may obviate some use cases of key state. But maybe we can delete it. Need to think about it some more.

SmithSamuelM commented 3 years ago

After some thought I believe that, at least given our current set of event messages (icp, rot, dip, drt, ixn), we do not need the te field because we can unambiguously infer its value from the other fields in the ksn. This means that as long as we don't change the set of event messages we won't ever need it. But if we do change the set and can no longer unambiguously infer it, and there are use cases that benefit from not having to look up or hold the event itself in the database, then we may have to add it back.

Please confirm my logic.

We have s and d of the event, and ee.s and ee.d of the latest establishment event.
If d != ee.d then the event is not an establishment event and te == ixn
Else (d == ee.d) the event is an establishment event
      If s == 0 then te == icp or dip
            If di is empty then te == icp  (not delegated)
            Else te == dip  (delegated)
      Else te == rot or drt
            If di is empty then te == rot  (not delegated)
            Else te == drt  (delegated)
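
The same decision tree can be written as a small Python function. This is only a sketch of the logic above: the name infer_event_type is hypothetical, the ksn is assumed to be an already parsed dict in the flat layout, and "s" is assumed to be a hex string.

```python
def infer_event_type(ksn: dict) -> str:
    """Infer the latest event's type from s, d, ee.d and di.

    Hypothetical helper. Only valid while the set of event types is
    exactly icp, rot, dip, drt and ixn.
    """
    delegated = bool(ksn.get("di"))  # empty or missing di means not delegated
    if ksn["d"] != ksn["ee"]["d"]:
        return "ixn"  # latest event is not the latest establishment event
    if int(ksn["s"], 16) == 0:
        return "dip" if delegated else "icp"
    return "drt" if delegated else "rot"
```

If a second non-establishment event type were ever added, the first branch would no longer be unambiguous, which is the future-proofing concern raised above.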
pfeairheller commented 3 years ago

That logic looks sound to me.

SmithSamuelM commented 3 years ago

Thinking about actually using the key state notice when the event is delegated: not having the full delegation location seal in the key state means that one MUST look up the delegated inception event in order to look up the delegating event. But I am not sure that having the seal buys us anything without also doing lookups, so we will likely just leave it be until we get more experience with real-world delegation.

SmithSamuelM commented 3 years ago

With regard to the specific label for the message type of the latest event in the ksn: in the KERI compact labels we tried to find one-letter labels for all the fields but were unsuccessful. We then used two-letter labels, typically by adding a modifier letter to a value type letter. When I modify a value type in isolation I usually prefer to put the modifier after the value type so that the value type is emphasized visually. I was torn between te for message type of event and et for event message type. The former emphasizes that it's a message type and the latter that it belongs to an event. However, after looking more closely at how we have used two-letter labels, it appears that one consideration was that the modifier groups two or more labels together. When grouping, it makes more sense to put the modifier before the value type so that visually the labels belong to the same group by starting with the same letter. This is true of w and wt, and of k and kt. The other place we sort of do the grouping is not for an actual group in one message but for a potential group, or a group of labels from different messages that all belong together. This is true of da and di. The two never appear together in the same message, but both use d as the first-character modifier and the second character is the value type.

In the case of et vs te there is no group; no other field is grouped with te/et. But if one considers that there might be a potential group in the future, or another message that might form a cross-message group, then the convention of putting the group modifier first means that et is more consistent than te. Given that the t in et was originally nested in the e labeled block, it also feels like a way of flattening with an abbreviation, that is, e.t becomes et when moved to the top level. As a result I revise my original field name from te to et.

SmithSamuelM commented 3 years ago

Revised Key State Notice with f, dt and p fields

One helpful approach to refining the ksn is to find a seminal use case and see if the ksn fields satisfy that use case well. This was motivated by trying to answer the question of whether we need the et field in the key state at all. As reasoned above, given that there is only one non-establishment event message type we can infer the message type of the latest event, but that's not a future-proof solution. Given that we inadvertently dropped the a field from the inception event (see issue #131) in an attempt to simplify, but now have to incur the cost of adding it back because removing it lost us a seminal class of use cases, I thought it worthwhile to review seminal use cases for the key state notice message that would help make the decision. As a result of this review I did construct a seminal use case that also motivated adding three additional fields to the key state notice, namely the f, dt, and p fields for first seen ordinal number, first seen datetime, and prior event digest respectively.

Seminal Use Case for Key State Notice, ksn

I could think of many use cases for the key state notice but not all of them are seminal and not all of them require all the fields. The primary use case among this class is just to bootstrap discovery which eventually results in downloading the KEL and then verifying key state from the KEL.

The seminal use case is when a validator is using one or more pools or sets of watchers (and witnesses) to determine if the key state for a given identifier is consistent across those pools. A Judge is a type of validator, or a service acting on behalf of a validator, that performs this role or function. So the seminal use case could be labeled the Judge use case, but some Jurors, as well as some Wallets, may act similarly. All are examples of Validators of indirect mode identifiers that need to do more than act as a simple watcher. These roles are similar when their validation logic depends on observing the key state of one or more pools of watchers (and in some cases witnesses). A primary use case is to reconcile duplicity, both on the part of witnesses and on the part of watchers, by observing key state amongst the members of the pool or pools.

For a Judge to evaluate the consensus of key state from the global watcher network for KELs that its own watcher network has not yet downloaded, and so determine which version to download (so as not to pollute its own watchers), the Judge must be able to unambiguously order key state notifications from global watchers. Any KEL that has experienced one or more signing key compromises with associated recovery rotations may result in a watcher producing key state notifications that a Judge cannot order with only the sequence number and digest in the key state notification message, without also having the associated KEL. This is because key state notifications may arrive out of order asynchronously, and a recovery forks the KEL so that more than one event may have the same sequence number. Multiple recoveries may each produce a different version of the event with the same sequence number. The sn and digest of the event alone are insufficient to determine ordering without the KEL itself. The simplest solution to this problem is to include the first seen ordinal number in the key state notice. Then the Judge may unambiguously order all key state notices from a given watcher by that watcher's first seen ordinal. The proposal is to use the f label for this field.
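
A minimal Python sketch of that ordering step follows. The function name, the (watcher id, ksn dict) input shape, and the treatment of "f" as a hex string are all assumptions made for illustration.

```python
from collections import defaultdict


def order_by_first_seen(notices):
    """Group notices by watcher and sort each group by first seen ordinal.

    `notices` is an iterable of (watcher_id, ksn) pairs, where each ksn is
    a parsed dict. Assumes "f" is a hex string, like "s".
    """
    per_watcher = defaultdict(list)
    for watcher_id, ksn in notices:
        per_watcher[watcher_id].append(ksn)
    return {
        watcher_id: sorted(items, key=lambda ksn: int(ksn["f"], 16))
        for watcher_id, items in per_watcher.items()
    }
```

Two notices from the same watcher that share a sequence number but differ in digest, as can happen after a recovery, still sort unambiguously because their f values differ.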

In the case of a recovery not all watchers will have the same first seen set of events. They may all eventually end up with the same post recovery set of authoritative events, but if a Judge wants to compare key state from multiple watchers using only the key state notices without downloading the actual KELs then including the prior event digest in the key state allows the Judge to see which watchers missed which disputed events and confirm their ordering. It is proposed to include the prior event digest in the p field. Finally for future proofing and convenience it is proposed to leave in the et field.

Moreover, one evaluation mechanism for reconciling duplicity is to look at the datetime at which a given watcher first saw an event. Given a pool of watchers that a Judge more or less trusts and that are in consensus about the datetime of a given event, the Judge may compare it to another pool of global watchers with a different first seen datetime. Assuming comparable levels of trust, the earlier datetime is most likely to be the true authoritative datetime. New but honest watchers may have first seen a compromised version of an event, and their KELs are thereby corrupted. Because key compromise usually happens some time later, any such honest watcher will report later datetimes. It is proposed to include this first seen datetime as the dt field in the key state.
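
The comparison could be sketched in Python as below. This only illustrates the "earlier first seen wins" heuristic under the stated assumption of comparable trust. The function name and the mapping from pool name to an agreed ksn dict are hypothetical.

```python
from datetime import datetime


def earliest_first_seen(pool_notices: dict) -> str:
    """Return the name of the pool whose agreed notice has the earliest dt.

    `pool_notices` maps a pool name to the ksn dict that pool's watchers
    agree on. Assumes "dt" is an ISO 8601 string with a UTC offset, as in
    "2020-08-22T20:35:06.687702+00:00".
    """
    return min(
        pool_notices,
        key=lambda name: datetime.fromisoformat(pool_notices[name]["dt"]),
    )
```

Parsing into timezone-aware datetime objects rather than comparing raw strings keeps the comparison correct when pools report different UTC offsets.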

Recall that the seminal use case is to evaluate or appraise the consensus key state of the global watcher network prior to downloading the KEL, in order to determine the best source or sources for downloading the KEL. This is a performance optimization that saves multiple requests to watchers when all the information is held in the key state notice. Hence making the key state a little bigger is better than having to make additional requests for events from the watcher's KEL, where each additional request would contain mostly duplicate information. So adding these fields to the key state will improve net performance, given the need for the Judge to unambiguously order key state notices for its more-or-less trusted watchers from the global watcher network.

Revised Key State Notice Message

{
  "v": "KERI10JSON00011c_",
  "i": "EaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM",
  "s": "2",
  "t": "ksn",
  "p": "EYAfSVPzhzZ-i0d8JZS6b5CMAoTNZH3ULvaU6JR2nmwy",
  "d": "EAoTNZH3ULvaU6JR2nmwyYAfSVPzhzZ-i0d8JZS6b5CM",
  "f": "3",
  "dt": "2020-08-22T20:35:06.687702+00:00",
  "et": "rot",
  "kt": "1",
  "k": ["DaU6JR2nmwyZ-i0d8JZAoTNZH3ULvYAfSVPzhzS6b5CM"],
  "n": "EZ-i0d8JZAoTNZH3ULvaU6JR2nmwyYAfSVPzhzS6b5CM",
  "wt": "1",
  "w": ["DnmwyYAfSVPzhzS6b5CMZ-i0d8JZAoTNZH3ULvaU6JR2"],
  "c": ["eo"],
  "ee":
    {
      "s":  "1",
      "d":  "EAoTNZH3ULvaU6JR2nmwyYAfSVPzhzZ-i0d8JZS6b5CM",
      "wr": ["Dd8JZAoTNZH3ULvaU6JR2nmwyYAfSVPzhzS6b5CMZ-i0"],
      "wa": ["DnmwyYAfSVPzhzS6b5CMZ-i0d8JZAoTNZH3ULvaU6JR2"]
    },
  "di": "EJZAoTNZH3ULvYAfSVPzhzS6b5CMaU6JR2nmwyZ-i0d8"
}

I am not sure that the relative ordering of the new fields is the best so if there are good reasons to order them differently I am open to that. I tried to get them close to how they are ordered in other messages.

pfeairheller commented 3 years ago

@SmithSamuelM - You mention while discussing the first class of watchers (a validator's own watchers) that the key state notice is mostly a discovery mechanism. This mechanism is particularly important to the did:keri did method.

It was decided today to add to the did method specification language dictating to users of the did:keri did method that they must verify the key state notice they receive in the did resolution did document metadata themselves, using a KERI library.

Is it anticipated that the spec (in a KID) will define a specific method in KERI Core to use for key state verification?

Meaning, will each KERI library be required to provide a single method that takes a key state notice, then downloads and verifies the KEL for the specified identifier against the provided key state? This will be very useful for components that need to determine the validity of sources providing key state.

If the answer is "yes", I can create a straw-man issue.

SmithSamuelM commented 3 years ago

@pfeairheller

Short answer yes.

A longer answer: There are several mechanisms one may use for security:

One mechanism is a cryptographically verifiable data structure from a cryptographic root of trust as a source of truth. That is a KEL. So verifying key state given a KEL depends on the properties of the construction of that KEL. So for the did:keri method one cannot use a key state notice alone in a secure way. Doing so would require trusting the endorser of the key state in a strong way, and that is not a verifiable trust. So the secure path is to download the KEL and verify the key state against it.

Another mechanism for security is called a threshold structure. A threshold structure uses multiple sources of truth, where each may have weaknesses but the aggregate set of sources of truth is strong. This is the reasoning behind multi-factor authentication, multi-signatures, and distributed consensus algorithms. So in a threshold structure sense, given a set of endorsers of key state where it is reasonable to assume that most of the members of the endorser set are honest, one can infer to some degree of security the key state of that set, given there is sufficient majority consensus of the key states from the whole set. That is the idea behind the second class of watchers in the description above. This is not enough to actually trust key state. One still downloads and verifies the actual KEL, but uses the consensus to make a decision about which KEL to download. If there is only one version of key state then there is no evidence of duplicity, so all KELs are the same. If there is evidence of duplicity, then examining the properties of the differences in versions of key state may be enough to determine which KEL is authoritative, given one can more or less trust that a majority are honest.
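
As a sketch of the threshold idea, the Python below tallies the (s, d) pairs reported by a set of endorsers and flags evidence of duplicity when one sequence number is reported with more than one digest. The function name and the threshold parameter are hypothetical, and the result only helps choose which KEL to download. It does not replace verifying that KEL.

```python
from collections import Counter, defaultdict


def consensus_key_state(notices, threshold):
    """Return ((s, d) or None, duplicity_flag) for a set of ksn dicts.

    The (s, d) pair is returned only if at least `threshold` endorsers
    reported it. The flag is True when any sequence number appears with
    more than one digest across the notices.
    """
    tally = Counter((ksn["s"], ksn["d"]) for ksn in notices)
    digests_by_sn = defaultdict(set)
    for sn, dig in tally:
        digests_by_sn[sn].add(dig)
    duplicity = any(len(digs) > 1 for digs in digests_by_sn.values())
    if not tally:
        return None, duplicity
    state, count = tally.most_common(1)[0]
    return (state if count >= threshold else None), duplicity
```

Watchers that merely lag behind report a different s rather than a conflicting d for the same s, so they do not by themselves raise the duplicity flag in this sketch.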

So in the did:keri method, when we say download the KEL and verify, it is assumed that one knows which version of the KEL to verify when there is duplicity. This is part of the KERI + Watcher network duplicity detection capability. It is up to the validator to use its watcher network to make that determination. But for the purposes of the did method we instruct users of did:keri that the key state returned in the did resolver's did document metadata may not be trusted, and any key state must be verified against the authoritative KEL for that identifier. It's up to the validator and its watcher network to determine where to get the authoritative KEL. That is trivial if there is no evidence of duplicity and harder when there is evidence of duplicity.