openwallet-foundation / credo-ts

Typescript framework for building decentralized identity and verifiable credential solutions
https://credo.js.org
Apache License 2.0
275 stars 202 forks source link

DIDCommMessaging Service Endpoint is in the wrong format #1789

Closed TheTechmage closed 5 months ago

TheTechmage commented 8 months ago

It was recently brought to my attention that the DIDComm services being generated from AFJ are either using the wrong service type (DIDCommMessaging vs did-communication) or that the service block isn't structured according to the spec of it's type. Upon closer inspection, I believe the situation to be the latter case.

It's been observed that when generating DIDs for did:peer:2 with a service endpoint type of DIDCommMessaging, the serviceEndpoint follows the format of the did-communication endpoint type instead of the DIDCommMessaging endpoint type.

Current service block format:

    {
      "id": "#service",
      "type": "DIDCommMessaging",
      "serviceEndpoint": "https://example.com",
      "recipientKeys": [
        "#key-2"
      ],
      "routingKeys": [],
      "priority": 0
    }

Expected service block format:

    {
      "id": "#service",
      "type": "DIDCommMessaging",
      "serviceEndpoint": {
        "uri": "https://example.com",
        "accept": [
          "didcomm/v2",
          "didcomm/aip2;env=rfc19"
        ],
        "routingKeys": []
      }
    }
TimoGlastra commented 8 months ago

The structure was changed in the DIDComm spec, but never in Credo. A pr is welcome!

genaris commented 8 months ago

This issue is noticeable only when rendering a did:peer:2's DID Document, right? (e.g. when we call agent.dids.resolve()) Does it affect anything in terms of agent interoperability?

If it's just a matter of did:peer:2 parsing I guess it will not be that hard to fix. In our model we currently have a DidCommV2Service whose type is actually DIDComm (which is consistent with another related issue in Sovrin specs, see https://github.com/sovrin-foundation/sovrin/issues/343), so it should be replaced by the newer one along with its corresponding format (BTW in our DIDComm V2 feature branch I see we considered this but the new format does not seem to be applied).

TheTechmage commented 8 months ago

The way did:peer:2 works is, each service block is packaged up and sent to the other party. So if an invalid service block is created for a did, then the other party will receive the same exact service block and may not be able to parse the service correctly. I have asked another person who is more knowledgeable as to what's happening to weigh in on the matter. I'm just the one of the team did experts.

genaris commented 8 months ago

The way did:peer:2 works is, each service block is packaged up and sent to the other party. So if an invalid service block is created for a did, then the other party will receive the same exact service block and may not be able to parse the service correctly. I have asked another person who is more knowledgeable as to what's happening to weigh in on the matter. I'm just the one of the team did experts.

Initially I thought the resulting service encoding within the did:peer:2 would be the same, but you are right: the resulting encoded service will not be compatible with other agents that have updated their code to the latest spec which, as this issue mentions, introduced breaking changes.

So I think it would be nice to address this before releasing 0.5.0. I have to check but IIRC this wouldn't affect interop with AFJ <= 0.4.2 (as we were using did:peer:1 and sharing DID Document with DIDComm V1 services).

KolbyRKunz commented 8 months ago

From what discussion me and @frostyfrog have had on this matter it seems it would be a good thing for any agent to ignore any service blocks that they do not recognize the format of, with that though it is on the other party to potentially include more service formats in the did (or didDoc in the case of did:peer:1).

Another option that would be cleaner but require more processing in the didExchange handlers would be to check the context array on the didExchange request message and send the services formatted for the older spec if the version is 1 or the updated service format if the version is 2, the draw back here would be that you would potentially have to have two dids associated with the agent that is responding.

The later option would help maintain compatibility with older versions of Credo. I will also note that currently even with did:peer:1 the service type to format is not inline with the updates, I have seen didDocs come with did:peer:1 that use the DidCommMessaging type but then use the did-communication formatting, but that seems to be a result of the spec getting updated. (I will test this again to validate that it is true).

KolbyRKunz commented 8 months ago

The issue with interop that we are facing is that any agent that develops against the updated spec that states the shape of didCommMessaging is different than what Credo sends will run into serialization issues when attempting to process the message or if they are unaware and no serialization issue occur then data will be lost making it impossible to communicate properly with Credo.

genaris commented 8 months ago

@KolbyRKunz could you elaborate a bit more in regards to compatibility issues with did:peer:1? In DID Exchange, when using did:peer:1 we are sharing DID Documents that contain DIDComm services with type did-communication. If I'm not wrong that's also the case for did:peer:4 DID Exchanges.

The only conflicting case I can think of is DID Exchanges using did:peer:2, since any did-communication services will be encoded as dm which on the parser side it will be interpreted as DIDCommMessaging and therefore, as you pointed out, will be incompatible with current state of the spec.

So a quick fix for this (yet compliant with specs) would be Credo to not abbreviate did-communication, DIDComm and IndyAgent as dm anymore. If the receiving agent is compatible with DIDComm V1 (Aries RFC 0067), receiving agent should be capable of decoding it. Otherwise, as you say, it might ignore it and will not be able to communicate with that DID.

A better approach of course would be to use the new DIDCommMessaging service block for all DID Exchange interactions using did:peer:2 and did:peer:4. This would not represent any backwards compatibility issue because AFJ <= 0.4.2 does not support these DID Methods for DID Exchange, so if we keep using did-communication services for did:peer:1 DID Exchange we should be OK (and it is likely that newer agents will stop using it anytime soon in favour of did:peer:4).

KolbyRKunz commented 8 months ago

@genaris You are correct with the did:peer:1. I thought when doing testing that I saw the did:peer:1 document had the same issue but I was wrong.

Sorry for the confusion. What you are suggesting sounds like a reasonable fix to me. Thank you.

genaris commented 8 months ago

@genaris You are correct with the did:peer:1. I thought when doing testing that I saw the did:peer:1 document had the same issue but I was wrong.

Sorry for the confusion. What you are suggesting sounds like a reasonable fix to me. Thank you.

Well good to know that it didn't happen there!

I was inspecting a bit the code and running the DID Exchange numalgo tests and when exchanging using did:peer:2 I see these DIDs exchanged between our friends Alice the Faber:

Inspecting their services token and base64-decoding them I see:

So it looks like Credo is properly encoding the DIDComm V1 service, which makes me think that the 'quick fix' I previously mentioned is not needed.

But we will still need to fix the reception of did:peer:2, since if we receive a dm service type we are not going to properly interpret it and therefore we'll be unable to send messages to that peer.

conanoc commented 8 months ago

But we will still need to fix the reception of did:peer:2, since if we receive a dm service type we are not going to properly interpret it and therefore we'll be unable to send messages to that peer.

Right. I got this error when I sent a did:peer:2 to a Credo agent:

Unable to resolve did 'did:peer:2.Ez6LSsrvkiTBjFa6yXioyu6uDqT2JNDXhkVH5sa6M4Q1aDFX6.Vz6Mkve1dnPdJVeshYqGv42M7MxMpLeGSTmMHgcHbQDL4R6Xi.SeyJzIjp7InIiOlsiZGlkOmtleTp6Nk1rbTJqUlJTaHpKQzlSR0hpZUZ0S0FqN0pZZjU2ZGFKckhDZTJnVXIxdlFNU0EiXSwidXJpIjoiaHR0cDovL2xvY2FsaG9zdDozMDAxIn0sInQiOiJkbSJ9': 
ClassValidationError: DidDocumentService: Failed to validate class.
An instance of DidDocumentService has failed the validation:
 - property serviceEndpoint has failed the following constraints: serviceEndpoint must be a string

So it looks like Credo is properly encoding the DIDComm V1 service, which makes me think that the 'quick fix' I previously mentioned is not needed.

You are partly right. Other peer-did libraries would ignore other fields than t and s. And the agents using those libraries will not be able to get routingKeys.

genaris commented 8 months ago

You are partly right. Other peer-did libraries would ignore other fields than t and s. And the agents using those libraries will not be able to get routingKeys.

And why is that? As of today, the spec still mentions "r" as one of the possible abbreviated terms: https://identity.foundation/peer-did-method-spec/#common-string-abbreviations

conanoc commented 8 months ago

"r" is parsed when it is inside "s". Check out the below example in the spec:

{
  "t": "dm",
  "s": {
    "uri": "http://example.com/didcomm",
    "a": [
      "didcomm/v2"
    ],
    "r": [
      "did:example:123456789abcdefghi#key-1"
    ]
  }
}
genaris commented 8 months ago

"r" is parsed when it is inside "s". Check out the below example in the spec:

{
  "t": "dm",
  "s": {
    "uri": "http://example.com/didcomm",
    "a": [
      "didcomm/v2"
    ],
    "r": [
      "did:example:123456789abcdefghi#key-1"
    ]
  }
}

Well, I think that's just an example. In the same section, the spec states that to encode a service we should "recursively replace common strings in key names and type value with abbreviations from the abbreviations table". I don't see any indication saying that some of those values are only valid within a certain map structure. I tried to find what changed in https://github.com/decentralized-identity/peer-did-method-spec/pull/62 in regards to this, and I see in the comments from @dbluhm that he "added clarification that common string abbreviation should recursively descend through the service object.", which is somewhat consistent with my reasoning.

IMHO, if some libraries are ignoring those abbreviated keys because they are basing their implementation only on the examples shown in the specs, they are not properly doing it.

Can you mention the libraries you are referring to? Are they supporting the old did-communication and Indy agent services used in DIDComm V1 or they went straight to DIDComm V2?

conanoc commented 8 months ago

I think abbreviation is not the issue. The problem is the structure of the "service" object. According to the spec, a service object has only three fields, id, type and serviceEndpoint. Other fields in the service object will be ignored or throw an error depending on the implementation.

I'm currently testing with https://github.com/beatt83/peerdid-swift, ~and https://github.com/sicpa-dlab/peer-did-jvm will behave the same.~

genaris commented 7 months ago

https://github.com/sicpa-dlab/peer-did-jvm

My understanding from did-core spec is that the service object must have those three fields but may define other extra fields:

Each service map MUST contain id, type, and serviceEndpoint properties. Each service extension MAY include additional properties and MAY further restrict the properties associated with the extension.

Anyway, I think this is something we cannot modify at this point, since those old service types do rely on those fields. Surely we'll all agree as soon as we properly support this new DIDCommMessaging endpoint (hopefully soon!!).