Services could contain multiple types

peacekeeper commented 1 year ago

This line: https://github.com/input-output-hk/prism-did-method-spec/blob/9b63d9f75cfdce2840972a214a15efafacf21165/w3c-spec/PRISM-method.md?plain=1#L276

Could potentially be changed to repeated string, since according to DID Core a service can have multiple types (although I haven't seen this in practice).

EzequielPostan commented 1 year ago

this is related to the topic in #17

It may be good to reconsider services altogether and support the full expressiveness of DID Core.

Maybe for both service types and serviceEndpoints we could store unstructured data: basically a JSON string that contains the string, set, or map as described in the W3C specs. This would also give users the flexibility to say when they want to display a "string" vs "a set with a single string". The drawback is that we would store unstructured data. Postgres supports the json type though. @shotexa @patlo-iog @FabioPinheiro , any thoughts on this idea?
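To make the idea concrete, here is a minimal Python sketch (not PRISM node code) of storing `type` and `serviceEndpoint` as raw JSON strings so that the author's choice between "X" and ["X"] survives a round trip. The function names and the record layout are assumptions made up for this illustration.

```python
import json


def make_service_record(service_id, type_value, endpoint_value):
    """Store type and endpoint as JSON text, exactly as the author supplied them.

    Hypothetical record layout; the real storage schema is not defined here.
    """
    return {
        "id": service_id,
        "type": json.dumps(type_value),
        "service_endpoint": json.dumps(endpoint_value),
    }


def to_did_document_service(record):
    """Rebuild the DID document service entry without guessing the shape."""
    return {
        "id": record["id"],
        "type": json.loads(record["type"]),
        "serviceEndpoint": json.loads(record["service_endpoint"]),
    }


# The same value written as a plain string and as a one-element list.
single = make_service_record("#svc-1", "LinkedDomains", "https://example.com")
wrapped = make_service_record("#svc-2", ["LinkedDomains"], ["https://example.com"])
```

Resolving `single` gives back a plain string for `type`, while `wrapped` gives back a one-element list, which is the distinction a `repeated string` field cannot record on its own.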

FabioPinheiro commented 1 year ago

Instead of having arbitrary JSON, or what we have right now:

message AddServiceAction {
    Service service = 1;
}

message Service {
    string id = 1;
    string type = 2;
    repeated string service_endpoint = 3;
    LedgerData added_on = 4; // (only present in DID resolution) The ledger details related to the event that added the service.
    LedgerData deleted_on = 5; // (only present in DID resolution) The ledger details related to the event that revoked the service.
}

I would suggest something with more structure on the protocol buffer side. Like:

message AddServiceAction {
    // A oneof cannot itself be marked `repeated` in protobuf, so each action carries exactly one service.
    oneof service {
        ServiceDIDCommMessaging didcomm = 1;
        ServiceLinkedDomains linked_domains = 2;
        ServiceGeneric generic = 3;
    }
}
message ServiceGeneric { // Works for all the cases but is not recommended!
    string id = 1;
    string type = 2;
    string data = 3; // any Json
}

message ServiceDIDCommMessaging { // Assume  type = DIDCommMessaging
    string id = 1;
    repeated string service_endpoint = 2;
    repeated string accept = 3;
    repeated string routing_keys = 4; // snake_case per protobuf style; the JSON name is still routingKeys
}

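message ServiceLinkedDomains {  // Assume type = LinkedDomains
    string id = 1;
    // protobuf does not allow `repeated` fields directly inside a oneof,
    // so each alternative is wrapped in a list message.
    oneof service_endpoint {
        StringList endpoints = 2;
        StringList origins = 3;
    }
}

message StringList { // illustrative wrapper name, not part of the original proposal
    repeated string values = 1;
}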

This has more structure, is easy to validate, and is easier to extend. When the schema has been extended and your implementation is behind, you know exactly what you don't know. Protocol Buffers can also optimize the encoding (compared with generic JSON).
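The "you know exactly what you don't know" point can be sketched in Python as a tagged union with an explicit failure for unrecognised kinds. This is only an analogy for the oneof proposal, not generated protobuf code; the class names, the `kind` strings, and `decode_service` are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DIDCommService:
    id: str
    service_endpoint: List[str]
    accept: List[str]
    routing_keys: List[str]


@dataclass
class LinkedDomainsService:
    id: str
    endpoints: List[str]


@dataclass
class GenericService:
    id: str
    type: str
    data: str  # any JSON, kept as text


class UnknownServiceKind(ValueError):
    """Raised when a newer writer used a kind this reader does not model yet."""


# Hypothetical discriminator values, mirroring the oneof field names.
KNOWN_KINDS = {
    "didcomm": DIDCommService,
    "linked_domains": LinkedDomainsService,
    "generic": GenericService,
}


def decode_service(kind, fields):
    cls = KNOWN_KINDS.get(kind)
    if cls is None:
        # The reader can tell precisely which variant it is missing.
        raise UnknownServiceKind(kind)
    return cls(**fields)
```

A reader that is behind fails on the specific unknown kind instead of silently accepting data it cannot interpret.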

peacekeeper commented 1 year ago

I don't have much experience with protocol buffers, so I can't really comment on whether or not it's a good idea to store unstructured data. My guess is that one downside would be that it makes it harder to query, e.g. to look up a specific service by type. But +1 to supporting the full expressiveness from DID Core.
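As a rough illustration of the querying concern, here is a Python sketch of looking up services by type when `type` is stored as a JSON string. Every record has to be parsed and normalised before it can be compared. The record layout and the function names are assumptions, not anything from the PRISM node.

```python
import json


def types_of(record):
    """Parse the stored JSON and normalise "X" and ["X", ...] to a list."""
    value = json.loads(record["type"])
    return [value] if isinstance(value, str) else list(value)


def find_by_type(records, wanted):
    """Return the ids of all services that declare the wanted type."""
    return [r["id"] for r in records if wanted in types_of(r)]


# Hypothetical stored records with JSON-encoded type values.
records = [
    {"id": "#a", "type": '"LinkedDomains"'},
    {"id": "#b", "type": '["DIDCommMessaging", "LinkedDomains"]'},
    {"id": "#c", "type": '"DIDCommMessaging"'},
]
```

With a structured `repeated string type` column the same lookup would not need the parse-and-normalise step.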

patlo-iog commented 1 year ago

I'm with @FabioPinheiro on this one. This makes explicit what structure a service must adhere to while remaining open for extension. Using raw JSON seems a bit unsafe from an implementation point of view. A lot of things can go wrong, and we must state the structure in the spec anyway if we want interoperability with other implementations.

EzequielPostan commented 1 year ago

Thank you for the comments

I also prefer structured data whenever possible. However, I will throw some possible scenarios:

When we have a repeated X and the spec allows both having X and {X} in the JSON-LD representation, how to distinguish the cases?
If the models for DIDComm and LinkedDomain change, should we create multiple instances of these messages so that people can choose which version to use?
If the evolution of the models we support as protobuf is faster than our node updates, won't users just encode them as JSON in the generic model? If so, won't we gradually lose some intended consistency of the data we store? I.e. some of the data may be stored structured, while other parts are not.
Furthermore, won't people use things before we create models for them (e.g. a new application protocol we are not following)? If so, it looks likely that nodes will store related services as JSON even if we later create the specific protobuf messages. Even a migration after we add the new messages won't stop users from continuing to send the JSON models from their apps.

To some extent, in my usual tone of voice, I should ask: "if we take a step back, what are we trying to solve?" If it is the flexibility problem, does the instantiation of specific models add value? Could we articulate the value we expect to achieve, and evaluate whether we will achieve it given scenarios like the ones I described above?

Note: I don't like storing unstructured data, but I can see that if the value is not guaranteed, then aiming to store it in a structured manner is more work than storing plain JSON.

patlo-iog commented 1 year ago

when we have a repeated X and the spec allows both having X and {X} in the JSON-LD representation, how to distinguish the cases?

Not sure if I understand this correctly. X or {X} (list of X?) is at the representation level, so if we use {X} at the protobuf level then we can always translate to/from {X} or X in some special case.

If the models for DIDComm and LinkedDomain change, should we create multiple instances of these messages so that people can choose which version to use?

Good point. I think it depends on the changes to the spec. There is a risk that a change is backward compatible in JSON but not in protobuf. In that case the new model might need to be added to the oneof.

The other 2 points are about tradeoffs. From my perspective, I would be tempted to go all-in for the structured approach and not allow raw JSON to creep in. (disclaimer: This is an opinion from someone who will handle the implementation :smiley:. So take that with a grain of salt.)

EzequielPostan commented 1 year ago

Not sure if I understand this correctly. X or {X} (list of X?) is at the representation level, so if we use {X} at the protobuf level then we can always translate to/from {X} or X in some special case.

The question is how you specify in the DID method spec when to select each special case. That is, the resolver doesn't know what the user will use the returned DID Doc for, so it cannot decide whether to translate a list containing just X into the plain element or keep the actual list. For instance, another place where this occurs is the services' type (the initial topic of this issue). Having repeated string type is actually bringing the question of "how to tell when the resolver should return X or [ X ]?", whereas having unstructured data (either the JSON string "X" or the array ["X"]) lets the user decide this at the time of creating the operation.

To some extent, the general issue I notice is that we are trying to validate at DID method level, things that to me look application level related. That is, we are trying to validate DIDComm formats (or future application-level formats) at the level of protocol operations and, at the same time, we want to allow the extensibility that W3C allows. I don't see it as feasible to control validation much if we allow a generic model. And I kinda think that allowing flexible services would be good. This is why I wanted to articulate the value we would get by structuring the data, given that the standards are so flexible with unstructured data.

patlo-iog commented 1 year ago

Hmmm, on second thought, that actually starts to make some sense. From your point of view, it would be pretty similar to HTTP vs JSON relationship which would justify the unstructured data. I am a bit opinionated and would prefer structured data, but I do see the value of what you mean. It'll come down to the objective of Prism DID and what it is trying to solve.

EzequielPostan commented 1 year ago

It'll come down to the objective of Prism DID and what it is trying to solve.

I agree. When we added services, the idea was to support DIDComm and other potential needs. We specified and implemented a subset of the spec that we thought would be reasonable. We then found out that DIDComm needed something else. We are now starting a similar conversation about supporting more key types.

In terms of adoption, we should try to be flexible. For keys, we are not rushing the extensibility because we haven't reviewed the standards that exist for encoding them. But, for the case of services, there are no standards, which makes it "easier" to add the generic case.

I guess we can ask @lohanspies for orientation about what we need to support

peacekeeper commented 1 year ago

we are trying to validate at DID method level, things that to me look application level related

I was just reading up on this thread and thinking exactly the same thing. The structure on the DID method level should follow what's in DID Core, but not what's required by certain applications. So it might be best to avoid anything DIDComm- or LinkedDomains-specific in the protobuf definitions.

Unless maybe for marketing reasons you want did:prism to appear tightly coupled with DIDComm or DWNs or something else, but I think that's not the case.

I also agree there's a difference between keys and services. There is a relatively low number of key types, and they affect the DID method itself. Services on the other hand are more easily extensible and really have nothing to do with the DID method.

peacekeeper commented 1 year ago

Having repeated string type is actually bringing the question of "how to tell when the resolver should return X or [ X ]?"

The rule could be to return X if the count is 1, and return [ X ] if the count is > 1. Yes you would lose a bit of expressiveness, since you wouldn't be able to return a single item in an array. But I have never seen an array of service types in a DID document, let alone an array of service types containing only a single type. So this might be acceptable.
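The rule above is small enough to sketch in Python. `render_type` is a hypothetical helper name, and rejecting an empty list is an added assumption (DID Core requires a service to have a type).

```python
def render_type(types):
    """Return X for a single type and [X, Y, ...] for more than one.

    A one-element array can never be produced, which is the loss of
    expressiveness this rule accepts.
    """
    if not types:
        # Assumption: a service without any type is treated as invalid.
        raise ValueError("a service needs at least one type")
    return types[0] if len(types) == 1 else list(types)
```

For example, `render_type(["LinkedDomains"])` yields the plain string, while a list of two types is returned as a list.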

goncalo-frade-iohk commented 1 year ago

@patlo-iog @FabioPinheiro although I normally like to have more structured data, I think this approach would in the end bite us in the ass on maintainability and versioning, as @EzequielPostan mentioned.

Having structured data as proposed by @FabioPinheiro would mean that any change to, for example, the DIDComm service protocol would require changes in two places, at both the method and application levels, to support the new protocol. It would also require a new version of the method. And while this can have its advantages, it would require "us" to make decisions and stay on top of DIDComm improvements so we can support them in the method.

Whereas if we keep the service "open" for JSON and do not have structured data, it would require changes at the application level only. I think this is more advantageous for us and for the method.

So I would actually prefer something more open like this (I added repeated on type since by W3C standards it can be an array):

message Service {
    string id = 1;
    repeated string type = 2;
    repeated string service_endpoint = 3;
    LedgerData added_on = 4; // (only present in DID resolution) The ledger details related to the event that added the service.
    LedgerData deleted_on = 5; // (only present in DID resolution) The ledger details related to the event that revoked the service.
}
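One way to picture the split this model allows is the Python sketch below: the method level checks only the generic shape, and anything DIDComm-specific stays in a separate application-level check. The dataclass mirrors the message above, but the two function names and the exact checks are assumptions for illustration, not spec text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Service:
    id: str
    type: List[str]
    service_endpoint: List[str]


def method_level_errors(service):
    """Generic shape checks only; nothing protocol-specific lives here."""
    errors = []
    if not service.id:
        errors.append("id is empty")
    if not service.type:
        errors.append("type is empty")
    if not service.service_endpoint:
        errors.append("service_endpoint is empty")
    return errors


def is_didcomm_service(service):
    """Application-level check; it can change without a new method version."""
    return "DIDCommMessaging" in service.type
```

If DIDComm changes what it expects of a service, only `is_didcomm_service` (or whatever the application uses in its place) would need to change.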

FabioPinheiro commented 1 year ago

But in that case, we can start discussing the same point in many other places in the DID document other than the Service. At this point, I just want to pick one solution and live with it.

EzequielPostan commented 1 year ago

we can start discussing the same point in many other places in the DID document other than the Service

I agree, and we will likely have to in the near future. We already raised two issues related to this one: namely, how to make our keys more flexible, and how to allow the user to express a desired JSON-LD context.

At this point, I just want to pick one solution and live with it.

I will update the spec today. I will likely use @goncalo-frade-iohk's model, or something similar in case I have to change it

thank you all for the comments and suggestions

Services could contain multiple types #8