hyperledger / aries-vcx

aries-vcx is set of crates to work with DIDs, DID Documents, DIDComm, Verifiable Credentials and Hyperledger Aries.
https://didcomm.org
Apache License 2.0
125 stars 83 forks source link

New serialization formats #745

Closed Patrik-Stas closed 1 year ago

Patrik-Stas commented 1 year ago

We are currently applying new approach for implementing state machines in https://github.com/hyperledger/aries-vcx/pull/743. This PR is performing breaking changes on Connection object which only has been added recently (as opposed to the original MediatedConnection). The young age of Connection was rationale why we were able to simply rewrite the implementation.

This also allows us to rething serialization formats which with the original state machine are far from ideal. A testament for that is the need for aries-vcx consumers to come up with their own, better, serialization formats as indicated here https://github.com/hyperledger/aries-vcx/issues/742

The current problems are:

Unnecessary top json root

The current state

Example: serialized Verifier state machine

{
    "verifier_sm": {
      "source_id": "1",
      "thread_id": "0a794e24-c6b4-46bc-93b6-642b6dc98c98",
      "state": { ... } 
}

Example: serialized Issuer state machine

{
        "issuer_sm": {
            "state": { ... },
            "source_id": "alice_degree",
            "thread_id": "0a794e24-c6b4-46bc-93b6-642b6dc98c98"
        }
    }

Proposal

Because the top level node is varying between state machines, this makes it harder to work with, as the "json access path" is always different. The serialized state machine should encode what protocol it represents, but in more optimal way. Suggestion would be to instead encode that as follows:

{
    "protocol_role": "issuer",
    "protocol":  "issue-credential/1.0"
    "state": { ... },
    "source_id": "alice_degree",
    "thread_id": "0a794e24-c6b4-46bc-93b6-642b6dc98c98"
}   

where:

Difficult access to state variant

The current state

Example: serialized Verifier state machine

{
        "issuer_sm": {
            "state": {
                "Finished": {
                    "cred_id": null,
                    "thread_id": "cb54e2f9-ee17-488c-9bc7-d70c29cff802",
                    "revocation_info_v1": {
                        "cred_rev_id": "1",
                        "rev_reg_id": "V4SGRU86Z58d6TV7PBUe6f:4:V4SGRU86Z58d6TV7PBUe6f:3:CL:67:tag1:CL_ACCUM:tag1",
                        "tails_file": "/tmp/tails"
                    },
                    "status": "Success"
                }
            },
            "source_id": "alice_degree",
            "thread_id": "0a794e24-c6b4-46bc-93b6-642b6dc98c98"
        }
    }

The current protocol state variant is currently encoded as JSON key inside state attribute. This makes it awkward to access, as you need to read key of state (issuer_sm.state.???)

The current state

It would be easier to read protocol state if we instead can rely on a particular static JSON path, such as issuer_sm.protocol_state - hence the example above would look instead like:

{
        "issuer_sm": {
            "state": {
                "state": "finished" 
                "data": {
                    "cred_id": null,
                    "thread_id": "cb54e2f9-ee17-488c-9bc7-d70c29cff802",
                    "revocation_info_v1": {
                        "cred_rev_id": "1",
                        "rev_reg_id": "V4SGRU86Z58d6TV7PBUe6f:4:V4SGRU86Z58d6TV7PBUe6f:3:CL:67:tag1:CL_ACCUM:tag1",
                        "tails_file": "/tmp/tails"
                    },
                    "status": "Success"
                }
            },
            "source_id": "alice_degree",
            "thread_id": "0a794e24-c6b4-46bc-93b6-642b6dc98c98"
        }
    }

Is source_id useful?

I am not sure if this is useful for consumers. I think to keep aries-vcx minimal, identification of state machines should not be responsibility of aries-vcx, but application code should handle that instead (when storing state machines in storage, etc).

Combined proposal

Combining the changes above, here's combined example of all changes. Given state machine

{
    "issuer_sm": {
        "state": {
            "Finished": {
                "cred_id": null,
                "thread_id": "cb54e2f9-ee17-488c-9bc7-d70c29cff802",
                "revocation_info_v1": {
                    "cred_rev_id": "1",
                    "rev_reg_id": "V4SGRU86Z58d6TV7PBUe6f:4:V4SGRU86Z58d6TV7PBUe6f:3:CL:67:tag1:CL_ACCUM:tag1",
                    "tails_file": "/tmp/tails"
                },
                "status": "Success"
            }
        },
        "source_id": "alice_degree",
        "thread_id": "0a794e24-c6b4-46bc-93b6-642b6dc98c98"
    }
}

it would now become:

{
    "thread_id": "0a794e24-c6b4-46bc-93b6-642b6dc98c98"
    "protocol_role": "issuer",
    "protocol":  "issue-credential/1.0"
    "state": "finished",
    "state_data": {
        "cred_id": null,
        "thread_id": "cb54e2f9-ee17-488c-9bc7-d70c29cff802",
        "revocation_info_v1": {
            "cred_rev_id": "1",
            "rev_reg_id": "V4SGRU86Z58d6TV7PBUe6f:4:V4SGRU86Z58d6TV7PBUe6f:3:CL:67:tag1:CL_ACCUM:tag1",
            "tails_file": "/tmp/tails"
        },
        "status": "Success"
    }
}

which includes changes:

gmulhearn commented 1 year ago

Nice, thanks for this Patrik. Perhaps the only nuance (at least that I know of) of source_id is that in Connections, the source_id is used as the label that is sent to the peer you're connecting with. So if it's removed, something else will need to take it's place within the Connection handlers

Patrik-Stas commented 1 year ago

Well, following up last aries-vcx community call, we agreed the users should not rely on what serialization format looks like, hence there's not much value in trying to make it nicer or better structured.

We can still discuss removal of source_id, but I'll leave that for separate issue.