Thoughts on Data Models, Interoperability, and Querying Trust Registries

mathieuglaude commented 3 months ago

When exploring the concept of trust registries, it becomes evident that the data models for each registry vary significantly. For instance, the data model used by the DIACC PCTF is markedly different from that of the C2PA, and similarly, the models for issuer and verifier registries, as well as membership registries, each have their own structure.

This diversity in data models reflects the nature of trust as an inherently multifaceted concept. Each trust registry serves a distinct purpose and, therefore, can necessitate a specific type of data model. This leads to the realization that, while the registries are diverse, there is a need to categorize them into types. Just as we categorize professionals like plumbers, engineers, and doctors in a directory, trust registries could benefit from a similar classification system.

However, a challenge arises when we consider the global scope of these registries. A list of plumbers in Ontario might use a different data format than one in Japan, which is acceptable. The critical question is whether we can acknowledge the existence of such a list, regardless of its format.

The concept of 'type' becomes crucial here. Each type should have an associated data model and a defined source of truth. This framework then raises the question of how to formulate queries across these diverse systems in a standardized manner.
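As a rough illustration of the idea above, here is a minimal Python sketch of a catalog in which each entry carries a type, a data model identifier, and a source of truth. Everything in it is an assumption made for illustration: the `RegistryEntry` fields, the data model names, and the `registry.example` URLs are hypothetical and are not part of any protocol draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegistryEntry:
    """Hypothetical catalog record describing one trust registry."""
    registry_type: str    # the shared 'type', e.g. "plumbers"
    jurisdiction: str     # where the list is maintained
    data_model: str       # identifier of the local data format
    source_of_truth: str  # where the authoritative list lives


# Illustrative entries only; the identifiers and URLs are made up.
CATALOG = [
    RegistryEntry("plumbers", "CA-ON", "ontario-plumbers-v1",
                  "https://registry.example/on/plumbers"),
    RegistryEntry("plumbers", "JP", "japan-plumbers-v1",
                  "https://registry.example/jp/plumbers"),
    RegistryEntry("engineers", "CA-ON", "ontario-engineers-v1",
                  "https://registry.example/on/engineers"),
]


def find_registries(catalog, registry_type):
    """Return every registry of the given type, whatever its data format."""
    return [entry for entry in catalog if entry.registry_type == registry_type]


plumber_lists = find_registries(CATALOG, "plumbers")
for entry in plumber_lists:
    print(entry.jurisdiction, entry.data_model, entry.source_of_truth)
```

The lookup only relies on the type, so the Ontario and Japan lists are both discoverable even though their data models differ.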

For example, querying whether a person is authorized to issue a credential requires a different approach than inquiring if a company is a member of a mining association or if they are permitted to mine coal. These distinct queries highlight the need for a method to construct and ask questions effectively.

One potential solution involves leveraging artificial intelligence and generative techniques, along with expertise in regular expressions, to craft these queries. This approach is akin to using predicate proof requests, where the ability to ask a specific question, such as verifying if someone's age is over 18, is predicated on knowing the correct credential. Drawing a parallel, knowing the 'type' of registry and the data model it employs allows for precise and meaningful queries.

Note: I realize that this topic may be better suited to 'Discussions'; however, I think it impacts how the protocol should be designed. I can move it there @darrellodonnell if you think it's better suited there.

darrellodonnell commented 3 months ago

I agree that data models vary wildly. I submit that the (simplistic) use of "namespaces" allows us to begin normalizing the conversation.

If you have multiple countries looking at a range of different schemas (e.g. driver license) you can create a namespace to help normalize - but you'll need to consider equivalencies and purpose to do that. As an example - the International Driver License normalizes a driver license to fit a "standard" set of attributes that answer the question "Is Bob licensed to drive in the US?" so decisions can be made elsewhere.

But we also have "treaties" that link states to states and states to provinces for shared purposes (e.g. get a fine in some states and you will see demerit points on your Ontario driver license), and those are on a case-by-case basis, established jurisdiction-to-jurisdiction...

We could easily have multiple namespaces supported by any data model:

Does Bob have an idl:equivalentdriverlicense under Global DL Framework?
Does Bob have an ontario:driverlicense under NY-ON-demerit-framework? Helps a NY police system connect the driver to a treaty-enabled system.

The work of namespacing (agreeing on a single term) is, IMO, easier than attempting to normalize full systems when you're looking at equivalency mapping for a discrete purpose.
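To make the two example questions above concrete, here is a small Python sketch of a namespaced query. It assumes a `namespace:term` syntax split on the first colon and an in-memory set of records; the `RegistryQuery` shape, `parse_query`, `answer`, and the sample records are all hypothetical and not taken from the protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegistryQuery:
    """Hypothetical query: does `entity` hold `namespace:term` under `framework`?"""
    entity: str
    namespace: str
    term: str
    framework: str


def parse_query(entity, qualified_term, framework):
    """Split 'namespace:term' on the first colon and build a query."""
    namespace, sep, term = qualified_term.partition(":")
    if not sep or not namespace or not term:
        raise ValueError(f"expected 'namespace:term', got {qualified_term!r}")
    return RegistryQuery(entity, namespace, term, framework)


# Made-up registry contents, keyed by (entity, qualified term, framework).
RECORDS = {
    ("Bob", "idl:equivalentdriverlicense", "Global DL Framework"),
    ("Bob", "ontario:driverlicense", "NY-ON-demerit-framework"),
}


def answer(records, query):
    """Return True if the registry holds a matching record."""
    key = (query.entity, f"{query.namespace}:{query.term}", query.framework)
    return key in records


query = parse_query("Bob", "ontario:driverlicense", "NY-ON-demerit-framework")
print(answer(RECORDS, query))
```

The same term can be asked under different frameworks, which is the point of keeping the framework separate from the namespaced term.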

> For example, querying whether a person is authorized to issue a credential requires a different approach than inquiring if a company is a member of a mining association or if they are permitted to mine coal. These distinct queries highlight the need for a method to construct and ask questions effectively.

Do you see a single TR that answers these two very different questions:

is EntityX a member of a mining association?
is EntityX permitted to mine coal?

Those are very different questions and to ask them without domain knowledge seems odd to me.

mathieuglaude commented 3 months ago

I think namespaces could be a good idea. But depending on the type of trust registry, it can answer multiple (bounded) questions specific to that type. A TR with authorized issuers may answer one or two questions. But a membership list type may answer 4-5.

So depending on the type, a TR may answer a specific set of questions. So I guess certain questions must be agreed upon based on the type. But there may be multiple questions asked of one TR.
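One way to picture this is a table of agreed questions per registry type, checked before a query is sent. The Python sketch below assumes such a table exists; the type names and question names are invented for illustration and nothing here reflects an agreed vocabulary.

```python
# Hypothetical mapping from registry type to the bounded set of questions
# that type has agreed to answer. All names are illustrative assumptions.
SUPPORTED_QUESTIONS = {
    "authorized-issuers": {
        "is_authorized_issuer",
        "is_authorized_verifier",
    },
    "membership": {
        "is_member",
        "member_since",
        "membership_level",
        "is_in_good_standing",
    },
}


def can_ask(registry_type, question):
    """Return True if the question is in the agreed set for this type."""
    return question in SUPPORTED_QUESTIONS.get(registry_type, set())


print(can_ask("membership", "is_member"))
print(can_ask("membership", "is_permitted_to_mine_coal"))
```

Under these assumptions a membership registry would accept the membership question but not the coal-mining permission question, which would belong to a different type of registry.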

darrellodonnell commented 3 months ago

Exactly, and each ecosystem will look different but they will "rhyme".

trustoverip / tswg-trust-registry-protocol

Thoughts on Data Models, Interoperability, and Querying Trust Registries #21