Determining the type of xReg entities

duglin commented 2 months ago

There may be times when a client/tooling is given a URL to an xRegistry entity without a clear indication of what it points to. For example, a codegen tool that processes Message definitions might take a URL to a Message definition Group, a single Message definition Resource or a single Message definition Version. Without extra metadata (or guessing by inspecting the JSON returned from a GET to that URL), it may be hard to determine what the entity actually is.

Another example: in other chats we've talked about how a client that wants a URL to a Resource should (for the most part) be able to accept a URL to either a Resource or one of its Versions and still process the entity properly, ignoring any Resource- or Version-specific attributes/differences. Meaning, the core metadata of interest should look the same in both cases. However, a client may still wish to know if the entity is a Resource or a Version. Today we can know this via the presence of certain attributes like versionsurl for a Resource or isdefault for a Version. While this can work, it's a bit hacky, and the exact properties to look for differ at each level of the xReg hierarchy, requiring extra thinking/work by tool authors.
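To make the "hacky" approach concrete, here is a minimal Python sketch of attribute sniffing. It checks only the two attributes named above (versionsurl and isdefault); the function name guess_kind and the sample values are made up for illustration.

```python
def guess_kind(entity: dict) -> str:
    """Guess whether a decoded xRegistry JSON object is a Resource or a Version.

    Relies only on the two marker attributes mentioned in this issue.
    """
    if "versionsurl" in entity:
        return "resource"
    if "isdefault" in entity:
        return "version"
    # Groups and the Registry itself would need yet other marker attributes.
    return "unknown"


# Sample entities (illustrative values only)
print(guess_kind({"id": "blobcreated", "versionsurl": "http://example.com/v"}))
print(guess_kind({"id": "v1.0", "isdefault": True}))
```

Every level of the hierarchy needs its own marker check, which is the extra work for tool authors described above.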

We should consider a more consistent/deterministic way for a client to know what it got back from a GET.

Some options:

a common field in all xReg entities (e.g. type)
a well-defined content-type for each entity

I think a field in each entity might be the best option: it's self-describing, and as the JSON is passed around in code, clients don't need to find a way to carry this "extra" metadata alongside it.

Proposals:

Add a new field to each level of the xReg called type that shows not just what the entity is, but what its parents are
Value will be of the form: [GROUP[/RESOURCE[/version]]]. Singular names.

Examples:

`` -> empty, the Registry itself
schemagroup -> a single Group instance
schemagroup/schema -> a single Resource instance
schemagroup/schema/version -> a single Version instance

A client can then tell what level of the xReg hierarchy it's at by counting slashes.
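A rough Python sketch of how a client might use such a type value. The function name level_of and the level labels are assumptions, not part of any spec. Note that the empty string and a bare Group name both contain zero slashes, so this sketch counts segments rather than raw slashes.

```python
LEVELS = ("registry", "group", "resource", "version")


def level_of(type_value: str) -> str:
    """Map a proposed `type` value to its level in the xReg hierarchy."""
    # "" has no segments; "a/b" has two. Counting segments keeps ""
    # (the Registry) distinct from "schemagroup" (a Group).
    segments = type_value.split("/") if type_value else []
    if "" in segments or len(segments) > 3:
        raise ValueError(f"not a valid type value: {type_value!r}")
    return LEVELS[len(segments)]


for value in ("", "schemagroup", "schemagroup/schema", "schemagroup/schema/version"):
    print(repr(value), "->", level_of(value))
```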

Another option is to include the actual IDs of each layer too, so basically the self URL w/o the Registry's baseURL. While a client could just use the self URL itself, that would require it to know what the baseURL is, and that could be a challenge if all it's given is the JSON of the entity. Stripping off the baseURL accurately can be ambiguous: take an extreme example of a baseURL of http://example.com/schemagroups, where the actual schemagroups URL would then be http://example.com/schemagroups/schemagroups. W/o advance knowledge of the baseURL things are ambiguous.
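The ambiguity can be shown with a small Python sketch. strip_base is a hypothetical helper, and the two candidate baseURLs are just the ones from the extreme example above.

```python
def strip_base(self_url: str, base_url: str) -> str:
    """Return the part of self_url below base_url, without a leading slash."""
    base = base_url.rstrip("/")
    if self_url != base and not self_url.startswith(base + "/"):
        raise ValueError("self URL is not under the given base URL")
    return self_url[len(base):].lstrip("/")


self_url = "http://example.com/schemagroups/schemagroups"

# Both guesses for the baseURL are plausible, yet they give different paths:
print(strip_base(self_url, "http://example.com"))               # schemagroups/schemagroups
print(strip_base(self_url, "http://example.com/schemagroups"))  # schemagroups
```

The first result looks like a Group whose ID happens to be "schemagroups"; the second looks like the schemagroups collection. From the self URL alone there's no way to pick.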

Example:

``
schemagroups/blobstore
schemagroups/blobstore/schemas/blobcreated
schemagroups/blobstore/schemas/blobcreated/versions/v1.0

This tells them not only what they're looking at (by counting slashes), but the actual IDs of its parents - which could be useful w/o parsing the self URL.

I'd probably call this something like path instead of type.

I'm leaning towards the 2nd option (path).
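For the path option, a minimal Python sketch of how a client might take the value apart. parse_path and the level labels are illustrative names, and the sketch assumes IDs never contain a "/".

```python
LEVELS = ("registry", "group", "resource", "version")


def parse_path(path: str) -> list[tuple[str, str]]:
    """Split a path like 'schemagroups/blobstore/schemas/blobcreated'
    into (collection, id) pairs, outermost first."""
    if path == "":
        return []  # the Registry itself
    parts = path.split("/")
    # Assumption: IDs contain no "/", so parts alternate collection, id.
    if "" in parts or len(parts) % 2 != 0 or len(parts) > 6:
        raise ValueError(f"not a valid path: {path!r}")
    return list(zip(parts[0::2], parts[1::2]))


pairs = parse_path("schemagroups/blobstore/schemas/blobcreated/versions/v1.0")
print(LEVELS[len(pairs)])  # version
print(pairs)
```

The number of pairs gives the level, and the pairs themselves give the parent IDs w/o touching the self URL.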

duglin commented 1 month ago

On the 7/16 call we decided to merge this issue with @Fannon's idea of an ID that's the entity's path. That would result in this:

{
  "id": "entityID",
  "xid": "entityPath"
}

for example:

{
  "id": "rID",
  "xid": "GROUPs/gID/RESOURCEs/rID"
}

This allows us to continue to have "id" be scoped (semantically and syntactically) to its owner/parent/context w/o duplication of info (i.e. the full path). But then we have a 2nd "id" called xid that is that full path (types and ids of the hierarchy) AND it would also be the xref value if someone wanted to point to this Resource. Since we like the idea of this being the xref value that would be used, it needs to have the same constraints as xref, meaning it MUST NOT start with a /.
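A small Python sketch of how id and xid might relate under this proposal. make_xid and check_xid are hypothetical helpers, and the rule that a collection name or id contains no "/" is an assumption of the sketch; only the "MUST NOT start with a /" rule comes from the comment above.

```python
def make_xid(parent_xid: str, collection: str, entity_id: str) -> str:
    """Append '<collection>/<id>' to the parent's xid ('' for the Registry)."""
    for part in (collection, entity_id):
        # Assumption for this sketch: neither part is empty or contains "/".
        if not part or "/" in part:
            raise ValueError(f"invalid path segment: {part!r}")
    own = f"{collection}/{entity_id}"
    return f"{parent_xid}/{own}" if parent_xid else own


def check_xid(xid: str) -> None:
    """Enforce the constraint shared with xref: no leading slash."""
    if xid.startswith("/"):
        raise ValueError("xid MUST NOT start with '/'")


group_xid = make_xid("", "schemagroups", "blobstore")
resource = {"id": "blobcreated", "xid": make_xid(group_xid, "schemas", "blobcreated")}
check_xid(resource["xid"])
print(resource)
```

The id stays scoped to its parent, while the xid carries the whole hierarchy and could be dropped into an xref as-is.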

deissnerk commented 1 month ago

Does this imply that the gID is globally unique? Otherwise an xid would only be unique within one registry. When replicating groups across different xRegistry instances, there could always be collisions. A globally unique xid would need an authority or namespace element. Looking at our samples, this is often simply added to the gID, usually in the form of a namespace in reverse DNS notation. I would prefer an explicit authority or namespace element.

duglin commented 1 month ago

The initial thought was that xid would be local to the Registry so it could easily be used in an xref. We really haven't taken cross-registry replication into account yet. Do you have anything written down about how replication would work from a user's perspective? For example, when Reg-A is replicated into Reg-B, which Registry knows about it? How is the re-sync initiated - from the source or destination? I kind of assumed that the origin property would play a role in that.

deissnerk commented 1 month ago

I haven't defined in detail how replication would work, but I assume that it will happen. xref only works inside a single registry, and I think we had good reasons for that. But as a consequence, you need to replicate everything you want to re-use into your own registry. origin seems to be focused on providing the original location of a resource, but that doesn't always help to uniquely identify the resource.

Examples:

In xRegistry CloudEvents, an identifier could be used as subject. An event of type io.xregistry.messages.MessageChanged could point to xreg://conference.example/messagegroups/management/messages/ConferencePlanned.
Expressing a relation to another model entity without "include" semantics where it is more important to find all resources referring to a specific entity.
At some point, we might have additional protocols for xRegistry. We might have a GraphQL or even an AMQP API. Basing identification of model resources on HTTP URLs would then feel strange.
A stable reference to an xRegistry entity might also be helpful outside xRegistry, e.g. as a label on a K8s resource or in a Helm chart.

Therefore I am thinking about an xRegistry-specific URI scheme where we could clearly define the syntax. We could define it in a way that our xid would be a relative URI reference to it. Understanding the scheme of a URI makes it easy to deterministically compare two URIs and to define mappings to different protocols. We could define how to create an xRegistry REST URL or an AMQP message from an xreg URI.
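As a rough illustration of the idea, here is a Python sketch that splits an xreg URI into an authority and a relative xid and maps it onto a REST URL. Nothing here is defined anywhere yet: the xreg scheme is only a proposal in this thread, and split_xreg, to_rest_url and the base URL table are hypothetical.

```python
from urllib.parse import urlsplit


def split_xreg(uri: str) -> tuple[str, str]:
    """Split 'xreg://<authority>/<xid>' into (authority, xid)."""
    parts = urlsplit(uri)
    if parts.scheme != "xreg" or not parts.netloc:
        raise ValueError(f"not an xreg URI: {uri!r}")
    # The xid is the path without its leading slash (relative reference).
    return parts.netloc, parts.path.lstrip("/")


def to_rest_url(uri: str, base_urls: dict[str, str]) -> str:
    """Map an xreg URI to a REST URL via an authority -> baseURL table."""
    authority, xid = split_xreg(uri)
    return base_urls[authority].rstrip("/") + "/" + xid


uri = "xreg://conference.example/messagegroups/management/messages/ConferencePlanned"
# Hypothetical deployment: where the registry for this authority is served.
bases = {"conference.example": "https://registry.conference.example/api"}
print(split_xreg(uri))
print(to_rest_url(uri, bases))
```

A mapping to another protocol (e.g. AMQP) would be a different function over the same (authority, xid) pair.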

Determining the type of xReg entities #137