xregistry / spec

xRegistry related specifications
https://xRegistry.io
Apache License 2.0

Determining the type of xReg entities #137

Open duglin opened 2 months ago

duglin commented 2 months ago

There may be times when a client or tool is given a URL to an xRegistry entity without a clear indication of what it points to. For example, a codegen tool that processes Message definitions might take a URL to a Message definition group, a single Message definition Resource, or a single Message definition Version. Without extra metadata (or guessing by inspecting the JSON returned from a GET to that URL), it may be hard to determine what the entity actually is.

As another example, in other chats we've talked about how a client that wants a URL to a Resource should (for the most part) be able to accept a URL to either a Resource or one of its Versions and still process the entity properly, ignoring any Resource- or Version-specific attributes/differences. Meaning, the core metadata of interest should look the same in both cases. However, a client may still wish to know whether the entity is a Resource or a Version. Today we can infer this from the presence of certain attributes, like versionsurl for a Resource or isdefault for a Version. While this can work, it's a bit hacky, and the exact properties to look for differ at each level of the xReg hierarchy, requiring extra thinking/work by tool authors.
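To make the "hacky" part concrete, here is a minimal sketch of the attribute-sniffing check described above. It assumes the entity has already been decoded from JSON into a dict; the function name `guess_entity_kind` and the sample values are made up for illustration, and only the `versionsurl` and `isdefault` attribute names come from this comment.

```python
def guess_entity_kind(entity: dict) -> str:
    """Guess whether a decoded xRegistry entity is a Resource or a Version.

    This only looks at the two marker attributes mentioned in the issue.
    Other levels of the hierarchy would need their own, different checks,
    which is exactly the inconsistency being discussed.
    """
    if "versionsurl" in entity:
        return "resource"
    if "isdefault" in entity:
        return "version"
    return "unknown"


# Hypothetical sample entities (values are illustrative only).
resource = {"id": "rID", "versionsurl": "http://example.com/some/versions"}
version = {"id": "vID", "isdefault": True}

print(guess_entity_kind(resource))
print(guess_entity_kind(version))
print(guess_entity_kind({"id": "gID"}))
```

Note that a Group falls through to "unknown" here, so every tool author would have to extend this with more per-level heuristics.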

We should consider a more consistent/deterministic way for a client to know what it got back from a GET.

Some options:

I think a field in each entity might be the best option so that it's self-describing: as the JSON is passed around in the code, they don't need to find a way to include this "extra" metadata alongside it.

Proposals:

Examples:

They can then tell what level of the xReg hierarchy they're at by counting slashes.

Another option is to include the actual IDs of each layer too, so basically the self URL w/o the Registry's baseURL. While they could just use the self URL itself, that would require the client to know what the baseURL is, and that could be a challenge if all they're provided is the JSON of the entity. I think stripping off the baseURL accurately can be ambiguous. Take an extreme example of a baseURL of http://example.com/schemagroups, where the actual schemaGroups URL would then be http://example.com/schemagroups/schemagroups. W/o advance knowledge of the baseURL, things are ambiguous.
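A rough sketch of that ambiguity, assuming a client that tries to recover the path by stripping a candidate baseURL from the self URL. The helper `strip_base` and the group ID `g1` are made up for this illustration; only the two example.com URLs come from the text above.

```python
from typing import Optional


def strip_base(self_url: str, base_url: str) -> Optional[str]:
    """Return self_url relative to base_url, or None if it is not under it."""
    base = base_url.rstrip("/") + "/"
    if not self_url.startswith(base):
        return None
    return self_url[len(base):]


# Hypothetical self URL of a group "g1" in a Registry whose baseURL is
# http://example.com/schemagroups
self_url = "http://example.com/schemagroups/schemagroups/g1"

# Both candidate baseURLs "work", but they yield different paths.
print(strip_base(self_url, "http://example.com"))
print(strip_base(self_url, "http://example.com/schemagroups"))
```

The first result reads as a group with ID "schemagroups" followed by a further "g1" segment, the second as group "g1". From the self URL alone, the client can't tell which one is right.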

Example:

This tells them not only what they're looking at (by counting slashes), but the actual IDs of its parents - which could be useful w/o parsing the self URL.

I'd probably call this something like path instead of type

I'm leaning towards the 2nd option (path).

duglin commented 1 month ago

On the 7/16 call we decided to merge this issue with @Fannon's idea of an ID that's the entity's path. That would result in this:

{
  "id": "entityID",
  "xid": "entityPath"
}

for example:

{
  "id": "rID",
  "xid": "GROUPs/gID/RESOURCEs/rID"
}

This allows us to continue to have "id" be scoped (semantically and syntactically) to its owner/parent/context w/o duplication of info (i.e. the full path). But then we have a 2nd "id" called xid that is that full path (types and ids of the hierarchy) AND it would then also be the xref value if someone wanted to point to this Resource. Since we like the idea of this being the xref value that would be used, it needs to have the same constraints as the xref, meaning it MUST NOT start with a /.
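As a sketch of how a client might consume such a value: the snippet below splits an xid into (collection, id) pairs and derives the hierarchy level from the number of pairs. The function names are made up, and it assumes an xid always alternates collection name and ID, with one pair meaning a Group, two a Resource, and three a Version. That mapping is an assumption for illustration, not something decided in this issue.

```python
from typing import List, Tuple

# Assumed mapping from number of (collection, id) pairs to hierarchy level.
LEVELS = {1: "group", 2: "resource", 3: "version"}


def parse_xid(xid: str) -> List[Tuple[str, str]]:
    """Split an xid such as "GROUPs/gID/RESOURCEs/rID" into (collection, id) pairs."""
    if xid.startswith("/"):
        raise ValueError("xid MUST NOT start with '/'")
    parts = xid.split("/")
    if len(parts) % 2 != 0 or "" in parts:
        raise ValueError("expected alternating collection/id segments: %r" % xid)
    return list(zip(parts[0::2], parts[1::2]))


def entity_level(xid: str) -> str:
    """Name the hierarchy level an xid points at, based on its depth."""
    return LEVELS.get(len(parse_xid(xid)), "unknown")


pairs = parse_xid("GROUPs/gID/RESOURCEs/rID")
print(pairs)                                      # the parents and the entity itself
print(pairs[-1][1])                               # last ID, matching the scoped "id"
print(entity_level("GROUPs/gID/RESOURCEs/rID"))
```

The last pair's ID is the same value as the entity's scoped "id", so the two fields stay consistent w/o the client ever needing the baseURL.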

deissnerk commented 1 month ago

Does this imply that the gID is globally unique? Otherwise an xid would only be unique within one registry. When replicating groups across different xRegistry instances, there could always be collisions. A globally unique xid would need an authority or namespace element. Looking at our samples, this is often simply added to the gID, usually in the form of a namespace in reverse DNS notation. I would prefer an explicit authority or namespace element.

duglin commented 1 month ago

The initial thought was that xid would be local to the Registry so it could easily be used in an xref. We really haven't taken cross-registry replication into account yet. Do you have anything written down about how replication would work from a user's perspective? For example, when Reg-A is replicated into Reg-B, which Registry knows about it? How is the re-sync initiated - from the source or destination? I kind of assumed that the origin property would play a role in that.

deissnerk commented 1 month ago

I haven't defined in detail how replication would work, but I assume that it will happen. xref only works inside a single registry, and I think we had good reasons for that. But as a consequence, you need to replicate everything you want to re-use into your own registry. origin seems to be focused on providing the original location of a resource, but that doesn't always help to uniquely identify the resource.

Examples:

Therefore I am thinking about an xRegistry-specific URI scheme where we could clearly define the syntax. We could define it in a way that our xid would be a relative URI reference to it. Understanding the scheme of a URI makes it easy to deterministically compare two URIs and to define mappings to different protocols. We could define how to create an xRegistry REST URL or an AMQP message from an xreg URI.