Closed siuc-nate closed 5 years ago
I suggest we tackle this step by step:
I am certainly open to other ideas for handling this, however.
Recording progress so far:
The issue seems to settle into three major (and overlapping) categories:
The core of the solution to all three of these is to make the payload structure of data in the registry contain a @graph rather than just being the relevant top-level object. For instance, instead of:
{
"envelopeID": "[GUID]",
"decoded_payload": {
"@context": { ... },
"@type": "ceterms:Certificate",
... //Other properties
}
}
You would use:
{
"envelopeID": "[GUID]",
"decoded_payload": {
"@context": { ... },
"@graph": [
{
"@type": "ceterms:Certificate",
...//Other properties
}
]
}
}
Note that the @context block is moved to the @graph level, and is no longer inside of the Credential.
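A minimal sketch of that restructuring, assuming the payload is a single top-level object with an optional @context. The helper name `wrap_in_graph` and the placeholder context value are hypothetical; only the `envelopeID` and `decoded_payload` keys come from the example above.

```python
import copy

def wrap_in_graph(envelope):
    """Return a copy of the envelope whose payload holds a @graph array.

    Hypothetical helper: assumes the payload is one top-level object,
    as in the "instead of" example above.
    """
    result = copy.deepcopy(envelope)
    payload = result["decoded_payload"]
    # Lift @context out of the top-level object, if present.
    context = payload.pop("@context", None)
    new_payload = {}
    if context is not None:
        new_payload["@context"] = context
    # The former top-level object becomes the first entry of @graph.
    new_payload["@graph"] = [payload]
    result["decoded_payload"] = new_payload
    return result

old = {
    "envelopeID": "[GUID]",
    "decoded_payload": {
        "@context": {"ceterms": "http://example.org/ceterms/"},  # placeholder context
        "@type": "ceterms:Certificate",
    },
}
new = wrap_in_graph(old)
print(new["decoded_payload"]["@graph"][0]["@type"])
```

The input envelope is left untouched, so the old and new shapes can be compared side by side.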
What this enables:
@context
to be { "@type": "@id" }
, which further enables:
{ "@id": "http://..." }
objects all over the schema@context
@graph
in the same envelope, which enables:
@graph
(we would want to be careful about this, as we would also need to include information in the @context
to explain it)@graph
would be considered being a part of the metadata itself or not)
So I think it largely solves the problems we have. However, it does require changes across our systems:
@graph
, which is an array of objects where 1-n of said objects will be top-level data
@graph
@graph
and having multiple language descriptions@graph
s where there is a Credential and an AssessmentProfile and a LearningOpportunityProfile and a CostManifest in the same @graph
- each of these should have its own envelope, @graph
, CTID, etc.)/resources/[CTID]
) will have to be updated to accommodate the new structure@graph
s and disable language maps (including various @context
updates)@graph
s@graph
s
That's all I can think of off the top of my head. Feel free to comment/expand/etc.
Can we try to come to a consensus on how best to move forward?
So far I believe the solution, part 1 is to make the Registry payload an object that contains a @graph that contains the rest:
{
"envelope_id": "",
"decoded_resource": {
"@graph": [
{ ... }, //Top-level CTDL object
{ ... }, //bnode (aka "reference" or "pointer" object)
{ ... }, //bnode (aka "reference" or "pointer" object)
]
}
}
{
"envelope_id": "",
"decoded_resource": {
"@graph": [
{ ... }, //Competency Framework
{ ... }, //Competency
{ ... }, //Competency
]
}
}
And the solution, part 2 is to use the @language property in the @context. But where does that @context go?
{
"envelope_id": "",
"decoded_resource": {
"@context": { ... }, //@context at the @graph level
"@graph": [
{ ... }, //Top-level CTDL object
{ ... }, //bnode (aka "reference" or "pointer" object)
{ ... }, //bnode (aka "reference" or "pointer" object)
]
}
}
{
"envelope_id": "",
"decoded_resource": {
"@graph": [
{ //Top-level CTDL object
"@context": { ... }, //@context at the top-level-in-the-@graph level
...
},
{ ... }, //bnode (aka "reference" or "pointer" object - would these need a @context?)
{ ... }, //bnode (aka "reference" or "pointer" object - would these need a @context?)
]
}
}
I think the answer is "both":
{
"envelope_id": "",
"decoded_resource": {
"@context": { ... }, //@context at the @graph level, defines schema and default @language
"@graph": [
{ //Top-level CTDL object
"@context": { ... }, //@context at the top-level-in-the-@graph level, exists only if necessary (e.g. to provide an alternate language - everything else is inherited from the @graph level)
...
},
{ ... }, //bnode (aka "reference" or "pointer" object - @context inherited from @graph level)
{ ... }, //bnode (aka "reference" or "pointer" object - @context inherited from @graph level)
]
}
}
This also enables a rare use case such as having a subset of competencies in a framework that have metadata in two languages (if the entire framework has two metadata languages, it might be better to publish a separate payload altogether with separate CTIDs and relate them with ceasn:exactAlignment?)
{
"envelope_id": "",
"decoded_resource": {
"@context": { //@context at the @graph level, defines schema and default @language
"@language": "en",
...
},
"@graph": [
{ ... }, //Competency Framework
{ ... }, //Competency with english metadata
{ //Same CTID/Competency as above, but with french metadata
"@context": { "@language": "fr" },
...
},
{ ... }, //Competency with english metadata
{ ... }, //Competency with english metadata
]
}
}
I think this should solve the issues above. We need to decide, carefully and quickly.
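To illustrate the "both" answer, here is a rough sketch of how a consumer might work out the language of each node: a node-level @context overrides the @graph-level default. The function name `effective_language` is hypothetical and the type names are only illustrative. It handles inline object contexts only, so it is not a substitute for a real JSON-LD processor.

```python
def effective_language(payload, node):
    """Node-level @language wins; otherwise fall back to the @graph level."""
    def lang_of(ctx):
        # Only inline object contexts are handled in this sketch;
        # string (URL) and list contexts are ignored.
        return ctx.get("@language") if isinstance(ctx, dict) else None

    return lang_of(node.get("@context")) or lang_of(payload.get("@context"))

payload = {
    "@context": {"@language": "en"},  # default language for the whole @graph
    "@graph": [
        {"@type": "ceasn:CompetencyFramework"},
        {"@type": "ceasn:Competency"},
        # Same competency again, with French metadata
        {"@type": "ceasn:Competency", "@context": {"@language": "fr"}},
    ],
}
languages = [effective_language(payload, node) for node in payload["@graph"]]
print(languages)  # ['en', 'en', 'fr']
```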
My vote is still for langstrings.
https://json-ld.org/spec/latest/json-ld/#language-indexing
Not following the spec literally and in spirit would be a step towards incompatibility for users of the system.
Langstrings are in the JSON-LD spec. The break from JSON-friendliness already began when namespace scopes were included in the field names. Use of @graph is another break from JSON-friendly methods.
Using different CTIDs for different translations of the same degree would require another layer of alignment (sameAs?) that would be more expensive (and exist in fewer libraries) than langstrings.
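Langstrings can be boiled out through use of JSON-LD processors and recontextualization, so if they want to go to https://credentialengineregistery.org/resources/?lang=en or use the header accept-language, those still are possible if langstrings are used, but not if the different language-records have different CTIDs.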
Anyone using CTDL data (or that uses JSON-LD) will be used to adding getters to make langstring extraction natural.
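For reference, the kind of getter described here might look like the following sketch. The name `get_langstring` and its fallback behavior are assumptions for illustration, not part of any existing CTDL or JSON-LD library.

```python
def get_langstring(value, preferred="en", fallback=None):
    """Extract one string from a plain string or a language map.

    Hypothetical getter: returns None when no matching language exists.
    """
    if isinstance(value, str):
        # Single-language data: nothing to unwrap.
        return value
    if isinstance(value, dict):
        if preferred in value:
            return value[preferred]
        if fallback is not None and fallback in value:
            return value[fallback]
    return None

credential = {
    "ceterms:name": {"en": "My Credential", "ru": "Мои полномочия"},
    "ceterms:description": "Text describing this credential",
}
print(get_langstring(credential["ceterms:name"], preferred="ru"))
print(get_langstring(credential["ceterms:description"]))
```

The same getter works whether a property holds a bare string or a language map, which is the convenience being argued for.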
We should follow the JSON-LD spec both literally and in spirit...and that includes @graph.
-- Stuart A. Sutton, Metadata Consultant Associate Professor Emeritus, University of Washington Information School Email: stuartasutton@gmail.com Skype: sasutton
Use of @language in the @context is in the spec:
https://json-ld.org/spec/latest/json-ld/#string-internationalization
The currently-proposed solution has different language versions of metadata using the same CTID. See https://github.com/CredentialEngine/vocabularies/issues/514#issuecomment-372769710
Language maps are, in my opinion, also a burden to any publisher or consumer that is only interested in one language. They make examples harder to understand and documentation trickier to write.
Can you clarify what you mean about @graph being a break from JSON (did you mean a break from JSON-LD?) - I want to make sure I'm interpreting you correctly there.
You are absolutely correct that @language is in the spec, and is intended for indicating the default language of a non-multi-lingual object.
If the @graph of same-CTID objects only applies to the envelope, I don't think that's a problem because there's very little use for the envelope beyond book-keeping. Returning a @graph of objects if I navigate to https://cer/resources/<CTID> though is unacceptable (unless it is referring to a statement set).
I'm also not sure what this pattern does for signatures. If the @id url returns a different resource than was signed by the envelope, how do I computationally verify the signatures?
--
@graph being a break from JSON
JSON-LD was intended to, among many things, serve as a bridge between JSON and RDF, allowing transforms from JSON objects to RDF (JSON-LD) and back with minimum fuss (via very complex @context).
This is, for instance, how we transform IMS CASE (which is just JSON) to CASS/Schema JSON-LD.
http://schema.cassproject.org/0.3/case2cass
and back
http://schema.cassproject.org/0.3/cass2case
Unfortunately, this capability is limited to fairly shallow mappings. Changes in structure, linking, use of statement sets (@graph), langstring transformations, etc (such as in example 39 in the spec) have varying success, but often require coding.
"@graph breaking from JSON" simply referred to using features of JSON-LD that distanced it from being plain-JSON compatible. I suppose the JSON version of an @graph would be an array of objects, or a result object with status and an array of results... but those transforms are not supported by current JSON-LD processors, I don't think.
Thanks for the clarification on @graph and JSON/JSON-LD.
If I could expand for a moment on another part of my reasoning for using @language in the @context: Back when we were first looking at the problem, we liked that it kept the data simple, especially with regard to documents that only have a single language version, but the problem it presented was that you would need a @graph of top-level documents if you had more than one language (even if those documents had the same CTID). That wasn't enough of a justification at the time to make the switch to @graph, so we proceeded with language maps instead.
Some time thereafter, we determined that the best way to handle blank nodes would be through the use of a @graph at the root of the payload. That sort of started the ball rolling. More recently, we determined that in order to make publishing and consuming competency frameworks (and concept schemes) as easy as possible, we would also need to use a @graph of top-level objects.
Taken together, these were a much stronger justification for using a @graph as the root of the payload, especially since it opened the door to using @language-@context'd documents in a graph instead of the somewhat clunky language maps. It seemed like a way to kill three birds with one stone, hence my pushing for it at this point.
I don't deny that language maps have their place, but I think that since we're using a @graph anyway, we might as well use it to solve the language problem (on top of the other arguments).
Anyway, that aside, your points about the CTID/envelope/signature relationship are valid, but I think the pros outweigh the cons, in the end. Let me try to address your points individually:
Returning a @graph of objects if I navigate to https://cer/resources/<CTID> though is unacceptable (unless it is referring to a statement set)
The way this is intended to work is something close to a statement set (or perhaps exactly that, depending on how you interpret the following):
@language
)@language
)
Thus the @graph for a credential's CTID would contain data related to that credential, so this seems valid in my opinion.
Assessments, Learning Opportunities, and Organizations would function identically to Credentials in this regard.
For Competency Frameworks, the graph would contain:
@language
)@language
) (each logical competency would have its own CTID - same CTID for different language versions of the same competency, just like anything else)
This may be more in-line with your concerns, but the competencies would be "extra" data that could effectively be ignored. In most cases, I think they would be useful - and this would be the most efficient way to get all competencies for a given framework, as it requires no extra computation or retrieval on the part of either the consumer or the registry itself, since they are already present in the published payload document. Alternatively, we could figure out some means of indicating whether or not you want to receive the competencies too, or just the framework (though this would still be a @graph of objects in order to handle the languages).
@language
)
This allows retrieval of a competency. For this case, the @graph is still needed to handle multiple language use cases. To validate signatures for a competency, retrieve the framework document that the competency is isPartOf and validate that. I don't know how often it would be necessary to validate signatures for competencies, especially if the consumer is getting the data straight from the registry. This is a downside, admittedly, but I do think the benefits outweigh it.
I'm also not sure what this pattern does for signatures. If the @id url returns a different resource than was signed by the envelope, how do I computationally verify the signatures?
This should still work the way it does now - the payload that was published is the payload you get. The publisher will publish a @graph of resources as described above, and that's what would be signed, that's what would be returned when retrieved via @id. So nothing here should break. Or am I misinterpreting you?
Using @graph for CTDL-ASN CompetencyFrameworks makes sense because a CompetencyFramework really is a highly described statement set (were @stuartasutton redoing things, I expect he may have made CompetencyFramework extend StatementSet/@graph). Each child object is contextualized to only that CompetencyFramework (via the 2 object model and many discussions).
Concept schemes arguably have sharable concept nodes, but I don't mind either way, so treating concepts as being local to the ConceptScheme is fine as well, but they need to be individually identifiable within a scheme, so they all have an @id.
Those @ids should locate and return the concept. Similarly, the @id of a Competency should locate and return the Competency.
As far as access to the data goes, it tends to be beneficial if every object has a URL @id, because _bnodes are next to useless to web developers, because then I have to store the location of that _bnode as something like http://cer/resources/<ctid>#<bnodeId>.
And bnodes ids are randomly generated (yes?), so you can see how that breaks real fast upon the next version publish.
@id url returns a different resource than was signed by the envelope
Got it, I wasn't clear that the payload published is the payload you get, it seemed like that was becoming a malleable concept. (or it could be, at least)
Yes, the intent would be that each concept be retrievable on its own, just as each competency should be.
Regarding bnodes: In terms of linked data and how it gets used, Stuart is probably better-equipped to respond than I am - but as a developer, I'm not sure where I would ever need to link directly to a bnode's data any more than I would ever need to link directly to the data for a ConditionProfile, or a ProcessProfile, or any of the other non-top-level classes in CTDL. What would a use case be for linking to a bnode directly? I admit I am somewhat biased due to my background in web/interface development.
In general, the bnodes thing came out of a conversation that happened because we are/were using both URIs and objects as values for the same properties, depending on whether or not there was a published, resolvable URI to reference. That was causing problems with the @context's definition of those properties, and we eventually arrived at bnodes as the currently-proposed solution.
When determining if an individual is qualified to enter a degree program:
Individual P meets ConditionProfile C, which is one of three ConditionProfiles in the degree program (one for local students, one for national students and one for international students).
How do I describe and save that statement?
Bnodes are common fare in RDF and in JSON-LD. While nothing precludes assigning a URI to absolutely every resource in a description, we accept that bnodes have a utility within the scope of a graph--e.g., an instance of PostalAddress, ConditionProfile. After considerably going round and around, we accepted the inevitable need for bnodes for what you all are calling top-level entities when no URI is available--e.g., providing a brief description of an organization where there was no URI-named entity. Again, common fare.
But, those bnodes providing description of things like an organization were originally done without nodeID (i.e. no such thing as @id that resolved to _:12345678)--i.e., the infamous "reference/pointer" objects. In the end, that's what caused problems with the property declarations in the @context. So, we've added bnode nodeIDs.
So far, none of the above is unusual in RDF. It's also not unusual in JSON-LD. So, while I may be missing something, I don't see any reason for a ruckus around bnodes identified by nodeIDs and bounded by @graph.
Nate, you state: "as a developer, I'm not sure where I would ever need to link directly to a bnode's data any more than I would ever need to link directly to the data for a ConditionProfile, or a ProcessProfile, or any of the other non-top-level classes in CTDL"
Here's an example. You have a credential that references a bnode describing an organization because the org has not been URI-identified--i.e., classic example of the infamous "reference/pointer" object. Now, you have more than one property referencing this organization bnode--ownedBy, offeredBy, renewedBy and revokedBy. That can be done either by repeating the bnode data four times, or by assigning it a nodeID, describing it once and referencing via the nodeID as object of the four properties.
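A rough sketch of the second option, with the organization described once and referenced by nodeID from four properties. The `_:Org1` identifier, the organization's type and name, and the `lookup` helper are all made up for illustration.

```python
graph = [
    {
        "@id": "http://credentialengineregistry.org/resources/[CTID#123]",
        "@type": "ceterms:Credential",
        # All four properties point at the same bnode by its nodeID.
        "ceterms:ownedBy": ["_:Org1"],
        "ceterms:offeredBy": ["_:Org1"],
        "ceterms:renewedBy": ["_:Org1"],
        "ceterms:revokedBy": ["_:Org1"],
    },
    {
        # Described once; illustrative values only.
        "@id": "_:Org1",
        "@type": "ceterms:CredentialOrganization",
        "ceterms:name": "Example Org",
    },
]

def lookup(graph, node_id):
    """Return the node in the graph with the given @id, or None."""
    for node in graph:
        if node.get("@id") == node_id:
            return node
    return None

credential = graph[0]
properties = ["ceterms:ownedBy", "ceterms:offeredBy", "ceterms:renewedBy", "ceterms:revokedBy"]
resolved = [lookup(graph, credential[prop][0]) for prop in properties]
print(all(org is graph[1] for org in resolved))
```

The nodeID only has meaning inside this one document, which is the property being debated in the comments that follow.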
@stuartasutton Nothing we're doing here is illegal or even unusual, but _bnodes are URIs, not URLs, and URIs may be identifiable (_bnodes are only identifiable within a context, yes?) but they are not locatable.
That makes them:
Even a PostalAddress should, in some distant future, have a URL (or maybe a descriptive URI that can be generated from the fields).
As I said, some say that providing all instances of resources with URI is the way to go--no bnodes, none even with nodeIDs. That would have been doable and I would not have raised a peep. But, knowing what I know today, I would probably not have advised it since it would have cascaded--all those resources would then need a CTID, an envelope, a signature...and on down the rabbit hole.
By the way, in instances such as our "reference/pointer", being not "locatable" is a feature and not a bug.
By the way, in instances such as our "reference/pointer", being not "locatable" is a feature and not a bug.
100% onboard there.
Also, I should probably apologize as I argued as if we were doing this from scratch. The CTID/envelope/signature rabbit hole is understandable.
We have ours in CASS too. Versioning, what-fields-compose-the-signature-when-data-is-portable, and on.
Edit: merged my responses into a single post:
When determining if an individual is qualified to enter a degree program:
Individual P meets ConditionProfile C, which is one of three ConditionProfiles in the degree program (one for local students, one for national students and one for international students).
How do I describe and save that statement?
Do you mean in the context of CTDL generally, or was this specifically in response to me asking for a use case where I might want to provide a URI directly to that condition profile (in my case, I wouldn't - the condition profile needs the context of its credential to make any sense, otherwise they are effectively "conditions for [undefined]")
Nate, you state: "as a developer, I'm not sure where I would ever need to link directly to a bnode's data any more than I would ever need to link directly to the data for a ConditionProfile, or a ProcessProfile, or any of the other non-top-level classes in CTDL"
Here's an example. You have a credential that references a bnode describing an organization because the org has not been URI-identified--i.e., classic example of the infamous "reference/pointer" object. Now, you have more than one property referencing this organization bnode--ownedBy, offeredBy, renewedBy and revokedBy. That can be done either by repeating the bnode data four times, or by assigning it a nodeID, describing it once and referencing via the nodeID as object of the four properties.
That statement was in the context of Fritz asking about linking to it directly from the outside as a standalone/top-level thing. Your example seems more relevant to the context of a @graph, where exactly that solution is already proposed.
As I said, some say that providing all instances of resources with URI is the way to go--no bnodes, none even with nodeIDs. That would have been doable and I would not have raised a peep. But, knowing what I know today, I would probably not have advised it since it would have cascaded--all those resources would then need a CTID, an envelope, a signature...and on down the rabbit hole.
This is correct, but it's only part of the picture - the main reason we had reference/pointer objects to begin with was to allow Entity A to describe/point to something owned by Entity B even if Entity B didn't publish data to the registry. We couldn't allow Entity A to "own" the data in the registry about something belonging to Entity B. Thus we needed something sufficiently descriptive/useful enough to point to/lightly describe Entity B's property while still clearly indicating that the data was not officially published by Entity B. Enter reference/pointer objects.
Yes, Nate, that's the policy reason...
To try to get this thread back on track a bit: Is there still objection to (or does anyone foresee problems with) the notions of:
@graph
@context
at the @graph
level (and anywhere else that it's necessary, see https://github.com/CredentialEngine/vocabularies/issues/521#issuecomment-375002285 )@language
in the @context
instead of language maps (with multiple copies of the relevant documents in each applicable language, sharing a CTID)@graph
level (and probably inside the relevant documents too) - this may not be necessary, but it may make it more convenient to look up the records (in general, but especially if the registry's search/retrieval software is already written to expect a CTID at the root payload level)
It's fine if there are still problems with these; they're critical and we need to think them through - I just want to make sure we're not digressing too much.
I can't address the "which is simpler to implement" or matters of implementing at the system level. So,
Yes
Yes
Since there are several ways to do this in JSON-LD, I leave that to you and Fritz to hammer out so long as the solution does not make multi-language data difficult down the road.
Yes (if you mean named graph (@graph))
Yes (if you mean named graph (@graph))
Yes (if you mean named graph (@graph); assuming bnodes here also includes "reference/pointer" objects)
I've not a clue since I don't know the system constraints on its use.
@stuartasutton just to be sure - can you provide a JSON-LD example of a named graph? I believe we may be talking about the same thing.
I think we are: https://json-ld.org/spec/latest/json-ld/#named-graphs (section discussing @graph)
Hmm...I think there may be a problem with that. I'm glad you brought it up, as this is exactly the kind of thing we need to catch and handle now rather than later. Try this:
{
"envelope_id": "ABC123",
"decoded_payload": {
"@context": [
"http://credreg.net/ctdl/schema/context/json",
{
"@language": "en"
}
],
"@id": "http://credentialengineregistry.org/resources/[CTID#123]",
"@graph": [
{
"@type": "ceterms:Credential",
"@id": "http://credentialengineregistry.org/resources/[CTID#123]",
...//Other properties
}
]
}
}
There are two things I want to note here: First, use of an "advanced context" per the JSON-LD spec (scroll down to example 35 here: https://json-ld.org/spec/latest/json-ld/#advanced-context-usage ). This seems unnecessarily complex.
@language: "en"
into the document found at http://credreg.net/ctdl/schema/context/json, and/or:
Note that I bring this up because:
"@context": "http://credreg.net/ctdl/schema/context/json?language=ru"
than to explain, document, publish, validate, consume, and process (even just deserializing it would be a headache since it's a list containing a string and an object):
"@context": [
"http://credreg.net/ctdl/schema/context/json",
{
"@language": "ru"
}
]
The alternative would be to just always include @language in every document at the top of the @graph, which is also unnecessarily complex and redundant. I would very, very strongly push for doing this via the context document's URL.
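To make the deserialization concern concrete, here is a sketch of what a consumer has to do to find the default language when @context may be a string, an object, or a list mixing both. The function name `context_language` is hypothetical. Remote context URLs are not dereferenced, and a `"@language": null` reset is not modeled.

```python
def context_language(context):
    """Return the last @language set in a @context of any supported shape."""
    if isinstance(context, dict):
        return context.get("@language")
    if isinstance(context, list):
        language = None
        for entry in context:
            found = context_language(entry)
            if found is not None:
                # Later entries override earlier ones.
                language = found
        return language
    # A string is a remote context URL; this sketch does not fetch it.
    return None

advanced = [
    "http://credreg.net/ctdl/schema/context/json",
    {"@language": "ru"},
]
print(context_language(advanced))
print(context_language("http://credreg.net/ctdl/schema/context/json?language=ru"))
```

With the URL-only form the language lives in the fetched document (or in the query string), so a consumer that does not resolve the context cannot see it at all. That is a trade-off of the URL approach rather than an argument against it.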
Second, and more problematic, is the use of the same @id for both the @graph and the ceterms:Credential - this seems like invalid JSON-LD to me. Since the @graph is intended to express the data that would otherwise have to be encoded in the Credential document itself (i.e. the blank nodes), would it be correct to remove the @id from the Credential document altogether? Then everything in the graph would effectively become a blank node, which also seems like a bad idea.
Thoughts on how to solve this? Keep in mind that we want whoever retrieves the data by its CTID to retrieve the entire graph.
Do you mean in the context of CTDL generally, or was this specifically in response to me asking for a use case where I might want to provide a URI directly to that condition profile (in my case, I wouldn't - the condition profile needs the context of its credential to make any sense, otherwise they are effectively "conditions for [undefined]")
From a user or observer's standpoint, yes, but from a processor's standpoint, it doesn't need to know what the credential is to determine if someone is qualified for this condition profile. Even if it did in the course of processing, it doesn't need to keep that data. It just needs to store a statement that says that a student is qualified for this condition profile. If the credential is presented without _bnodes (just as nested objects) then it would have to use some sort of XPath/JSONPath to identify the child object in the credential, or in the case of _bnodes, its bnode ID. Both are more fragile and more complex than URLs.
We probably don't need to pursue this argument further for now.
is the use of the same @id for both the @graph and the ceterms:Credential - this seems like invalid JSON-LD to me.
A statement set is a thing with (optionally) its own @id. You are correct in saying this is a problem, as it could break caching systems (do I cache the @graph or the Credential in the @id's slot?) and all sorts of things.
I'd recommend using a @id-less graph, or setting the @id of the graph to something different. The graph is just being used to return a container of results, similar to a JSON array of results in traditional JSON APIs. It's common enough practice.
I would say the envelope it came in is another candidate for the @id of the graph... but I think the envelope is also not a named graph.
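A publisher-side check for this situation could be sketched as below. The helper name `find_id_collisions` is hypothetical. It only reports where a graph-level @id is reused by a node, or where several nodes share an @id. It does not decide whether that is acceptable.

```python
from collections import Counter

def find_id_collisions(payload):
    """Report @id reuse between a named graph and the nodes inside it.

    Returns (graph_id_reused, duplicate_node_ids).
    """
    node_ids = [node["@id"] for node in payload.get("@graph", []) if "@id" in node]
    counts = Counter(node_ids)
    duplicates = sorted(node_id for node_id, count in counts.items() if count > 1)
    graph_id = payload.get("@id")
    graph_id_reused = graph_id is not None and graph_id in counts
    return graph_id_reused, duplicates

payload = {
    "@id": "http://credentialengineregistry.org/resources/[CTID#123]",
    "@graph": [
        {
            "@type": "ceterms:Credential",
            "@id": "http://credentialengineregistry.org/resources/[CTID#123]",
        }
    ],
}
print(find_id_collisions(payload))
```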
I was looking through the JSON-LD spec again, and there may be another option that I think came up in some form much earlier in the thread (albeit without the JSON-LD spec notion behind it): use of @graph containers: https://json-ld.org/spec/latest/json-ld/#graph-containers - also, apparently the scope of bnodes is the document rather than the @graph ( https://json-ld.org/spec/latest/json-ld/#identifying-blank-nodes ), so the example below should be valid.
It would require addition of a meta property defined as "@container": "@graph", but "meta" is already in the @context, so then we could do something like:
{
"decoded_payload": {
"@context": "http://credreg.net/ctdl/schema/context/json?language=ru",
"@type": "ceterms:Credential",
"@id": "http://credentialengineregistry.org/resources/[CTID#123]",
"ceterms:ctid": "[CTID#123]",
"ceterms:requires": [
{
"ceterms:targetAssessment": [
"_:ABC", "_:DEF"
]
}
],
"meta:references": [
{
"@id": "_:ABC",
...//Other properties
},
{
"@id": "_:DEF",
...//Other properties
}
]
}
}
But then we'd be back to the problem of using language maps (or requiring different-language versions of the data to be published as separate documents/envelopes/CTIDs altogether). I would instead lean towards using either an unnamed graph, or naming the graph but not the Credential...actually, consider this:
{
"decoded_payload": {
"@context": "http://credreg.net/ctdl/schema/context/json?language=en",
"@id": "http://credentialengineregistry.org/resources/[CTID#123]",
"@graph": [
{
"@id": "http://credentialengineregistry.org/resources/[CTID#123]",
"@type": "ceterms:Credential"
},
{
"@context": {
"@language": "ru"
},
"@id": "http://credentialengineregistry.org/resources/[CTID#123]",
"@type": "ceterms:Credential"
},
{
"@context": {
"@language": "fr"
},
"@id": "http://credentialengineregistry.org/resources/[CTID#123]",
"@type": "ceterms:Credential"
}
]
}
}
We got hung up on making sure the CTIDs would be the same in the language issue (#514) and I think we forgot about the @ids therefore also being the same. Even if you didn't use a named graph here (no @id at the @graph level), you'd still be stuck with 1-n documents with the same @id.
Going back to my earlier post, retrieving a document from the registry by its CTID (and therefore its @id) would need to give you the entire @graph - which is perhaps instead an argument in favor of using a named graph and not assigning @ids to the top-level documents inside it? Or maybe you combine the two approaches?:
{
"decoded_payload": {
"@context": "http://credreg.net/ctdl/schema/context/json?language=en",
"@id": "http://credentialengineregistry.org/resources/[CTID#123]",
"meta:rootDocuments": [
{
"@id": "_:Main1",
"@type": "ceterms:Credential"
},
{
"@context": {
"@language": "ru"
},
"@id": "_:Main2",
"@type": "ceterms:Credential"
},
{
"@context": {
"@language": "fr"
},
"@id": "_:Main3",
"@type": "ceterms:Credential"
}
],
"meta:references": [
{
"@id": "_:ABC",
...//Other properties
},
{
"@id": "_:DEF",
...//Other properties
}
]
}
}
But maybe that's overengineering it a bit too much.
Dang, it seemed like we were so close to solving this.
Another approach (tell me if this sounds too crazy): Use index containers ( https://json-ld.org/spec/latest/json-ld/#data-indexing ) as a sort of "super" language map (blame the example in the spec itself for inspiring this one). For this example, assume the context includes "meta:text": { "@container": "@index" }:
{
"decoded_payload": {
"@context": "http://credreg.net/ctdl/schema/context/json",
"@id": "http://credentialengineregistry.org/resources/[CTID#123]",
"@type": "ceterms:Credential",
"meta:text": {
"en": {
"ceterms:name": "My Credential",
"ceterms:description": "Text describing this credential",
... // Other language-dependent properties
},
"ru": {
"ceterms:name": "Мои полномочия",
"ceterms:description": "Текст, описывающий эти учетные данные",
... // Other language-dependent properties
}
},
"ceterms:requires": [
{
"@type": "ceterms:ConditionProfile",
"meta:text": {
"en": {
"ceterms:name": "My conditions",
"ceterms:description": "Descriptions for earning the credential",
"ceterms:condition": [
"Would text in a list",
"also be logically assumed to be in the indicated language",
"in a context like this?",
"Or would each of these lines need its own {en} wrapper?"
]
},
"ru": {
"ceterms:name": "Мои условия",
"ceterms:description": "Описания для получения удостоверений",
"ceterms:condition": [
"Будет ли текст в списке",
"также логически предполагается, что они указаны на указанном языке",
"в таком контексте?",
"Или каждая из этих строк нуждается в собственной оболочке {ru}?"
]
}
},
"ceterms:yearsOfExperience": 9,
"ceterms:targetAssessment": [
"_:Assessment1",
"_:Assessment2"
]
}
],
"meta:references": [
{
"@type": "ceterms:AssessmentProfile",
"@id": "_:Assessment1",
"meta:text": {
"en": {
"ceterms:name": "Some assessment we don't own",
"ceterms:description": "Text describing it"
},
"ru": {
"ceterms:name": "Некоторая оценка, которой мы не владеем",
"ceterms:description": "Текст, описывающий это"
}
},
"ceterms:subjectWebpage": "http://..."
},
{
... // Properties for _:Assessment2
}
]
}
}
This gives you the benefits of a language map in terms of being able to avoid duplicating all of the non-language-dependent data, avoids the @graph problem altogether (when paired with a meta property defined as "@container": "@graph" ( https://json-ld.org/spec/latest/json-ld/#graph-containers ), which could also be used to hold competencies, concepts, etc. in the context of frameworks and schemes), and solves the problem of where @id should live and what it should retrieve. It also avoids the complexity of doing language maps for every single property, as the @type of each language-dependent property is still xsd:string, and the JSON itself is a lot easier to publish and read (in my opinion).
There's just one problem: according to the spec, @index is meant to be used with properties that are semantically ignored. If I'm interpreting that correctly, then even though the example in the spec itself uses it to create these sort of "super" language maps, this wouldn't be a valid use of @index if your goal is to semantically provide such "super" language maps.
So close, yet so far. Maybe handling that in the definition of meta:text would work?
I'm surprised there's no way to semantically do language maps this way, given how cantankerous the vanilla language maps are - I assume this would be something like { "meta:text": { "@type": "@language" } }, but anything resembling that designation doesn't seem to show up anywhere in the schema. This proposal appears to have come up in the discussion of the spec that led to language maps as they are, and was shot down in favor of the approach we already explored (using a @graph of documents where each has its own @language in the @context) - so I'm not sure where that leaves us.
As a sanity check, I did a language map version of the above example, and I guess it isn't all that different in terms of overall complexity (although the per-term usage of language maps still presents the difficulties I've outlined before):
{
"decoded_payload": {
"@context": "http://credreg.net/ctdl/schema/context/json",
"@id": "http://credentialengineregistry.org/resources/[CTID#123]",
"@type": "ceterms:Credential",
"ceterms:name": {
"en": "My Credential",
"ru": "Мои полномочия"
},
"ceterms:description": {
"en": "Text describing this credential",
"ru": "Текст, описывающий эти учетные данные"
},
// Other language-dependent properties
"ceterms:requires": [
{
"@type": "ceterms:ConditionProfile",
"ceterms:name": {
"en": "My conditions",
"ru": "Мои условия"
},
"ceterms:description": {
"en": "Descriptions for earning the credential",
"ru": "Описания для получения удостоверений"
},
"ceterms:condition": [
{
"en": "Would text in a list",
"ru": "Будет ли текст в списке"
},
{
"en": "also be logically assumed to be in the indicated language",
"ru": "также логически предполагается, что они указаны на указанном языке"
},
{
"en": "in a context like this?",
"ru": "в таком контексте?"
},
{
"en": "Or would each of these lines need its own {en} wrapper?",
"ru": "Или каждая из этих строк нуждается в собственной оболочке {ru}?"
}
],
"ceterms:yearsOfExperience": 9,
"ceterms:targetAssessment": [
"_:Assessment1",
"_:Assessment2"
]
}
],
"meta:references": [
{
"@type": "ceterms:AssessmentProfile",
"@id": "_:Assessment1",
"ceterms:name": {
"en": "Some assessment we don't own",
"ru": "Некоторая оценка, которой мы не владеем"
},
"ceterms:description": {
"en": "Text describing it",
"ru": "Текст, описывающий это"
},
"ceterms:subjectWebpage": "http://..."
},
{
... // Properties for _:Assessment2
}
]
}
}
So maybe the approach in my post above isn't all that helpful - I'll leave it there for the sake of documentation nonetheless.
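To make the comparison between the two shapes concrete, here is a minimal Python sketch that pivots the index-style "meta:text" blocks into per-property language maps. Note that "meta:text" is the hypothetical property from the example above and pivot_text_blocks is an illustrative name, not anything that exists in CTDL or a JSON-LD library. It is a sketch of the mechanical relationship between the two forms under those assumptions, nothing more.

```python
from __future__ import annotations

from typing import Any

TEXT_KEY = "meta:text"  # hypothetical property taken from the example above


def pivot_text_blocks(node: Any) -> Any:
    """Recursively replace each meta:text block with per-property language maps."""
    if isinstance(node, list):
        return [pivot_text_blocks(item) for item in node]
    if not isinstance(node, dict):
        return node

    # Copy everything except the text block, descending into nested objects.
    result = {
        key: pivot_text_blocks(value)
        for key, value in node.items()
        if key != TEXT_KEY
    }
    # {"en": {"name": "A"}, "ru": {"name": "B"}} -> {"name": {"en": "A", "ru": "B"}}
    for language, properties in node.get(TEXT_KEY, {}).items():
        for prop, text in properties.items():
            result.setdefault(prop, {})[language] = text
    return result


credential = {
    "@type": "ceterms:Credential",
    "meta:text": {
        "en": {"ceterms:name": "My Credential"},
        "ru": {"ceterms:name": "Мои полномочия"},
    },
    "ceterms:requires": [
        {
            "@type": "ceterms:ConditionProfile",
            "meta:text": {
                "en": {"ceterms:condition": ["one", "two"]},
                "ru": {"ceterms:condition": ["один", "два"]},
            },
            "ceterms:yearsOfExperience": 9,
        }
    ],
}

pivoted = pivot_text_blocks(credential)
```

One difference from the hand-written language map example: for list-valued properties this sketch produces a single map whose values are lists (one list per language), rather than a list of maps.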
@siuc-nate @stuartasutton @Lomilar @jkitchensSIUC Decision Request! There are many topics in this thread. The most pressing decision request relates to competencies. For my testing (via github), I had been using the current approach of separate schemas for the framework and the competency. I have not requested these to be updated in the registry sandbox, given the uncertainty of the final approach. The API will have a specific endpoint for publishing from CASS, via the CTDL publisher. If we are going to go with the @graph approach, I will need to:
So, for competencies only, what is the decision:
I came up with some fuller, more realistic examples. I was going to do some other ones (namely competency framework related ones) but I think these were the last nail in the coffin for the non-language-map approach. They revealed something that wasn't very obvious from the basic examples so far.
Specifically, while it is great if you only have one language:
Credential + 2 bnodes - one language
{
"envelope_id": ".../123",
"decoded_payload": {
"@context": "https://credreg.net/ctdl/schema/context/json?language=en",
"@id": "https://credentialengineregistry.org/resources/[CTID#123]",
"@graph": [
{
"@id": "https://credentialengineregistry.org/resources/[CTID#123]/en",
"@type": "ceterms:Certificate",
"ceterms:name": "My Credential",
"ceterms:description": "Description of this credential",
"ceterms:subjectWebpage": "http://credreg.net",
"ceterms:keyword": [
"keyword 1",
"keyword 2",
"keyword 3"
],
"ceterms:ownedBy": [
"https://credentialengineregistry.org/resources/[CTID#456]"
],
"ceterms:audienceLevelType": [
{
"@type": "ceterms:CredentialAlignmentObject",
"ceterms:targetNodeName": "Beginner",
"ceterms:targetUrl": "https://credreg.net/ctdl/vocabs/audLevel/BeginnerLevel"
},
{
"@type": "ceterms:CredentialAlignmentObject",
"ceterms:targetNodeName": "Bachelors Degree Level",
"ceterms:targetUrl": "https://credreg.net/ctdl/vocabs/audLevel/BachelorsDegreeLevel"
}
],
"ceterms:requires": [
{
"@type": "ceterms:ConditionProfile",
"ceterms:description": "This describes the conditions",
"ceterms:condition": [
"condition one",
"condition two"
],
"ceterms:yearsOfExperience": 5,
"ceterms:targetCompetency": [
{
"@type": "ceterms:CredentialAlignmentObject",
"ceterms:targetNodeDescription": "Text of the competency",
"ceterms:targetUrl": "https://credentialengineregistry.org/resources/[CTID#789]"
}
],
"ceterms:targetAssessment": [
"_:AssessmentABC"
]
}
],
"ceterms:isAdvancedStandingFor": [
{
"@type": "ceterms:ConditionProfile",
"ceterms:description": "This credential is advanced standing for the other credential",
"ceterms:targetCredential": [
"_:CredentialABC"
]
}
]
},
{
"@id": "_:AssessmentABC",
"@type": "ceterms:AssessmentProfile",
"ceterms:name": "Name of the assessment",
"ceterms:subjectWebpage": "http://somesite.org/abc"
},
{
"@id": "_:CredentialABC",
"@type": "ceterms:Certification",
"ceterms:name": "Name of the credential",
"ceterms:subjectWebpage": "http://someothersite.org/abc"
}
]
}
}
...It's overly verbose when more get involved, since a lot of the properties aren't language-dependent, resulting in more duplicate data than I thought there would be:
Credential + 2 bnodes - three languages
{
"envelope_id": ".../123",
"decoded_payload": {
"@context": "https://credreg.net/ctdl/schema/context/json",
"@id": "https://credentialengineregistry.org/resources/[CTID#123]",
"@graph": [
{
"@context": {
"@language": "en"
},
"@id": "https://credentialengineregistry.org/resources/[CTID#123]/en",
"@type": "ceterms:Certificate",
"ceterms:name": "My Credential",
"ceterms:description": "Description of this credential",
"ceterms:subjectWebpage": "http://credreg.net",
"ceterms:keyword": [
"keyword 1",
"keyword 2",
"keyword 3"
],
"ceterms:ownedBy": [
"https://credentialengineregistry.org/resources/[CTID#456]"
],
"ceterms:audienceLevelType": [
{
"@type": "ceterms:CredentialAlignmentObject",
"ceterms:targetNodeName": "Beginner",
"ceterms:targetUrl": "https://credreg.net/ctdl/vocabs/audLevel/BeginnerLevel"
},
{
"@type": "ceterms:CredentialAlignmentObject",
"ceterms:targetNodeName": "Bachelors Degree Level",
"ceterms:targetUrl": "https://credreg.net/ctdl/vocabs/audLevel/BachelorsDegreeLevel"
}
],
"ceterms:requires": [
{
"@type": "ceterms:ConditionProfile",
"ceterms:description": "This describes the conditions",
"ceterms:condition": [
"condition one",
"condition two"
],
"ceterms:yearsOfExperience": 5,
"ceterms:targetCompetency": [
{
"@type": "ceterms:CredentialAlignmentObject",
"ceterms:targetNodeDescription": "Text of the competency",
"ceterms:targetUrl": "https://credentialengineregistry.org/resources/[CTID#789]"
}
],
"ceterms:targetAssessment": [
"_:AssessmentABC"
]
}
],
"ceterms:isAdvancedStandingFor": [
{
"@type": "ceterms:ConditionProfile",
"ceterms:description": "This credential is advanced standing for the other credential",
"ceterms:targetCredential": [
"_:CredentialABC"
]
}
]
},
{
"@context": {
"@language": "ru"
},
"@id": "https://credentialengineregistry.org/resources/[CTID#123]/ru",
"@type": "ceterms:Certificate",
"ceterms:name": "Мои полномочия",
"ceterms:description": "Описание этих учетных данных",
"ceterms:subjectWebpage": "http://credreg.net",
"ceterms:keyword": [
"ключевое слово 1",
"ключевое слово 2",
"ключевое слово 3"
],
"ceterms:ownedBy": [
"https://credentialengineregistry.org/resources/[CTID#456]"
],
"ceterms:audienceLevelType": [
{
"@type": "ceterms:CredentialAlignmentObject",
"ceterms:targetNodeName": "начинающий",
"ceterms:targetUrl": "https://credreg.net/ctdl/vocabs/audLevel/BeginnerLevel"
},
{
"@type": "ceterms:CredentialAlignmentObject",
"ceterms:targetNodeName": "степень бакалавра",
"ceterms:targetUrl": "https://credreg.net/ctdl/vocabs/audLevel/BachelorsDegreeLevel"
}
],
"ceterms:requires": [
{
"@type": "ceterms:ConditionProfile",
"ceterms:description": "это описывает условия",
"ceterms:condition": [
"условие один",
"условие два"
],
"ceterms:yearsOfExperience": 5,
"ceterms:targetCompetency": [
{
"@type": "ceterms:CredentialAlignmentObject",
"ceterms:targetNodeDescription": "текст компетенции",
"ceterms:targetUrl": "https://credentialengineregistry.org/resources/[CTID#789]"
}
],
"ceterms:targetAssessment": [
"_:AssessmentABC"
]
}
],
"ceterms:isAdvancedStandingFor": [
{
"@type": "ceterms:ConditionProfile",
"ceterms:description": "Эти полномочия расширены для других учетных данных",
"ceterms:targetCredential": [
"_:CredentialABC"
]
}
]
},
{
"@context": {
"@language": "es"
},
"@id": "https://credentialengineregistry.org/resources/[CTID#123]/es",
"@type": "ceterms:Certificate",
"ceterms:name": "Mi credencial",
"ceterms:description": "Descripción de esta credencial",
"ceterms:subjectWebpage": "http://credreg.net",
"ceterms:keyword": [
"palabra clave 1",
"palabra clave 2",
"palabra clave 3"
],
"ceterms:ownedBy": [
"https://credentialengineregistry.org/resources/[CTID#456]"
],
"ceterms:audienceLevelType": [
{
"@type": "ceterms:CredentialAlignmentObject",
"ceterms:targetNodeName": "principiante",
"ceterms:targetUrl": "https://credreg.net/ctdl/vocabs/audLevel/BeginnerLevel"
},
{
"@type": "ceterms:CredentialAlignmentObject",
"ceterms:targetNodeName": "nivel de licenciatura",
"ceterms:targetUrl": "https://credreg.net/ctdl/vocabs/audLevel/BachelorsDegreeLevel"
}
],
"ceterms:requires": [
{
"@type": "ceterms:ConditionProfile",
"ceterms:description": "esto describe las condiciones",
"ceterms:condition": [
"condición uno",
"condición dos"
],
"ceterms:yearsOfExperience": 5,
"ceterms:targetCompetency": [
{
"@type": "ceterms:CredentialAlignmentObject",
"ceterms:targetNodeDescription": "texto de la competencia",
"ceterms:targetUrl": "https://credentialengineregistry.org/resources/[CTID#789]"
}
],
"ceterms:targetAssessment": [
"_:AssessmentABC"
]
}
],
"ceterms:isAdvancedStandingFor": [
{
"@type": "ceterms:ConditionProfile",
"ceterms:description": "esta credencial es avanzada para la otra credencial",
"ceterms:targetCredential": [
"_:CredentialABC"
]
}
]
},
{
"@context": {
"@language": "en"
},
"@id": "_:AssessmentABC",
"@type": "ceterms:AssessmentProfile",
"ceterms:name": "Name of the assessment",
"ceterms:subjectWebpage": "http://somesite.org/abc"
},
{
"@context": {
"@language": "en"
},
"@id": "_:CredentialABC",
"@type": "ceterms:Certification",
"ceterms:name": "Name of the credential",
"ceterms:subjectWebpage": "http://someothersite.org/abc"
},
{
"@context": {
"@language": "ru"
},
"@id": "_:AssessmentABC",
"@type": "ceterms:AssessmentProfile",
"ceterms:name": "название оценки",
"ceterms:subjectWebpage": "http://somesite.org/abc"
},
{
"@context": {
"@language": "ru"
},
"@id": "_:CredentialABC",
"@type": "ceterms:Certification",
"ceterms:name": "имя учетных данных",
"ceterms:subjectWebpage": "http://someothersite.org/abc"
},
{
"@context": {
"@language": "es"
},
"@id": "_:AssessmentABC",
"@type": "ceterms:AssessmentProfile",
"ceterms:name": "nombre de la evaluación",
"ceterms:subjectWebpage": "http://somesite.org/abc"
},
{
"@context": {
"@language": "es"
},
"@id": "_:CredentialABC",
"@type": "ceterms:Certification",
"ceterms:name": "nombre de la credencial",
"ceterms:subjectWebpage": "http://someothersite.org/abc"
}
]
}
}
Based on our discussions and examples (including this last one), the fact that we were planning to implement language maps anyway, the fact that CASS already uses them, their use in the JSON-LD spec, and our very pressing need to move forward, I'm afraid I'll have to give in on this one and hope we can work out a good way to explain language maps to developers/partners after the fact. I am glad we at least explored other options, as I would have been left wondering "what if" otherwise. Thanks for all the thought-provoking feedback/pushback from the rest of you, as well.
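For anyone who wants to check the duplication claim against their own data, here is a rough Python sketch that splits the top-level properties of the per-language variants into "identical in every language" and "varies by language". The function name and the trimmed-down sample records are illustrative assumptions; the real documents above have many more properties.

```python
from __future__ import annotations

import json
from typing import Any


def split_by_language_dependence(
    variants: list[dict[str, Any]],
) -> tuple[list[str], list[str]]:
    """Return (shared, varying) top-level property names across language variants."""
    ignored = {"@context", "@id"}  # these differ per variant by design
    shared: list[str] = []
    varying: list[str] = []
    for key in variants[0]:
        if key in ignored:
            continue
        # Serialize each value so nested lists and objects can go into a set.
        encodings = {json.dumps(v.get(key), sort_keys=True) for v in variants}
        (shared if len(encodings) == 1 else varying).append(key)
    return shared, varying


# Trimmed-down stand-ins for the per-language documents in the example above.
variants = [
    {
        "@context": {"@language": "en"},
        "@id": "https://credentialengineregistry.org/resources/[CTID#123]/en",
        "@type": "ceterms:Certificate",
        "ceterms:name": "My Credential",
        "ceterms:subjectWebpage": "http://credreg.net",
        "ceterms:ownedBy": ["https://credentialengineregistry.org/resources/[CTID#456]"],
    },
    {
        "@context": {"@language": "ru"},
        "@id": "https://credentialengineregistry.org/resources/[CTID#123]/ru",
        "@type": "ceterms:Certificate",
        "ceterms:name": "Мои полномочия",
        "ceterms:subjectWebpage": "http://credreg.net",
        "ceterms:ownedBy": ["https://credentialengineregistry.org/resources/[CTID#456]"],
    },
]

shared, varying = split_by_language_dependence(variants)
```

Everything that lands in shared is data that the one-document-per-language approach has to repeat for every language.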
Anyway, between that and all of us (I think) being on board with using @graph, it looks like the current solutions are:
@graph and @context in the root of the payload
@graph (#508)
@graph as their framework (#522)
@graph as their concept scheme (#522)
Which leaves, unless I'm missing something, just one problem:
Does a URI like https://credentialengineregistry.org/resources/ce-[UUID] belong at the @graph level, or in the (main) object inside the graph?
@stuartasutton had suggested putting that URI inside the main object and using https://credentialengineregistry.org/graph/ce-[UUID] for the graph, and that may well solve it, but is there ever a case where you wouldn't want the other contents of the graph (the bnodes, and/or competencies)?
Consider also that we may want to reserve an endpoint like /graph/ for a service that searches the registry in a graph-like fashion and/or retrieves (on demand, by crawling the links) the entire description set for a given CTID rather than just the stuff that was directly published with it in its @graph.
Given that, with the use of language maps, there would only be one main document in the @graph, maybe it's acceptable to just not give the main document an @id at all? Or to give it one that hangs off of the @graph's URI, e.g.:
https://credentialengineregistry.org/resources/ce-[UUID] for the @graph, and
https://credentialengineregistry.org/resources/ce-[UUID]/top (or /main or /core or whatever) for the primary document?
The graph may be called a "named graph" but it doesn't have to have an @id. If you want to name it, @stuartasutton's suggestion sounds good.
Coming at this from the outside I would expect, as a naive developer:
If I get a graph back with the object flattened into a @graph, that's probably okay. It's like asking for a bus and getting back a box of legos. Not a big deal, I can probably put it together or employ a JSON-LD processor to reassemble it. It's very similar to every other web service that returns a 'result object' with information that should be in the HTTP response header.
@graph is caching this data somewhere, which would be a 1-1 map of @id to {...}. So, everything in the @graph should have a unique ID with no duplicates. Again, I'm probably throwing away the @id of the @graph because it isn't important, just the data inside is important.
Also note:
@id points at, not the envelope or description set it came in. If you give me extra stuff, I won't know why right off the bat. I may be able to figure it out.
@ids change, because I'm storing those as records in a NoSQL database etc.
That's all the opinions I got.
Thanks, Nate. I know this has not been easy, but coming to this conclusion on your own is beneficial.
You state:
"Does a URI like https://credentialengineregistry.org/resources/ce-[UUID] belong at the @graph level, or in the (main) object inside the graph?"
Response: On the main object in the @graph.
"@stuartasutton had suggested putting that URI inside the main object and using https://credentialengineregistry.org/graph/ce-[UUID] for the graph"
Response: I am still of the opinion that if the @graph is to have a URI at all, it should be as I suggested. It's simple to explain and keeps our resource URIs consistent. So, we'd have:
https://credentialengineregistry.org/graph/ce-[UUID] (resolves to the content of the @graph, i.e., the description set).
https://credentialengineregistry.org/resource/ce-[UUID] (resolves to the content of the single object identified by the URI).
I'm not convinced that /graph/ might have a better use. So in the end, we'd have:
https://credentialengineregistry.org/graph/ce-6d62b61a-033c-417a-9d53-ad930857465b
https://credentialengineregistry.org/resource/ce-6d62b61a-033c-417a-9d53-ad930857465b
Simple to explain:
The /resource/ URI returns exactly what you are asking for with the URI. The /graph/ URI returns a description set of closely related entities (encompassed by the @graph).
Naming the @graph by URI is not mandatory, but it does buy you the ability to reference the full contents of the @graph as described above.
/graph/ returning the graph the object came in is good.
The /resource/ URI returns exactly what you are asking for with the URI.
It can't. It has to return a @graph (not necessarily the original graph) that contains the object requested along with any bnodes that are referenced by that object. :-/
Otherwise, agreed.
This seems like something that should be covered by the spec - unless you just mean that /resource/ should return the @graph and the URI of the main thing in the @graph should be something else (or blank)?
/resource/ cannot return just the object. It has to return a statement set (@graph) with at least the object and any bnodes within, because those bnodes aren't locatable.
So, /resource/ isn't returning exactly what you are asking for; it's returning a statement set with what you asked for inside.
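A minimal Python sketch of that idea, assuming bnode references appear as plain "_:" strings the way they do in the examples in this thread. The function names, the unrelated "_:XYZ" node, and the sample graph are all illustrative, not part of any registry API.

```python
from __future__ import annotations

from typing import Any


def referenced_bnodes(node: Any) -> set[str]:
    """Collect every string value that looks like a blank node reference."""
    if isinstance(node, str):
        return {node} if node.startswith("_:") else set()
    if isinstance(node, list):
        return set().union(*(referenced_bnodes(item) for item in node))
    if isinstance(node, dict):
        # Skip "@id" so a node's own identifier is not counted as a reference.
        return set().union(
            *(referenced_bnodes(v) for k, v in node.items() if k != "@id")
        )
    return set()


def statement_set_for(graph: list[dict], resource_id: str) -> list[dict]:
    """Return the requested object plus every bnode it reaches, transitively."""
    by_id = {node["@id"]: node for node in graph if "@id" in node}
    selected: list[dict] = []
    seen: set[str] = set()
    pending = [resource_id]
    while pending:
        current = pending.pop(0)
        if current in seen or current not in by_id:
            continue
        seen.add(current)
        selected.append(by_id[current])
        pending.extend(sorted(referenced_bnodes(by_id[current])))
    return selected


CREDENTIAL_ID = "https://credentialengineregistry.org/resources/ce-123"
graph = [
    {
        "@id": CREDENTIAL_ID,
        "@type": "ceterms:Credential",
        "ceterms:requires": [
            {
                "@type": "ceterms:ConditionProfile",
                "ceterms:targetAssessment": [
                    "https://credentialengineregistry.org/resources/ce-890",
                    "_:ABC",
                    "_:DEF",
                ],
            }
        ],
    },
    {"@id": "_:ABC", "@type": "ceterms:AssessmentProfile"},
    {"@id": "_:DEF", "@type": "ceterms:AssessmentProfile"},
    {"@id": "_:XYZ", "@type": "ceterms:AssessmentProfile"},  # not referenced
]

result = statement_set_for(graph, CREDENTIAL_ID)
```

The registry URL for ce-890 is left alone because it can be resolved on its own; only the bnodes travel with the object.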
While we're at it, I should probably bring up:
If I get a graph back with the object flattened into a @graph, that's probably okay. It's like asking for a bus and getting back a box of legos. Not a big deal, I can probably put it together or employ a JSON-LD processor to reassemble it. It's very similar to every other web service that returns a 'result object' with information that should be in the HTTP response header.
The way things are currently written, the only parts you'd need to assemble would be the bnodes that serve as references to top-level objects that don't exist in the registry. Everything else (the many ____Profiles) will be structurally a part of the main JSON document. If that's a problem, we need to solve it now. Otherwise, it should make the data easier to work with.
I'm talking about what you'd get with RDF from a quadstore. To get back everything within the @graph, you'd need to resolve the fourth member of the quad, since it identifies the full graph. See https://json-ld.org/spec/latest/json-ld/#named-graphs and check the accompanying data table. To get back everything within the bounds of the @graph you'd need to retrieve everything with a domain of _:graph (if it were a full URI and not a bnode). To retrieve everything describing Manu, you'd return those triples with a domain of http://manu.sporny.org/about#manu. To return all the information about Gregg, you'd retrieve those triples with a domain of http://greggkellogg.net/foaf#me. I think that comports with what I said.
Got it. Graph URL being a first order element in N-Quads and additional structure in JSON-LD is where I lost understanding.
What you said comports from the RDF side, less from the naive JSON/API consumer/developer side. I didn't know until yesterday what the difference between Triples and Quads was.
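For other developers coming from the JSON side, here is a toy Python sketch of the triples-versus-quads distinction. It is only an in-memory list, not a real quadstore; the Quad type, the helper names, and the sample statements (including the "_:b0" bnode) are assumptions made up for illustration.

```python
from __future__ import annotations

from typing import NamedTuple


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    graph: str  # the fourth member: the name of the graph the triple lives in


GRAPH = "https://credentialengineregistry.org/graph/ce-123"
CREDENTIAL = "https://credentialengineregistry.org/resources/ce-123"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

quads = [
    Quad(CREDENTIAL, RDF_TYPE, "ceterms:Credential", GRAPH),
    Quad(CREDENTIAL, "ceterms:requires", "_:b0", GRAPH),
    Quad("_:b0", "ceterms:targetAssessment", "_:ABC", GRAPH),
    Quad("_:ABC", RDF_TYPE, "ceterms:AssessmentProfile", GRAPH),
]


def in_graph(all_quads: list[Quad], graph_name: str) -> list[Quad]:
    """Everything published together: filter on the fourth member."""
    return [q for q in all_quads if q.graph == graph_name]


def about_subject(all_quads: list[Quad], subject: str) -> list[Quad]:
    """Only the statements made directly about one resource."""
    return [q for q in all_quads if q.subject == subject]


whole_graph = in_graph(quads, GRAPH)
credential_only = about_subject(quads, CREDENTIAL)
```

Filtering on the credential alone leaves out the statements about the bnodes, which is the gap being discussed for /resource/.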
To try to summarize where everyone is: Given this data:
{
"decoded_payload": {
"@context": "http://credreg.net/ctdl/schema/context/json",
"@id": "https://credentialengineregistry.org/graph/ce-123",
"@graph": [
{
"@type": "Credential",
"@id": "https://credentialengineregistry.org/resources/ce-123",
"ceterms:requires": [
{
"@type": "ceterms:ConditionProfile",
"ceterms:targetAssessment": [
"https://credentialengineregistry.org/resources/ce-890",
"_:ABC",
"_:DEF"
]
}
]
},
{
"@id": "_:ABC",
"@type": "ceterms:AssessmentProfile"
},
{
"@id": "_:DEF",
"@type": "ceterms:AssessmentProfile"
}
]
}
}
Fill in the blanks:
Resolving https://credentialengineregistry.org/graph/ce-123 returns:
Resolving https://credentialengineregistry.org/resources/ce-123 returns:
The canonical @id of the credential therefore is:
Nate, I am in transit to Toronto all day. Will try and get to this this evening or early tomorrow morning. If I have WiFi and time at SFO airport this am, I’ll respond then.
On Apr 3, 2018, at 2:51 PM, siuc-nate notifications@github.com wrote:
To try to summarize where everyone is: Fill in the blanks:
Resolving https://credentialengineregistry.org/graph/ce-123 returns:
Resolving https://credentialengineregistry.org/resources/ce-123 returns:
The canonical @id of the credential therefore is:
Resolving https://credentialengineregistry.org/graph/ce-123 returns:
The above, as is.
Resolving https://credentialengineregistry.org/resources/ce-123 returns:
{
"@context": "http://credreg.net/ctdl/schema/context/json",
"@id": "https://credentialengineregistry.org/resources/ce-123",
"@type": "ceterms:Credential",
"ceterms:requires": {
"@id": "_:b0",
"@type": "ceterms:ConditionProfile",
"ceterms:targetAssessment": [
"https://credentialengineregistry.org/resources/ce-890",
{
"@id": "_:b1",
"@type": "ceterms:AssessmentProfile"
},
{
"@id": "_:b2",
"@type": "ceterms:AssessmentProfile"
}
]
}
}
This was accomplished by first changing the context so it included targetAssessment @type:@id, then framing the @graph, fetching the resource out of the frame with @id:".../resources/ce-123" and compacting it.
Framing and Compaction are very typical JSON-LD processor transforms available in all sorts of libraries. They are powerful but finicky (like most RDF transforms and processors). Based on previous conversations, the context should probably be updated so that @id is transformed to id and @type is transformed to type. I'd prefer if all the objects in here had URLs for URIs, but that's probably not a big deal.
With just context changes and the application of the JSON-LD processor, the above can become:
{
"@context": "http://credreg.net/ctdl/schema/context/json",
"id": "https://credentialengineregistry.org/resources/ce-123",
"type": "ceterms:Credential",
"ceterms:requires": {
"id": "_:b0",
"type": "ceterms:ConditionProfile",
"ceterms:targetAssessment": [
"https://credentialengineregistry.org/resources/ce-890",
{
"id": "_:b1",
"type": "ceterms:AssessmentProfile"
},
{
"id": "_:b2",
"type": "ceterms:AssessmentProfile"
}
]
}
}
And if the default @vocab is set in the context to ceterms, it could become:
{
"@context": "http://credreg.net/ctdl/schema/context/json",
"id": "https://credentialengineregistry.org/resources/ce-123",
"type": "Credential",
"requires": {
"id": "_:b0",
"type": "ConditionProfile",
"targetAssessment": [
"https://credentialengineregistry.org/resources/ce-890",
{
"id": "_:b1",
"type": "AssessmentProfile"
},
{
"id": "_:b2",
"type": "AssessmentProfile"
}
]
}
}
which is as close to plain JSON as I can make it.
I would personally go through and remove all the _:b* ids so that they don't confuse people, another step toward making it easy to understand. It may not be necessary though.
P.S. This kind of stuff is how CASS translates from one schema to another.
{
"@context": "http://credreg.net/ctdl/schema/context/json",
"id": "https://credentialengineregistry.org/resources/ce-123",
"type": "Credential",
"requires": {
"type": "ConditionProfile",
"targetAssessment": [
"https://credentialengineregistry.org/resources/ce-890",
{
"type": "AssessmentProfile"
},
{
"type": "AssessmentProfile"
}
]
}
}
The canonical @id of the credential therefore is:
https://credentialengineregistry.org/resources/ce-123
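For readers without a JSON-LD processor handy, the embed-and-rename steps described above can be approximated by hand. The Python sketch below is a naive stand-in, not real JSON-LD framing or compaction: it assumes bnode references are plain "_:" strings, that bnodes do not reference each other in a cycle, and that the context aliases @id to id and @type to type. All function names are illustrative.

```python
from __future__ import annotations

from typing import Any

ALIASES = {"@id": "id", "@type": "type"}  # assumed context aliases


def embed(node: Any, by_id: dict[str, dict], keep_bnode_ids: bool = True) -> Any:
    """Inline bnode references and rename @id/@type. Assumes no bnode cycles."""
    if isinstance(node, str):
        # Inline bnode references; leave registry URLs as plain strings.
        if node.startswith("_:") and node in by_id:
            return embed(by_id[node], by_id, keep_bnode_ids)
        return node
    if isinstance(node, list):
        return [embed(item, by_id, keep_bnode_ids) for item in node]
    if isinstance(node, dict):
        out = {}
        for key, value in node.items():
            if key == "@id":
                # Optionally drop bnode ids, since they carry no meaning.
                if value.startswith("_:") and not keep_bnode_ids:
                    continue
                out[ALIASES[key]] = value
            elif key == "@type":
                out[ALIASES[key]] = value
            else:
                out[key] = embed(value, by_id, keep_bnode_ids)
        return out
    return node


def resolve_resource(payload: dict, resource_id: str, keep_bnode_ids: bool = True) -> dict:
    """Build a single nested document for one resource out of a @graph payload."""
    by_id = {n["@id"]: n for n in payload["@graph"] if "@id" in n}
    result = {"@context": payload["@context"]}
    result.update(embed(by_id[resource_id], by_id, keep_bnode_ids))
    return result


CREDENTIAL_ID = "https://credentialengineregistry.org/resources/ce-123"
payload = {
    "@context": "http://credreg.net/ctdl/schema/context/json",
    "@id": "https://credentialengineregistry.org/graph/ce-123",
    "@graph": [
        {
            "@type": "Credential",
            "@id": CREDENTIAL_ID,
            "ceterms:requires": [
                {
                    "@type": "ceterms:ConditionProfile",
                    "ceterms:targetAssessment": [
                        "https://credentialengineregistry.org/resources/ce-890",
                        "_:ABC",
                        "_:DEF",
                    ],
                }
            ],
        },
        {"@id": "_:ABC", "@type": "ceterms:AssessmentProfile"},
        {"@id": "_:DEF", "@type": "ceterms:AssessmentProfile"},
    ],
}

with_ids = resolve_resource(payload, CREDENTIAL_ID)
without_ids = resolve_resource(payload, CREDENTIAL_ID, keep_bnode_ids=False)
```

A real processor would also relabel the bnodes and apply the context properly; this only shows the shape change.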
Interesting, but it breaks 2 things:
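using System.Collections.Generic;

public class Credential
{
public string Type { get; set; }
public string Id { get; set; }
public List<ConditionProfile> Requires { get; set; }
}
public class ConditionProfile
{
// A list of URIs; this is the property the mixed list below cannot populate
public List<string> TargetAssessment { get; set; }
}
public class BlankNode
{
public string Type { get; set; }
public string Id { get; set; }
}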
I can compensate for differences in property names via standard JSON libraries, but what I can't do is take data like:
[ "https://credentialengineregistry.org/resources/ce-890", { "id": "_:b1", "type": "ceterms:AssessmentProfile" }, { "id": "_:b2", "type": "ceterms:AssessmentProfile" } ]
and deserialize it into a `List<string>`, since the last two objects are not `string`s. I would instead have to make the `TargetAssessment` property into a `List<dynamic>` or something along those lines, and loop through/parse every single one to figure out what to do with it - which in turn means I also have to maintain a second copy of my Credential class with a normalized definition of `TargetAssessment` that I can map everything to.
In other words, I have to deserialize to a middle ground class, then inspect my way through it (recursively, in a real-world scenario), mapping to and populating a "real" class hierarchy that I can then work with elsewhere. This overhead applies to every single property in CTDL that points to a top-level object that may or may not be published to the registry. It's also then doubled, since you have to accommodate going back the other way for publishing.
You can start to see why I would _really_ like to avoid this headache (and why I wouldn't want to put other developers through it). It also makes for a more brittle data structure that can't handle schema changes as easily. I had a similar justification for language maps, but we're stuck with those.
Anyway, in the C# structure above, there would be some other class or property to accommodate the `@graph` (most likely just a class that extends `List<dynamic>`).
There are JSON libraries that make working with this kind of problem a little easier, but ultimately you still have to deal with different data types in the same list and that just isn't something a strictly-typed language is going to do easily (I don't even like doing it in javascript, to be honest, since you have to check every value's type before you can use it - so I may be a bit biased).
Having said all of that, maybe the answer in this case would be for me as the developer to just take any instance of a `/resources/` URI I come across and switch it to a `/graph/` URI, retrieve that, and parse that instead. Hm..
Preprocessing takes care of that most of the time. Custom deserialization should deal with that complexity: accept either a string or a JSON object, and either store the string in a common ancestor class and lazily fetch and deserialize the resource on access, or fetch the resource up front and deserialize that.
But this is an incredibly common problem among strongly typed languages, one that C# and Java wish they could be the leaders of so that everything could be strongly typed, but it's JSON... JavaScript Object Notation, so typeless is the name of the game. SOAP tried and failed, so we're now left with our square peg and round hole.
I tend to sidestep this in strongly typed languages by encouraging the use of libraries that already solve this problem... if they exist.
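As a rough illustration of the preprocessing idea, here is a Python sketch that normalizes a mixed list of URI strings and embedded objects into one uniform type before the rest of the code touches it. The Reference class and normalize_reference are hypothetical names for this example, and the id/type keys assume the aliased form shown earlier; a C# version would do the equivalent inside a custom converter.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Reference:
    """Uniform stand-in for 'a URI string or an embedded object'."""
    id: Optional[str]
    type: Optional[str] = None
    embedded: bool = False  # True when the source value was an inline object


def normalize_reference(value: Union[str, dict]) -> Reference:
    """Map either representation onto the same Reference shape."""
    if isinstance(value, str):
        return Reference(id=value)
    if isinstance(value, dict):
        return Reference(id=value.get("id"), type=value.get("type"), embedded=True)
    raise TypeError(f"Unsupported reference value: {value!r}")


target_assessment = [
    "https://credentialengineregistry.org/resources/ce-890",
    {"id": "_:b1", "type": "ceterms:AssessmentProfile"},
    {"id": "_:b2", "type": "ceterms:AssessmentProfile"},
]

references = [normalize_reference(item) for item in target_assessment]
```

This keeps the type checks in one place, but it is still the extra mapping layer described above, so it moves the overhead rather than removing it.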
Figure I may as well throw my two cents in on the original question while I'm at it:
Resolving https://credentialengineregistry.org/graph/ce-123 returns:
Everything inside the decoded_payload, verbatim. You asked for the graph by name, and that is what you get.
Resolving https://credentialengineregistry.org/resources/ce-123 returns:
Everything inside the decoded_payload, almost verbatim. In my opinion, the bnodes are part of the credential's data, since the credential's data is incomplete without them (the credential's data also references bnode IDs, which means you have a problem if you don't include the bnodes). Therefore you need a @graph wrapper around it, even if you think of it as a @graph that was generated on-demand and just coincidentally happens to be identical to the named @graph. The @context applied to everything in the original data, so it is correct to apply it to everything in the generated @graph, too. Therefore:
{
"@context": "http://credreg.net/ctdl/schema/context/json",
"@graph": [
{
"@type": "Credential",
"@id": "https://credentialengineregistry.org/resources/ce-123",
"ceterms:requires": [
{
"@type": "ceterms:ConditionProfile",
"ceterms:targetAssessment": [
"https://credentialengineregistry.org/resources/ce-890",
"_:ABC",
"_:DEF"
]
}
]
},
{
"@id": "_:ABC",
"@type": "ceterms:AssessmentProfile"
},
{
"@id": "_:DEF",
"@type": "ceterms:AssessmentProfile"
}
]
}
The only real difference is that the @id for the @graph was removed, because in this case, technically (or rather, semantically), the @graph was anonymously generated as a wrapper to handle the bnodes (it would still be generated even if there were no bnodes in order to ensure consistency in returned data). Whether this is actually what happens at the code level, or if the @id is just stripped out, is probably irrelevant.
Since the data is identical either way, it may be technically correct to include the @id for the @graph; I don't know. I don't care enough to argue about whether or not that should be included; it isn't worth holding up the rest of our implementation. As long as I get back one consistent format that doesn't require a bunch of edge case handling, I'm happy.
The canonical @id of the credential therefore is:
https://credentialengineregistry.org/resources/ce-123
At the end of the day I just kind of see them as two separate URIs for the same data. You could give me back the same document as a result of either URI and I'd be fine with it, but that's just my opinion.
At the end of the day I just kind of see them as two separate URIs for the same data. You could give me back the same document as a result of either URI and I'd be fine with it, but that's just my opinion.
Yup. Take what I gave and do a JSON-LD Flatten on it, and you get a graph back out. It should be the same graph (with maybe different bnode IDs and a missing @id). That makes sense considering /resource/ and /graph/ are different web service invocations. The result of /graph/'s @id should invoke the /graph/ service and the result of the /resource/'s @id should invoke /resource/.
It's just a different shape for the same data (to the RDF aware).
@Lomilar Agreed, the complexity can be dealt with and worked around, but rather than require everyone who publishes or consumes to implement complexity handling, I think it's better to just keep the data simple to begin with. Then nobody has to translate our schema into something they can understand; they can just work with it out of the box.
My sense is that if our data is so inconsistent or confusing that it always (or nearly always) needs to be heavily preprocessed, dramatically transformed, and/or run through a lengthy decision tree before it can be understood by anyone, then we have done something wrong.
Just my opinion, though.
The war of inconsistent flexibility vs consistent complexity is a tough one. I think we've found the fence that separates us.
Either way, high degrees of adoption require code libraries anyway to transfer whatever is done into the native paradigm of the language, so it may matter a little bit less anyway.
I prefer consistent simplicity (with flexibility as more of an extension rather than a foundation), myself, but that is tough to come by sometimes - anyway, I digress. I look forward to Stuart's take on the question.
Other than that, I think we're all on the same page as far as the rest of it goes, so does anyone see a reason why we shouldn't move forward with our various implementations based on:
@graph at the root of the decoded_payload
@graph will contain the root object and related objects, published together
Guys, these skeleton records don't cut it for me. See what you get with these nodeID blank nodes when you add more data than just the @id and @type--like add name etc. Don't stop with not getting any errors as jsonld! Run it through your tests with jsonld playground AND translate them into turtle and rdf/xml with something like the Good Relations translator (http://rdf-translator.appspot.com/) or easy rdf (http://www.easyrdf.org/converter) AND the W3C RDF validator with any resulting rdf/xml (if you get so far as translating json-ld to rdf/xml).
What happens to that additional bnode data when you look at it in jsonld playground?
I modified the context, so the above examples don't work as is.
The below validates using the easyrdf converter and the W3C RDF validator. The bnode ids are replaced when looking at the data in triples, which verifies my concern that bnode ids aren't respected (and are regenerated at will).
{
"@context": {"actionStat":"http://purl.org/ctdl/vocabs/actionStat/","agentSector":"http://purl.org/ctdl/vocabs/agentSector/","asn":"http://purl.org/ASN/schema/core/","assessMethod":"http://purl.org/ctdl/vocabs/assessMethod/","assessUse":"http://purl.org/ctdl/vocabs/assessUse/","audience":"http://purl.org/ctdl/vocabs/audience/","audLevel":"http://purl.org/ctdl/vocabs/audLevel/","ceterms":"http://purl.org/ctdl/terms/","@vocab":"http://purl.org/ctdl/terms/","claimType":"http://purl.org/ctdl/vocabs/claimType/","costType":"http://purl.org/ctdl/vocabs/costType/","credentialStat":"http://purl.org/ctdl/vocabs/credentialStat/","creditUnit":"http://purl.org/ctdl/vocabs/creditUnit/","dc":"http://purl.org/dc/elements/1.1/","dct":"http://purl.org/dc/terms/","deliveryType":"http://purl.org/ctdl/vocabs/deliveryType/","foaf":"http://xmlns.com/foaf/0.1/","inputType":"http://purl.org/ctdl/vocabs/inputType/","learnMethod":"http://purl.org/ctdl/vocabs/learnMethod/","lrmi":"http://purl.org/dcx/lrmi-terms/","meta":"http://credreg.net/meta/terms/","obi":"https://w3id.org/openbadges#","orgType":"http://purl.org/ctdl/vocabs/orgType/","owl":"http://www.w3.org/2002/07/owl#","purpose":"http://purl.org/ctld/vocabs/purpose/","rdf":"http://www.w3.org/1999/02/22-rdf-syntax-ns#","rdfs":"http://www.w3.org/2000/01/rdf-schema#","residency":"http://purl.org/ctdl/vocabs/residency/","schema":"http://schema.org/","score":"http://purl.org/ctdl/vocabs/score/","serviceType":"http://purl.org/ctdl/vocabs/serviceType/","skos":"http://www.w3.org/2004/02/skos/core#","vann":"http://purl.org/vocab/vann/","vs":"https://www.w3.org/2003/06/sw-vocab-status/ns","xsd":"http://www.w3.org/2001/XMLSchema#","ceterms:addressCountry":{"@container":"@language"},"ceterms:addressLocality":{"@container":"@language"},"ceterms:addressRegion":{"@container":"@language"},"ceterms:agentPurpose":{"@type":"@id"},"ceterms:agentPurposeDescription":{"@container":"@language"},"ceterms:alternateName":{"@container":"@language"},
"ceterms:assessmentExample":{"@type":"@id"},"ceterms:assessmentExampleDescription":{"@container":"@language"},"ceterms:assessmentOutput":{"@container":"@language"},"ceterms:availabilityListing":{"@type":"@id"},"ceterms:availableOnlineAt":{"@type":"@id"},"ceterms:commonConditions":{"@type":"@id"},"ceterms:commonCosts":{"@type":"@id"},"ceterms:condition":{"@container":"@language"},"ceterms:contactOption":{"@container":"@language"},"ceterms:contactType":{"@container":"@language"},"ceterms:costDetails":{"@type":"@id"},"ceterms:creditHourType":{"@container":"@language"},"ceterms:creditUnitTypeDescription":{"@container":"@language"},"ceterms:deliveryTypeDescription":{"@container":"@language"},"ceterms:demographicInformation":{"@container":"@language"},"ceterms:description":{"@container":"@language"},"ceterms:evidenceOfAction":{"@type":"@id"},"ceterms:experience":{"@container":"@language"},"ceterms:externalResearch":{"@type":"@id"},"ceterms:familyName":{"@container":"@language"},"ceterms:framework":{"@type":"@id"},"ceterms:frameworkName":{"@container":"@language"},"ceterms:geoURI":{"@type":"@id"},"ceterms:givenName":{"@container":"@language"},"ceterms:hasConditionManifest":{"@type":"@id"},"ceterms:hasCostManifest":{"@type":"@id"},"ceterms:honorificSuffix":{"@container":"@language"},"ceterms:identifierType":{"@container":"@language"},"ceterms:image":{"@type":"@id"},"ceterms:isSimilarTo":{"@type":"@id"},"ceterms:keyword":{"@container":"@language"},"ceterms:missionAndGoalsStatement":{"@type":"@id"},"ceterms:missionAndGoalsStatementDescription":{"@container":"@language"},"ceterms:name":{"@container":"@language"},"ceterms:paymentPattern":{"@container":"@language"},"ceterms:processFrequency":{"@container":"@language"},"ceterms:processMethod":{"@type":"@id"},"ceterms:processMethodDescription":{"@container":"@language"},"ceterms:processStandards":{"@type":"@id"},"ceterms:processStandardsDescription":{"@container":"@language"},"ceterms:revocationCriteria":{"@type":"@id"},
"ceterms:revocationCriteriaDescription":{"@container":"@language"},"ceterms:sameAs":{"@type":"@id"},"ceterms:scoringMethodDescription":{"@container":"@language"},"ceterms:scoringMethodExample":{"@type":"@id"},"ceterms:scoringMethodExampleDescription":{"@container":"@language"},"ceterms:socialMedia":{"@type":"@id"},"ceterms:source":{"@type":"@id"},"ceterms:streetAddress":{"@container":"@language"},"ceterms:subjectWebpage":{"@type":"@id"},"ceterms:submissionOf":{"@container":"@language"},"ceterms:targetNode":{"@type":"@id"},"ceterms:targetNodeDescription":{"@container":"@language"},"ceterms:targetNodeName":{"@container":"@language"},"ceterms:taskDetails":{"@type":"@id"},"ceterms:url":{"@type":"@id"},"ceterms:verificationDirectory":{"@type":"@id"},"ceterms:verificationMethodDescription":{"@container":"@language"},"ceterms:verificationService":{"@type":"@id"},"meta:domainFor":{"@type":"@id"},"meta:hasConcept":{"@type":"@id"},"meta:moreInformation":{"@type":"@id"},"meta:objectText":{"@container":"@language"},"meta:supersededBy":{"@type":"@id"},"meta:targetScheme":{"@type":"@id"},"rdfs:subclassOf":{"@type":"@id"},"owl:equivalentProperty":{"@type":"@id"},"owl:equivalentClass":{"@type":"@id"},"schema:domainIncludes":{"@type":"@id"},"schema:rangeIncludes":{"@type":"@id"},"owl:inverseOf":{"@type":"@id"},"skos:broader":{"@type":"@id"},"skos:narrower":{"@type":"@id"},"skos:inScheme":{"@type":"@id"},"vs:term_status":{"@type":"@id"},"skos:changeNote":{"@type":"@id"},"rdfs:label":{"@container":"@language"},"rdfs:comment":{"@container":"@language"},"dct:description":{"@container":"@language"},"vann:usageNote":{"@container":"@language"},"skos:prefLabel":{"@container":"@language"},"skos:definition":{"@container":"@language"},"id":{"@id":"@id"},"type":{"@id":"@type"},"Credential":{"@id":"ceterms:Credential"},"ConditionProfile":{"@id":"ceterms:Credential"},"targetAssessment":{"@id":"ceterms:targetAssessment","@type":"@id"}},
"id": "https://credentialengineregistry.org/resources/ce-123",
"type": "Credential",
"name":"The credential",
"requires": {
"id":"_:bnode0",
"type": "ConditionProfile",
"targetAssessment": [
"https://credentialengineregistry.org/resources/ce-890",
{
"id":"_:bnode1",
"type": "AssessmentProfile",
"name":"The first assessment profile"
},
{
"id":"_:bnode2",
"type": "AssessmentProfile",
"name":"The second assessment profile"
}
],
"name":"The condition profile"
}
}
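Since bnode ids get regenerated by whatever tool touches the data, comparing two copies of a record by their bnode labels will report differences that aren't real. A minimal Python sketch of one way around that: rename blank node labels in order of first appearance before comparing. This is a naive illustration, not RDF dataset canonicalization. It assumes both copies list their properties and array items in the same order, and that no literal text happens to start with `_:`. The `relabel_bnodes` helper and the `_:c0` labels are made up for the example; it uses the same `id` alias as the record above.

```python
def relabel_bnodes(value, mapping=None):
    """Rename blank node labels to _:c0, _:c1, ... in order of first appearance."""
    if mapping is None:
        mapping = {}
    if isinstance(value, dict):
        return {k: relabel_bnodes(v, mapping) for k, v in value.items()}
    if isinstance(value, list):
        return [relabel_bnodes(v, mapping) for v in value]
    if isinstance(value, str) and value.startswith("_:"):
        # Reuse the label already assigned to this bnode, or assign the next one.
        return mapping.setdefault(value, f"_:c{len(mapping)}")
    return value


published = {
    "id": "https://credentialengineregistry.org/resources/ce-123",
    "requires": {
        "id": "_:bnode0",
        "targetAssessment": [{"id": "_:bnode1"}, {"id": "_:bnode2"}],
    },
}

# Same record after a round trip through a tool that regenerated the labels.
round_tripped = {
    "id": "https://credentialengineregistry.org/resources/ce-123",
    "requires": {
        "id": "_:genid7",
        "targetAssessment": [{"id": "_:genid3"}, {"id": "_:genid9"}],
    },
}

same_data = relabel_bnodes(published) == relabel_bnodes(round_tripped)
```

Here `published != round_tripped` as raw JSON, but `same_data` is true once the labels are normalized.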
I've been discussing this implementation with @cwd-mparsons and we have a question:
Is there any reason not to include the ceterms:ctid at the @graph level? This should:
Discussion of #508 led to uncovering deeper issues with our data design as it relates to JSON-LD, the Registry, CASS, signatures, etc. I will attempt to document this as clearly as possible. We need to align all of our systems to be able to handle the following:
Situation
CTDL
CTDL-ASN
/resources/[CTID]/[UUID]?
Concept Schemes (CTDL-SKOS?)
Multiple Languages
@language in the @context, having one language per JSON document (this will be the majority of cases) and requiring additional documents to be published for each additional language that describes a given thing
JSON Validation
Credential Registry
/resources/[CTID]
/vocabs/conceptName/concept
CASS
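For the one-language-per-document option listed under Multiple Languages, a rough Python sketch of what publishing would look like: one JSON document per language, each declaring its default language with @language in the @context. The `documents_per_language` helper, the Spanish name, and the minimal @context are assumptions for illustration only; a real payload would use the full CTDL context.

```python
def documents_per_language(node_id, node_type, names):
    """Build one JSON-LD document per language, each with its own @language default."""
    return [
        {
            "@context": {
                "@vocab": "http://purl.org/ctdl/terms/",
                "@language": lang,  # default language for plain strings in this document
            },
            "@id": node_id,
            "@type": node_type,
            "name": text,
        }
        for lang, text in names.items()
    ]


docs = documents_per_language(
    "https://credentialengineregistry.org/resources/ce-123",
    "Credential",
    {"en": "The credential", "es": "La credencial"},  # hypothetical translations
)
```

Each additional language adds one more document describing the same @id, which is the publishing cost this option trades for keeping plain string values.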
Problems and Proposals
Currently, the Registry structure:
@graph array where one node is the current payload, and subsequent nodes are the blank nodes
@graphs
@graph as long as they are somewhere in the JSON document? Is that valid JSON-LD?
/resources/[CTID for framework]/[UUID for competency] approach noted above
/vocabs/[Concept Scheme URI]/[Concept URI], e.g. /vocabs/costType/TechnologyFee
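The URL shapes floated in the list above can be sketched as two small Python helpers. These are only the proposed patterns, not an existing Registry API. The function names are hypothetical, the `uuid-456` value is a placeholder, and serving /vocabs/ from the same host as /resources/ is an assumption.

```python
BASE = "https://credentialengineregistry.org"  # host taken from the /resources/ URIs in this thread


def resource_url(ctid, child_uuid=None):
    """/resources/[CTID], or /resources/[CTID]/[UUID] for a child such as a competency."""
    parts = [BASE, "resources", ctid]
    if child_uuid is not None:
        parts.append(child_uuid)
    return "/".join(parts)


def concept_url(scheme, concept):
    """/vocabs/[Concept Scheme]/[Concept], e.g. /vocabs/costType/TechnologyFee."""
    return "/".join([BASE, "vocabs", scheme, concept])


credential = resource_url("ce-123")
competency = resource_url("ce-123", "uuid-456")  # placeholder identifiers
fee_concept = concept_url("costType", "TechnologyFee")
```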
Currently, CASS:
So, we have a complex and interwoven web of issues where solutions to one will influence (if not outright determine/block) solutions to others. I am not sure of the best way to handle this short of proposing and walking through entire solution stack proposals - but maybe that would be worth doing?
I think this can all be handled with one model or set of rules for modeling data - but we all must be on the same page about that solution and how it impacts (or is impacted by) all of our more localized use cases/issues/etc.
Flagging down @stuartasutton @science @lomilar @cwd-mparsons to get their thoughts (though I have discussed this with Mike some internally).