CredentialEngine / Schema-Development

Development of the vocabularies for the CTI models

Data Design Issue #521

Closed: siuc-nate closed 5 years ago

siuc-nate commented 6 years ago

Discussion of #508 led to uncovering deeper issues with our data design as it relates to JSON-LD, the Registry, CASS, signatures, etc. I will attempt to document this as clearly as possible. We need to align all of our systems to be able to handle the following:

Situation

CTDL

CTDL-ASN

Concept Schemes (CTDL-SKOS?)

Multiple Languages

JSON Validation

Credential Registry

CASS

Problems and Proposals

Currently, the Registry structure:

Currently, CASS:

So, we have a complex and interwoven web of issues where solutions to one will influence (if not outright determine/block) solutions to others. I am not sure of the best way to handle this short of writing up and walking through entire solution-stack proposals - but maybe that would be worth doing?

I think this can all be handled with one model or set of rules for modeling data - but we all must be on the same page about that solution and how it impacts (or is impacted by) all of our more localized use cases/issues/etc.

Flagging down @stuartasutton @science @lomilar @cwd-mparsons to get their thoughts (though I have discussed some of this with Mike internally).

stuartasutton commented 6 years ago

Must we, really? If it is needed, can't we simply assign a CTID-based URI and the local system software grab it from the URI if it needs it for local machinations?

On Thu, Apr 5, 2018 at 4:15 PM, siuc-nate notifications@github.com wrote:

I've been discussing this implementation with @cwd-mparsons https://github.com/cwd-mparsons and we have a question:

Is there any reason not to include the ceterms:ctid at the @graph level? This should:

  • Make it possible to use existing Registry software to retrieve records by CTID (since it will be at the root level of the payload)
  • Hopefully help work around the URI issue we're wrestling with

-- Stuart A. Sutton, Metadata Consultant Associate Professor Emeritus, University of Washington Information School Email: stuartasutton@gmail.com Skype: sasutton

siuc-nate commented 6 years ago

Correct me if I'm wrong, @cwd-mparsons , but I think the need for that hinges on whether or not the Registry depends on the CTID existing in the root level of the payload.

siuc-nate commented 6 years ago

@cwd-mparsons @stuartasutton @Lomilar I have updated the credreg.net site to:

Please verify that these context files now contain everything they should: http://credreg.net/ctdl/schema/context/json http://credreg.net/ctdlasn/schema/context/json

Note: The current code that generates the JSON Schema Validation documents has not been updated to reflect the use of @graph, because:

cwd-mparsons commented 6 years ago

@stuartasutton @siuc-nate @Lomilar I have tested:

  1. Publishing just the graph - no external Ctid, and no type
  2. Publishing with a Ctid at same level as the graph, and a type

For option 1, the document can only be retrieved by envelope, and not by CTID: https://sandbox.credentialengineregistry.org/envelopes/b86629e9-87e5-4db0-ac43-38a85d321b79 It also will not be found by a search by competency_framework: https://sandbox.credentialengineregistry.org/ce-registry/search?resource_type=competency_framework

For option 2, the document can be retrieved by ctid, and can be found in the search. https://sandbox.credentialengineregistry.org/resources/ce-e15c3347-e7cd-367d-9408-0ae9a595e4fb

I encountered a strange error: if the Ctid at the graph level is different from the one for the competency framework, I get "Not enough or too many segments". This seems a very strange error, but it could be related to something in the registry. I have been testing publishing with validation turned off, so the error should not be schema related.

We have asked the registry team to investigate the implications of the Ctid only being inside the graph.

I think that for the specific case of publishing to the registry, we should include a CTID (the same as that for the competency framework) at the same level as the graph, along with the type.

siuc-nate commented 6 years ago

@stuartasutton @Lomilar @cwd-mparsons Where are we at on this? @stuartasutton did you get a chance to look at the context file changes in my above comment?

We met with the Credential Registry team a little while ago - they should be able to handle the changes but are discussing things internally (last I heard).

Lomilar commented 6 years ago

I haven't taken any action and don't have much of an opinion, since CTID is an internal identifier.

I believe that I only object to the @id of the graph being the same as the @id of the framework.

stuartasutton commented 6 years ago

I'm with @Lomilar that the @id for the graph should be a unique URI of the form https://credentialengineregistry.org/graph/[UUID] and NOT the same as the "top-level" entity in the graph (however, I'm not wed to the CTID form for this URI). I'll leave whether the graph itself should also have a CTID up to you guys. I've already complained too much about the CTIDs.

siuc-nate commented 6 years ago

I would prefer to have a URI for the graph that ends in the same CTID as the "main" resource in the graph itself, so that it's easy to figure out one URI or the other if you know the CTID. That would simplify documentation, implementation, and allow for advice along the lines of "To get all of the relevant data for this resource, use the /graph/ endpoint with the resource's CTID" (with a bit of additional explanation that it needs to be the CTID of the framework for competency framework graphs).
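A minimal sketch of the convention proposed above, where the /graph/ and /resources/ URIs end in the same CTID so that either one can be derived from the other. The function names and the `REGISTRY_BASE` constant are hypothetical and only for illustration; they are not part of any Registry API.

```python
# Hypothetical helpers illustrating the "shared CTID" URI convention.
REGISTRY_BASE = "https://credentialengineregistry.org"  # assumed base URL


def graph_uri(ctid: str) -> str:
    """URI for the whole graph whose primary resource has this CTID."""
    return f"{REGISTRY_BASE}/graph/{ctid}"


def resource_uri(ctid: str) -> str:
    """URI for just the single resource with this CTID."""
    return f"{REGISTRY_BASE}/resources/{ctid}"


def ctid_from_uri(uri: str) -> str:
    """Take the last path segment of a /graph/ or /resources/ URI as the CTID."""
    return uri.rstrip("/").rsplit("/", 1)[-1]
```

With this convention, knowing any one of the CTID, the /graph/ URI, or the /resources/ URI is enough to compute the other two.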

science commented 6 years ago

This may seem like an impossibly basic or ignorant question (forgiveness in advance, requested).

Are we talking about publishing envelope changes or resultset data changes? I see some mention of both above. That is, are we returning \@ graph structures as results, or are we allowing orgs to publish \@ graph statements?

If the latter is considered (and I think it is), I'm a little worried about republishing the same entity again and again - for example publishing multiple credentials with the same competencies would result in the same competency published multiple times? (Or consider the same question with organizations and credentials).

Am I missing a critical part of this conversation? Thanks for any education and enlightenment. (I tried to escape \@ graph so it wouldn't hassle the \@ graph user but that apparently failed - sorry graph, but I'd guess they're used to it)

siuc-nate commented 6 years ago

Per @stuartasutton, to summarize so far:

  1. We will implement language maps as originally planned
  2. We will implement @graph at the root of the decoded_payload
  3. We will implement blank nodes in the @graph
  4. We will implement Competency Frameworks and Competencies in the same @graph
  5. We will implement an @id for the @graph using a URI that has /graph/ instead of /resources/ (see below)
  6. The /graph/ URI will share the same CTID as the "primary" resource within the graph, e.g. https://credentialengineregistry.org/graph/ce-b69aa3a7-3f58-442f-9539-291ea29cc958 and https://credentialengineregistry.org/resources/ce-b69aa3a7-3f58-442f-9539-291ea29cc958 (see below)
  7. We will encourage using the /graph/ URI as opposed to the /resources/ URI
  8. Retrieving something via its /resources/ URI will return just that resource (and associated @context), even if the resource references other nodes (even blank nodes)
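
Points 2, 5, and 6 above can be sketched as a small consistency check on a decoded payload. This is only an illustration under the assumptions stated in the list; `check_graph_payload` is a hypothetical name and does not reflect how the Registry actually validates documents.

```python
def check_graph_payload(payload: dict) -> list:
    """Return a list of problems; an empty list means the conventions hold."""
    nodes = payload.get("@graph")
    if not isinstance(nodes, list):
        # Point 2: @graph sits at the root of the decoded payload.
        return ["missing @graph array at the root"]

    problems = []
    graph_id = payload.get("@id", "")
    if "/graph/" not in graph_id:
        # Point 5: the graph's own @id uses /graph/ rather than /resources/.
        problems.append("@id of the graph is not a /graph/ URI")

    # Point 6: the primary resource shares the graph's CTID.
    expected = graph_id.replace("/graph/", "/resources/", 1)
    node_ids = [n.get("@id", "") for n in nodes if isinstance(n, dict)]
    if expected not in node_ids:
        problems.append("no node in @graph shares the graph's CTID")

    # Blank nodes (ids starting with "_:") are allowed and not checked further.
    return problems
```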

Example Source Data:

{
  "envelope_id": "04ca4351-47d8-4bc5-ad2e-11704ee99277",
  "decoded_payload": {
    "@context": "http://credreg.net/ctdl/schema/context/json",
    "@id": "https://credentialengineregistry.org/graph/ce-b69aa3a7-3f58-442f-9539-291ea29cc958",
    "@graph": [
      {
        "@id": "https://credentialengineregistry.org/resources/ce-b69aa3a7-3f58-442f-9539-291ea29cc958",
        "@type": "ceterms:Certification",
        "ceterms:name": {
          "en-US": "My Credential Name"
        },
        "ceterms:requires": [
          {
            "ceterms:targetAssessment": [
              "https://credentialengineregistry.org/graph/ce-317bbd77-4375-4434-bcf4-1effc3398ed6",
              "_:bfb140c3-8b62-4d9a-a2f4-c2ce8cf65054"
            ]
          }
        ]
      },
      {
        "@id": "_:bfb140c3-8b62-4d9a-a2f4-c2ce8cf65054",
        "ceterms:name": {
          "en-US": "My referenced assessment"
        }
      }
    ]
  }
}

If you resolve https://credentialengineregistry.org/graph/ce-b69aa3a7-3f58-442f-9539-291ea29cc958:

{
  "@context": "http://credreg.net/ctdl/schema/context/json",
  "@id": "https://credentialengineregistry.org/graph/ce-b69aa3a7-3f58-442f-9539-291ea29cc958",
  "@graph": [
    {
      "@id": "https://credentialengineregistry.org/resources/ce-b69aa3a7-3f58-442f-9539-291ea29cc958",
      "@type": "ceterms:Certification",
      "ceterms:name": {
        "en-US": "My Credential Name"
      },
      "ceterms:requires": [
        {
          "ceterms:targetAssessment": [
            "https://credentialengineregistry.org/graph/ce-317bbd77-4375-4434-bcf4-1effc3398ed6",
            "_:bfb140c3-8b62-4d9a-a2f4-c2ce8cf65054"
          ]
        }
      ]
    },
    {
      "@id": "_:bfb140c3-8b62-4d9a-a2f4-c2ce8cf65054",
      "ceterms:name": {
        "en-US": "My referenced assessment"
      }
    }
  ]
}

If you resolve https://credentialengineregistry.org/resources/ce-b69aa3a7-3f58-442f-9539-291ea29cc958:

{
  "@context": "http://credreg.net/ctdl/schema/context/json",
  "@id": "https://credentialengineregistry.org/resources/ce-b69aa3a7-3f58-442f-9539-291ea29cc958",
  "@type": "ceterms:Certification",
  "ceterms:name": {
    "en-US": "My Credential Name"
  },
  "ceterms:requires": [
    {
      "ceterms:targetAssessment": [
        "https://credentialengineregistry.org/graph/ce-317bbd77-4375-4434-bcf4-1effc3398ed6",
        "_:bfb140c3-8b62-4d9a-a2f4-c2ce8cf65054"
      ]
    }
  ]
}
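
The relationship between the /graph/ and /resources/ responses shown above can be sketched as follows: the single-resource view is the matching node from @graph plus the shared @context, with references to other nodes (including blank nodes) left as plain identifiers. `resource_view` is a hypothetical helper for illustration, not Registry code.

```python
from typing import Optional


def resource_view(graph_doc: dict, resource_id: str) -> Optional[dict]:
    """Return one node from @graph, with the graph's @context attached.

    Referenced nodes are not inlined: blank-node ids such as "_:..." stay
    as strings, matching point 8 of the summary.
    """
    for node in graph_doc.get("@graph", []):
        if node.get("@id") == resource_id:
            return {"@context": graph_doc.get("@context"), **node}
    return None  # no node with that @id in this graph
```

A caller that needs the referenced blank node as well would have to request the /graph/ URI instead, which is one reason the summary encourages using /graph/.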

Lomilar commented 6 years ago

Thank you Nate for this summary. I saw this request and my brain broke trying to remember everything.

+1

siuc-nate commented 6 years ago

To summarize: This issue is basically solved, but we're keeping it open for now as a reference for implementation.

jeannekitchens commented 6 years ago

@stuartasutton @Lomilar @siuc-nate @cwd-mparsons @science we need to meet in June and finalize the data design and put a deadline on the related work.

siuc-nate commented 6 years ago

I have created a Google document to describe the implementation details: https://docs.google.com/document/d/1rCEEMD4eKPpVPANsz_zOOPQ70otJzuFEecMUtzynHrc

jeff-grann commented 6 years ago

This Digital Competence framework for citizens (DigComp) could be used to illustrate how multiple languages are supported.

siuc-nate commented 5 years ago

Per our 4-9-2019 meeting: Closing this issue (finally!) as it has been implemented across our system.