jobdataexchange / Data-Modeling

This repo is intended to contain resources and discussion regarding the JDX data modeling.

Identifying by URI concepts and competencies using schema.org/DefinedTerm #6

Closed stuartasutton closed 5 years ago

stuartasutton commented 5 years ago

First, my thanks to @philbarker for the following scenario and the issue it reveals in expressing value vocabulary terms using the schema.org:DefinedTerm.

Throughout the JDX domain model, there are references to controlled value vocabularies in the form of simple, flat enumerations (e.g., code sets), hierarchical concept schemes, competency frameworks, and relevant taxonomies. Where a job description references a concept or competency, the schema.org/DefinedTerm type/class will be used to unambiguously identify it. Since it will be used heavily in the JDX json-ld, we want to model it as clearly as possible while conforming to its roots in schema.org.

Example 1: DefinedTerm named with URI of term being described

Phil states:

"Consider a posting for two programmer jobs, one senior, one junior, both require ability https://www.onetonline.org/link/summary/123 , but it's more important for the senior programmer than the junior."

Here is Phil's json-ld code snippet based on the current draft of the domain model:

{
  "@context": [ "http://schema.org/" ,
        {"jdx": "http://jdx.org/terms/"}
    ],
  "@graph": [
    {
        "@id": "http://someRepository.org/jobPostings/1" ,
        "@type": "http://jdxTest.org/terms/JobMaster",
        "title": "Senior Software Engineer" ,
        "jdx:ability": {
            "@type": "DefinedTerm" ,
            "@id": "https://www.onetonline.org/link/summary/123",
            "jdx:termAnnotation": {
                "@type": "jdx:ScaleAnnotation",
                "jdx:scaleType": {
                    "@type": "schema:DefinedTerm",
                    "@id": "https://someScales.org/importance",
                    "name": "Importance",
                    "inDefinedTermSet": "https://someScales.org/"
                },
                "jdx:preferredValue": 0.85 ,
                "jdx:requiredValue": 0.65
            }
        }
    },
    {
      "@id": "http://someRepository.org/jobPostings/2" ,
      "@type": "http://jdxTest.org/terms/JobMaster",
      "title": "Junior Software Engineer" ,
      "jdx:ability": {
          "@type": "DefinedTerm" ,
          "@id": "https://www.onetonline.org/link/summary/123",
          "jdx:termAnnotation": {
              "@type": "jdx:ScaleAnnotation",
              "jdx:scaleType": {
                  "@type": "schema:DefinedTerm",
                  "@id": "https://someScales.org/importance",
                  "name": "Importance",
                  "inDefinedTermSet": "https://someScales.org/"
              },
              "jdx:preferredValue": 0.65 ,
              "jdx:requiredValue": 0.45
          }
        }
     }
  ]
}

This approach with schema.org:DefinedTerm is not unprecedented. See the json-ld example documenting the schema.org/DefinedTermSet class that looks like the following code snippet. NOTE that the openjurist.org URI (@id) naming a term (e.g., "calendar year") is used as the URI (@id) naming the instance of DefinedTerm.

[
        {
                "@context": "http://schema.org/"
        },
        {
                "@type": ["DefinedTermSet","Book"],
                "@id": "http://openjurist.org/dictionary/Ballentine",
                "name": "Ballentine's Law Dictionary"
        },
        {
                "@type": "DefinedTerm",
                "@id": "http://openjurist.org/dictionary/Ballentine/term/calendar-year",
                "name": "calendar year",
                "description": "The period from January 1st to December 31st, inclusive, of any year.",
                "inDefinedTermSet": "http://openjurist.org/dictionary/Ballentine"
        },
        {
                "@type": "DefinedTerm",
                "@id": "http://openjurist.org/dictionary/Ballentine/term/schema",
                "name": "schema",
                "description": "A representation of a plan or theory in the form of an outline or model.",
                "inDefinedTermSet": "http://openjurist.org/dictionary/Ballentine"
        }
]

However, Phil noted a problem when he looked at how his code snippet for this scenario graphs as RDF:

[Figure: definedterm_with_inappropriate_merger]

Note that because both the "Senior Software Engineer" and the "Junior Software Engineer" postings document the importance of the same "ability" (https://www.onetonline.org/link/summary/123), the two DefinedTerm instances merge into a single node. The merger disconnects the two ScaleAnnotation entities from their respective postings: both annotations hang off the one shared node, so we can't tell which ScaleAnnotation goes with which opening, "Senior" or "Junior". In this scenario, the only way to get past this quite natural merger of two things with the same URI is to not identify the instances of DefinedTerm using the URI (i.e., @id) of the term being described, as illustrated in Example 2 below.
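To make the merger concrete, here is a minimal Python sketch that treats Example 1 as a plain set of (subject, predicate, object) triples. It is not the output of a real RDF library; the blank-node labels `_:ann1` and `_:ann2` and the helper names `objects` and `annotations_for` are assumptions made for illustration, and only a few of the properties are included.

```python
# Simplified triples for Example 1. "_:ann1" and "_:ann2" are hypothetical
# labels for the two anonymous ScaleAnnotation nodes.
ABILITY = "https://www.onetonline.org/link/summary/123"
SENIOR = "http://someRepository.org/jobPostings/1"
JUNIOR = "http://someRepository.org/jobPostings/2"

triples = {
    (SENIOR, "jdx:ability", ABILITY),
    (ABILITY, "jdx:termAnnotation", "_:ann1"),
    ("_:ann1", "jdx:requiredValue", 0.65),
    (JUNIOR, "jdx:ability", ABILITY),
    (ABILITY, "jdx:termAnnotation", "_:ann2"),
    ("_:ann2", "jdx:requiredValue", 0.45),
}


def objects(subject, predicate):
    """Return every object of the triples matching subject and predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def annotations_for(posting):
    """Follow posting -> jdx:ability -> jdx:termAnnotation."""
    return {
        annotation
        for term in objects(posting, "jdx:ability")
        for annotation in objects(term, "jdx:termAnnotation")
    }


senior = annotations_for(SENIOR)
junior = annotations_for(JUNIOR)

# Both postings reach the same shared node, so both see both annotations.
print(senior == junior)
```

Because the DefinedTerm node is identified by the ability's URI, walking from either posting arrives at the same node and picks up both annotations, which is the ambiguity described above.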

Example 2: DefinedTerm as bNode and new "termDefined" property for term URI

In the following json-ld snippet, both DefinedTerm entities have been expressed as bNodes (although that doesn't necessarily have to be the case). Because we want to be able to reference the precise term by URI when such URIs are available, I've created a jdx:termDefined property that, in Phil's scenario, points to the "ability" https://www.onetonline.org/link/summary/123. [An argument could be made (by others, not me) that there are other schema.org options instead of coining a new term, such as schema.org/url or schema.org/additionalProperty.]

{
  "@context": [ "http://schema.org/" ,
        {"jdx": "http://jdx.org/terms/"}
    ],
  "@graph": [
    {
        "@id": "http://someRepository.org/jobPostings/1" ,
        "@type": "http://jdxTest.org/terms/JobMaster",
        "title": "Senior Software Engineer" ,
        "jdx:ability": {
            "@type": "DefinedTerm" ,
            "jdx:termDefined": {"@id": "https://www.onetonline.org/link/summary/123"},
            "jdx:termAnnotation": {
                "@type": "jdx:ScaleAnnotation",
                "jdx:scaleType": {
                    "@type": "schema:DefinedTerm",
                    "@id": "https://someScales.org/importance",
                    "name": "Importance",
                    "inDefinedTermSet": "https://someScales.org/"
                },
                "jdx:preferredValue": 0.85 ,
                "jdx:requiredValue": 0.65
            }
        }
    },
    {
      "@id": "http://someRepository.org/jobPostings/2" ,
      "@type": "http://jdxTest.org/terms/JobMaster",
      "title": "Junior Software Engineer" ,
      "jdx:ability": {
          "@type": "DefinedTerm" ,
          "jdx:termDefined": {"@id": "https://www.onetonline.org/link/summary/123"},
          "jdx:termAnnotation": {
              "@type": "jdx:ScaleAnnotation",
              "jdx:scaleType": {
                  "@type": "schema:DefinedTerm",
                  "@id": "https://someScales.org/importance",
                  "name": "Importance",
                  "inDefinedTermSet": "https://someScales.org/"
              },
              "jdx:preferredValue": 0.65 ,
              "jdx:requiredValue": 0.45
          }
        }
     }
  ]
}

[Figure: definedterm_with_no_merger_with_bnode]
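As a rough check on Example 2, the following Python sketch models it as a plain set of triples in which each posting has its own blank-node DefinedTerm. The labels `_:term1`, `_:term2`, `_:ann1`, and `_:ann2` and the helper names `objects` and `required_values` are hypothetical, and only the jdx:requiredValue property is carried over; this is an illustration under those assumptions, not the output of an RDF processor.

```python
# Simplified triples for Example 2. The "_:" labels are hypothetical names
# for the blank nodes; each posting gets its own DefinedTerm node.
ABILITY = "https://www.onetonline.org/link/summary/123"
SENIOR = "http://someRepository.org/jobPostings/1"
JUNIOR = "http://someRepository.org/jobPostings/2"

triples = {
    (SENIOR, "jdx:ability", "_:term1"),
    ("_:term1", "jdx:termDefined", ABILITY),
    ("_:term1", "jdx:termAnnotation", "_:ann1"),
    ("_:ann1", "jdx:requiredValue", 0.65),
    (JUNIOR, "jdx:ability", "_:term2"),
    ("_:term2", "jdx:termDefined", ABILITY),
    ("_:term2", "jdx:termAnnotation", "_:ann2"),
    ("_:ann2", "jdx:requiredValue", 0.45),
}


def objects(subject, predicate):
    """Return every object of the triples matching subject and predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def required_values(posting, ability):
    """Collect jdx:requiredValue for one ability as annotated by one posting."""
    values = set()
    for term in objects(posting, "jdx:ability"):
        # Only follow DefinedTerm nodes that point at the requested ability.
        if ability in objects(term, "jdx:termDefined"):
            for annotation in objects(term, "jdx:termAnnotation"):
                values |= objects(annotation, "jdx:requiredValue")
    return values


senior_required = required_values(SENIOR, ABILITY)
junior_required = required_values(JUNIOR, ABILITY)
print(senior_required, junior_required)
```

Both DefinedTerm nodes still point at the same ability URI through jdx:termDefined, so the shared term remains discoverable, while each ScaleAnnotation stays attached to exactly one posting.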

Thoughts as to other solutions here that maintain the integrity of the DefinedTerm entity are appreciated.