learningtapestry / metadataregistry

DEPRECATED - THIS CODE BASE IS NO LONGER MAINTAINED. Metadata Registry
Apache License 2.0

Process for updating and testing the new ctdl json schema #87

Open siuccwd opened 7 years ago

siuccwd commented 7 years ago

@andersoncardoso, @science As noted previously, the updated schema (http://credreg.net/ctdl/Schema/20161117/validation/json) is quite different from the one currently in the registry. It is generated from the properties spreadsheet at: https://docs.google.com/spreadsheets/d/1Bw94UrAWwno9gFg7yOZjnTGq9XUlsyYrgEi_FwgZ_tY/edit#gid=0 The previous version of the spreadsheet is a little easier to read at: https://docs.google.com/spreadsheets/d/1Bw94UrAWwno9gFg7yOZjnTGq9XUlsyYrgEi_FwgZ_tY/edit#gid=0

As noted, the schema is generated, and so is in one file. I would think that a single large file would be more difficult to work with, and that we should split it into multiple files. I have already made this request to @siuc-nate. @andersoncardoso would you agree?

There are lots of classes that are referenced from the 'top' level classes. So rather than duplicating these, should these all be placed in a separate 'common' schema file? We would then have at least:

Will this make sense, or would additional common files be preferred, perhaps to ease management (with smaller files), or to logically group related classes?

I see that there is already an external schema file called json_ld that has some common definitions. It doesn't appear that these are used yet by the ce-registry. It looks like it is intended for use with graphs and probably would not need to be modified at all for the ce-registry.

Testing

I have tried to use the site http://www.jsonschemavalidator.net/ to test a modified schema. I have not been able to get the expected errors. Following is the schema (abbreviated) and the test JSON.

Schema Test file

{
    "$schema": "http://json-schema.org/draft-04/schema#",
    "description": "CE/Registry Credential metadata",
    "type": "object",
    "definitions": {
        "ctdl": {
            "properties": {
                "@type": {
                    "enum": [
                        "ctdl:Credential", "ctdl:Badge", "ctdl:DigitalBadge", "ctdl:OpenBadge",
                        "ctdl:Certificate", "ctdl:ApprenticeshipCertificate", "ctdl:JourneymanCertificate",
                        "ctdl:MasterCertificate", "ctdl:Certification", "ctdl:Degree",
                        "ctdl:GeneralEducationDevelopment", "ctdl:SecondarySchoolDiploma", "ctdl:License",
                        "ctdl:MicroCredential", "ctdl:QualityAssuranceCredential"
                    ]
                },
                "ctdl:accreditedBy": {
                    "description": "An agent that accredits the described resource.",
                    "enum": [ "ctdl:Organization", "ctdl:CredentialOrganization", "ctdl:Person" ],
                    "error": "Must be one of the values: 'ctdl:Organization', 'ctdl:CredentialOrganization', 'ctdl:Person'"
                },
                "ctdl:ctid": {
                    "description": "A globally unique Credential Transparency Identifier (CTID) issued by the Credential Registry Service (CRS) by which the creator/owner/provider of a credential recognizes the credential in transactions with the external environment (e.g., in verifiable claims involving the credential).",
                    "type": "string",
                    "pattern": "^urn:ctid:.*"
                },
                "ctdl:name": {
                    "description": "The name of the resource being described.",
                    "type": "string"
                }
            },
            "required": [
                "@type",
                "ctdl:ctid",
                "ctdl:name",
                "ctdl:url"
            ]
        }
    }
}

Test Json

I tried forcing required property errors by using:

{
    "@context": {
        "schema": "http://schema.org/",
        "dc": "http://purl.org/dc/elements/1.1/",
        "dct": "http://dublincore.org/terms/",
        "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
        "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
        "ctdl": "[CTI Namespace Not Determined Yet]"
    },
    "@type": "ctdl:Credential",
    "ctdl:credentialLevel": [
        "Postsecondary (Less than 1 year)",
        "Postsecondary (1-3 years)",
        "Postsecondary (3-6 years)",
        "Postsecondary (6+ years)"
    ],
    "ctdl:credentialType": [ "Badge" ],
    "ctdl:ctid": "urnX:ctid:25843D95-13E6-4CF8-B3F3-E4B9DBBEDB10",
    "ctdl:dateEffective": "2016-05-04T00:00:00",
    "ctdl:estimatedTimeToEarn": [ ],
    "ctdl:description": "This is the description of the credential.\n0710a",
    "ctdl:image": "http://myimageurl.com",
    "ctdl:nameX": "zzz Credential Name Example- 0823a",
    "schema:url": "http://credreg.net/0710a"
}

Newtonsoft has a package for JSON validation: Newtonsoft.Json.Schema. I have downloaded it and done some testing. I have been able to get errors, but not at the level of detail that I had hoped for. I want to ensure that I am not wasting time by using the wrong tools. So:

andersoncardoso commented 7 years ago

Hi, it's probably better to use small files for each sub-schema so they can be reused. This can be achieved by referencing the definitions by URL, but I have to check how we can generate new schemas for a community on the platform.

About the validation problems: you have a small mistake in the schema. Notice that you defined the ctdl schema under "definitions", but definitions are never used unless you point to them (using the JSON pointer notation "#/definitions/name_of_the_definition").

For example:

{
  "context": { ... },
  "definitions": {
    "definition_A": { ... },
    "definition_B": { ... },
    "definition_C": { ... }
  },

  "allOf": [
    {"$ref": "#/definitions/definition_A"},
    {"$ref": "#/definitions/definition_B"}
  ]   // notice that in this case "C" is never referenced, so it won't be used for validation
}

You don't have an "allOf" clause (there are others, like "anyOf", etc.). Just adding this: "allOf": [{"$ref": "#/definitions/ctdl"}] should do the trick.

If you don't want to split your schema into definitions, you can just do it this way:

{
  "context": { ... },
  "properties": { ... place properties directly here, instead of using a definitions pointer }
}
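To make the point about unreferenced definitions concrete, here is a minimal Python sketch. It is not a real JSON Schema validator: `check` and `resolve` are hypothetical helpers written for this illustration, and they only understand local `$ref`, `allOf`, `required`, `enum` and `pattern`. The schema and document are trimmed versions of the ones posted above.

```python
import re


def resolve(schema, root):
    """Follow local references such as '#/definitions/ctdl' (no pointer escaping)."""
    while "$ref" in schema:
        node = root
        for part in schema["$ref"][2:].split("/"):
            node = node[part]
        schema = node
    return schema


def check(instance, schema, root=None):
    """Return error strings for the handful of keywords this sketch supports."""
    root = schema if root is None else root
    schema = resolve(schema, root)
    errors = []
    # Sub-schemas only take part in validation when a keyword points at them.
    for sub in schema.get("allOf", []):
        errors += check(instance, sub, root)
    if not isinstance(instance, dict):
        return errors
    missing = [name for name in schema.get("required", []) if name not in instance]
    if missing:
        errors.append("missing required: " + ", ".join(missing))
    for name, sub in schema.get("properties", {}).items():
        if name not in instance:
            continue
        sub, value = resolve(sub, root), instance[name]
        if "enum" in sub and value not in sub["enum"]:
            errors.append(f"{name}: not in enum")
        if "pattern" in sub and isinstance(value, str) and not re.search(sub["pattern"], value):
            errors.append(f"{name}: does not match {sub['pattern']}")
    return errors


ctdl = {
    "properties": {
        "@type": {"enum": ["ctdl:Credential", "ctdl:Badge"]},
        "ctdl:ctid": {"type": "string", "pattern": "^urn:ctid:.*"},
        "ctdl:name": {"type": "string"},
    },
    "required": ["@type", "ctdl:ctid", "ctdl:name", "ctdl:url"],
}
unreferenced = {"type": "object", "definitions": {"ctdl": ctdl}}
referenced = dict(unreferenced, allOf=[{"$ref": "#/definitions/ctdl"}])

document = {
    "@type": "ctdl:Credential",
    "ctdl:ctid": "urnX:ctid:25843D95-13E6-4CF8-B3F3-E4B9DBBEDB10",
    "ctdl:nameX": "zzz Credential Name Example- 0823a",
    "schema:url": "http://credreg.net/0710a",
}

print(check(document, unreferenced))  # nothing points at "ctdl", so no errors
print(check(document, referenced))    # missing properties and the bad ctid show up
```

With the `allOf` reference in place, the sketch reports the missing `ctdl:name` and `ctdl:url` properties and the `ctdl:ctid` pattern mismatch. Without it, the same document passes silently.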

After fixing the schema, I tested with http://www.jsonschemavalidator.net/ and it shows the errors properly :

 Found 3 error(s)

Message:
    JSON does not match all schemas from 'allOf'. Invalid schema indexes: 0.
Schema path:
    #/allOf

Message:
    String 'urnX:ctid:25843D95-13E6-4CF8-B3F3-E4B9DBBEDB10' does not match regex pattern '^urn:ctid:.*'.
Schema path:
    #/definitions/ctdl/properties/ctdl:ctid/pattern

Message:
    Required properties are missing from object: ctdl:name, ctdl:url.
Schema path:
    #/definitions/ctdl/required

I hope this helps. best regards

Anderson

siuccwd commented 7 years ago

@andersoncardoso Thanks, those are some helpful hints. Now I have a question about "additionalProperties": false. We don't want to allow properties that are not part of the schema (when doing the strictest validation). I tried adding it, and got an error: Message: Property '@context' has not been defined and the schema does not allow additional properties.

The JSON always has a @context block, like the following:

"@context": {
    "schema": "http://schema.org/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dct": "http://dublincore.org/terms/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "ctdl": "[CTI Namespace Not Determined Yet]"
},

Is a different syntax necessary to make the validator accept @context, or is the placement of additionalProperties incorrect? I had it at the end of the definitions:

{
    "$schema": "http://json-schema.org/draft-04/schema#",
    "description": "CE/Registry Credential metadata",
    "type": "object",
    "definitions": {
        "ctdl": {
            "properties": {
                "@type": {
                    "enum": [
                        "ceterms:Credential", "ceterms:Badge", "ceterms:DigitalBadge", "ceterms:OpenBadge",
                        "ceterms:Certificate", "ceterms:ApprenticeshipCertificate", "ceterms:JourneymanCertificate",
                        "ceterms:MasterCertificate", "ceterms:Certification", "ceterms:Degree",
                        "ceterms:GeneralEducationDevelopment", "ceterms:SecondarySchoolDiploma", "ceterms:License",
                        "ceterms:MicroCredential", "ceterms:QualityAssuranceCredential"
                    ]
                },
                "ceterms:accreditedBy": {
                    "description": "An agent that accredits the described resource.",
                    "enum": [ "ceterms:Organization", "ceterms:CredentialOrganization", "ceterms:Person" ],
                    "error": "Must be one of the values: 'ceterms:Organization', 'ceterms:CredentialOrganization', 'ceterms:Person'"
                },
                "ceterms:credentialType": {
                    "description": "credentialType.",
                    "enum": [ "Badge", "Certificate", "Diploma" ],
                    "error": "Must be one of the values: 'ceterms:Organization', 'ceterms:CredentialOrganization', 'ceterms:Person'"
                },
                "ceterms:ctid": {
                    "description": "A globally unique Credential Transparency Identifier (CTID) issued by the Credential Registry Service (CRS) by which the creator/owner/provider of a credential recognizes the credential in transactions with the external environment (e.g., in verifiable claims involving the credential).",
                    "type": "string",
                    "pattern": "^urn:ctid:.*"
                },
                "ceterms:name": {
                    "description": "The name of the resource being described.",
                    "type": "string"
                },
                "ceterms:description": {
                    "description": "The description of the resource being described.",
                    "type": "string"
                },
                "ceterms:image": {
                    "description": "The name of the resource being described.",
                    "type": "string"
                },
                "ceterms:dateEffective": {
                    "description": "The effective date of the information.",
                    "type": "string",
                    "format": "date"
                },
                "ceterms:url": {
                    "description": "The name of the resource being described.",
                    "type": "string"
                },
                "ceterms:credentialLevel": {
                    "description": "credentialLevel.",
                    "enum": [
                        "Postsecondary (Less than 1 year)",
                        "Postsecondary (1-3 years)",
                        "Postsecondary (3-6 years)",
                        "Postsecondary (6+ years)"
                    ]
                }
            },
            "additionalProperties": false,
            "required": [ "@type", "ceterms:ctid", "ceterms:name", "ceterms:url" ]
        },
        "additionalProperties": false
    },
    "allOf": [ { "$ref": "#/definitions/ctdl" } ]
}

andersoncardoso commented 7 years ago

The behavior is correct, since you are not defining the @context property. For regular JSON objects this property is nothing special, but it is for JSON-LD. We usually point to a JSON-LD definition in the schema. For example:

"allOf": [
    {"$ref": "http://lr-staging.learningtapestry.com/api/schemas/json_ld"},
    {"$ref": "#/definitions/schemaorg_organization"},
    {"$ref": "#/definitions/ctdl"}
]

Notice the first $ref, which points to an external definition for JSON-LD objects. This should validate the @context properly.

Another simple approach would be to define your own expectations around JSON-LD structures. Since they are very open, you might want to restrict them (always accept the @context as being a list of objects, the @id as a URL, etc). For example, just adding this would solve it:

"properties": {
   "@context": {"type": "object"},
   "@type": // ...
}

siuccwd commented 7 years ago

@andersoncardoso I tried using: {"$ref": "http://lr-staging.learningtapestry.com/api/schemas/json_ld"}, and was not successful. Adding the @context to properties did resolve the issue. thx
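The reason declaring @context resolves the error can be sketched in a few lines of Python. `unexpected_properties` is a hypothetical helper, not part of any validator library, and it ignores "patternProperties", which a real draft-04 validator would also consult. The schemas are cut down to property names only.

```python
def unexpected_properties(instance, schema):
    """List the keys that '"additionalProperties": false' would reject."""
    if schema.get("additionalProperties", True) is not False:
        return []  # additional properties are allowed, nothing to report
    declared = schema.get("properties", {})
    return sorted(key for key in instance if key not in declared)


strict = {
    "properties": {
        "@type": {},
        "ceterms:ctid": {},
        "ceterms:name": {},
        "ceterms:url": {},
    },
    "additionalProperties": False,
}

# Same schema, but with @context declared next to the other properties.
strict_with_context = dict(
    strict, properties=dict(strict["properties"], **{"@context": {"type": "object"}})
)

document = {
    "@context": {"schema": "http://schema.org/"},
    "@type": "ceterms:Credential",
    "ceterms:ctid": "urn:ctid:25843D95-13E6-4CF8-B3F3-E4B9DBBEDB10",
    "ceterms:name": "Credential Name Example",
    "ceterms:url": "http://credreg.net/0710a",
}

print(unexpected_properties(document, strict))               # ['@context']
print(unexpected_properties(document, strict_with_context))  # []
```

The check only looks at the "properties" that sit in the same schema object as "additionalProperties", which is why @context has to be declared there.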

andersoncardoso commented 7 years ago

@siuccwd

Regarding this you can read more on: https://spacetelescope.github.io/understanding-json-schema/reference/combining.html?highlight=ref#allof at the end of this section:

This works, but what if we wanted to restrict the schema so no additional properties are 
allowed?
One might try adding the highlighted line below:
 "additionalProperties": false

Unfortunately, now the schema will reject everything. This is because the Properties refers 
to the entire schema. And that entire schema includes no properties, and knows nothing 
about the properties in the subschemas inside of the allOf array.

This shortcoming is perhaps one of the biggest surprises of the combining operations in 
JSON schema: it does not behave like inheritance in an object-oriented language. 
There are some proposals to address this in the next version of the JSON schema 
specification.

I didn't know about this caveat either. I guess the way to go is to add these definitions directly to the properties if you wish to enforce "additionalProperties": false.
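Since the schema is generated anyway, one option is to do this merging in a small build step. The Python sketch below is only an illustration under stated assumptions: `flatten_all_of` is a hypothetical helper, it assumes every "allOf" entry is a local reference of the form "#/definitions/<name>", and the local "json_ld" definition stands in for the external one mentioned earlier.

```python
import copy


def flatten_all_of(schema):
    """Merge 'properties' and 'required' from each local allOf $ref into one schema."""
    merged = {
        key: copy.deepcopy(value)
        for key, value in schema.items()
        if key not in ("allOf", "definitions")
    }
    merged.setdefault("properties", {})
    required = list(merged.get("required", []))
    for entry in schema.get("allOf", []):
        name = entry["$ref"].rsplit("/", 1)[-1]  # assumes "#/definitions/<name>"
        definition = schema["definitions"][name]
        merged["properties"].update(copy.deepcopy(definition.get("properties", {})))
        required += [p for p in definition.get("required", []) if p not in required]
    if required:
        merged["required"] = required
    # Safe now: every allowed property is declared at this level.
    merged["additionalProperties"] = False
    return merged


schema = {
    "type": "object",
    "definitions": {
        "json_ld": {"properties": {"@context": {"type": "object"}}},
        "ctdl": {
            "properties": {
                "@type": {"enum": ["ceterms:Credential", "ceterms:Badge"]},
                "ceterms:name": {"type": "string"},
            },
            "required": ["@type", "ceterms:name"],
        },
    },
    "allOf": [
        {"$ref": "#/definitions/json_ld"},
        {"$ref": "#/definitions/ctdl"},
    ],
}

flat = flatten_all_of(schema)
print(sorted(flat["properties"]))  # ['@context', '@type', 'ceterms:name']
print(flat["required"])            # ['@type', 'ceterms:name']
```

The flattened schema no longer depends on "allOf", so "additionalProperties": false sees every declared property. The trade-off is that the shared definitions are copied into each generated schema instead of being referenced.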