cdisc-org / cdisc-rules-engine

Open source offering of the cdisc rules engine
MIT License

Implement schema-based validation for USDM JSON data #715

Open ASL-rmarshall opened 4 months ago

ASL-rmarshall commented 4 months ago

The DDF team has identified some generic checking requirements that might best be implemented by checking USDM JSON data against a defined USDM schema (e.g., JSON-Schema). These checks include confirming conformance with the characteristics defined in the USDM API specification, including such things as:

ASL-rmarshall commented 4 months ago

This was the proposal I made at the beginning of April: USDM Schema Validation.pptx

[image]

Schemas in API specs based on OpenAPI v3.1 or higher are JSON-Schema compliant, and the USDM v3.0 API spec is OpenAPI v3.1. I have shown that it's possible to extract a usable USDM JSON-Schema definition directly from the USDM API spec, pass it to jsonschema (the JSON-Schema validator the CORE Engine currently uses to validate Dataset-JSON files), and use it to generate a list of the schema violations found in a USDM JSON data file (e.g., CDISC_Pilot_Study_sv_edit-2024-05-16T13-19-27.xlsx).

The following would be needed from the CORE team to implement this proposal:

ASL-rmarshall commented 4 months ago
This is a table of the JSON-Schema attributes that are handled by the jsonschema validator:

| Schema Attribute | Message(s) |
| --- | --- |
| `$dynamicRef` | |
| `$ref` | |
| `additionalItems` | `Additional items are not allowed (%s %s unexpected)` |
| `additionalProperties` | `{joined} {verb} not match any of the regexes: {patternProperties}` |
| | `Additional properties are not allowed (%s %s unexpected)` |
| `allOf` | |
| `anyOf` | `{instance!r} is not valid under any of the given schemas` |
| `const` | `{const!r} was expected` |
| `contains` | `{instance!r} does not contain items matching the given schema` |
| `dependentRequired` | `{each!r} is a dependency of {property!r}` |
| `dependentSchemas` | |
| `enum` | `{instance!r} is not one of {enums!r}` |
| `exclusiveMaximum` | `{instance!r} is greater than or equal to the maximum of {maximum!r}` |
| `exclusiveMinimum` | `{instance!r} is less than or equal to the minimum of {minimum!r}` |
| `format` | |
| `if` | |
| `items` | `Expected at most {prefix} items, but found {total}` |
| `maxItems` | `{instance!r} is too long` |
| `maxLength` | `{instance!r} is too long` |
| `maxProperties` | `{instance!r} has too many properties` |
| `maximum` | `{instance!r} is greater than the maximum of {maximum!r}` |
| `minItems` | `{instance!r} is too short` |
| `minLength` | `{instance!r} is too short` |
| `minProperties` | `{instance!r} does not have enough properties` |
| `minimum` | `{instance!r} is less than the minimum of {minimum!r}` |
| `multipleOf` | `{instance!r} is not a multiple of {dB}` |
| `not` | `{instance!r} should not be valid under {not_schema!r}` |
| `oneOf` | `{instance!r} is not valid under any of the given schemas` |
| | `{instance!r} is valid under each of {reprs}` |
| `pattern` | `{instance!r} does not match {patrn!r}` |
| `patternProperties` | |
| `prefixItems` | |
| `properties` | |
| `propertyNames` | |
| `required` | `{property!r} is a required property` |
| `type` | `{instance!r} is not of type {reprs}` |
| `unevaluatedItems` | `Unevaluated items are not allowed (%s %s unexpected)` |
| `unevaluatedProperties` | `Unevaluated properties are not allowed (%s %s unexpected)` |
| | `Unevaluated properties are not valid under the given schema (%s %s unevaluated and invalid)` |
| `uniqueItems` | `{instance!r} has non-unique elements` |