ropensci / jsonvalidate

:heavy_check_mark::interrobang: Validate JSON
https://docs.ropensci.org/jsonvalidate

Resolve pointers (references) in json being validated prior to validation #64

Open annakrystalli opened 1 year ago

annakrystalli commented 1 year ago

Hello!

We're developing a framework for setting up forecasting hubs that uses JSON configuration files, and we are trying to use jsonvalidate to validate those config files against a schema.

One issue we are having is that the JSON config files themselves contain references (pointers) to elements in a $defs section. These need to be resolved prior to validation, both because unresolved references throw validation errors (see below) and because we want to ensure the definitions have been correctly specified according to the schema.

At the moment, because we have not found any such functionality in R, we are getting around this in a somewhat hacky way: reading the JSON config files into R, resolving pointers with a custom function, and then reserialising to JSON in order to perform validation with jsonvalidate::json_validate(). We are having a few issues with that as well (see issue #65).
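For readers unfamiliar with this kind of workaround, here is a minimal sketch of the parse, resolve, reserialise steps. It is not the custom function used by the hub project: `resolve_pointer()` and `resolve_refs()` are hypothetical helpers, and the toy JSON only mimics the `#/$defs/task_ids/location/us_states` reference that appears in the reprex output. It assumes that all references are local (`#/...`), acyclic, and free of JSON Pointer escapes or array indices.

```r
library(jsonlite)

# Hypothetical helper: look up a local JSON pointer such as
# "#/$defs/task_ids/location/us_states" in the parsed document `root`.
# Simplification: "~0"/"~1" escapes and numeric array indices are not handled.
resolve_pointer <- function(root, pointer) {
  keys <- strsplit(sub("^#/", "", pointer), "/", fixed = TRUE)[[1]]
  node <- root
  for (key in keys) {
    if (!is.list(node) || !(key %in% names(node))) {
      stop("Cannot resolve pointer: ", pointer)
    }
    node <- node[[key]]
  }
  node
}

# Hypothetical helper: recursively replace every {"$ref": "#/..."} object
# with the value it points to. Assumes references are acyclic.
resolve_refs <- function(node, root = node) {
  if (!is.list(node)) {
    return(node)
  }
  ref <- node[["$ref"]]
  if (is.character(ref) && length(ref) == 1L) {
    return(resolve_refs(resolve_pointer(root, ref), root))
  }
  lapply(node, resolve_refs, root = root)
}

# Toy config (illustrative values only). simplifyVector = FALSE keeps JSON
# arrays as unnamed lists so they are written back out as arrays.
config <- fromJSON(
  '{"$defs": {"task_ids": {"location": {"us_states": ["US", "01", "02"]}}},
    "task_ids": {"location": {
      "required": {"$ref": "#/$defs/task_ids/location/us_states"},
      "optional": null}}}',
  simplifyVector = FALSE
)

resolved <- resolve_refs(config)

# Reserialise; null = "null" writes R NULLs back out as JSON null.
resolved_json <- toJSON(resolved, auto_unbox = TRUE, null = "null")

# The resolved JSON string could then be validated, e.g.:
# jsonvalidate::json_validate(resolved_json, schema_path, engine = "ajv")
```

The round trip through R lists is the fragile part of this approach: how arrays, scalars and nulls are parsed and written back depends on the `jsonlite` options chosen, which is one reason resolving references inside the validator itself would be preferable.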

We were wondering if it would be possible to resolve pointers in the JSON being validated as well as in the schema. I appreciate this might be deemed outside the scope of the package, and I'm not sure how much work it would be to implement, but given that it is an important step prior to validation, perhaps it could be considered within scope?

Reproducible example

# CREATE DIR & PATHS -----
# Create temp dir
tmp_dir <- tempdir()

schema_path <- file.path(tmp_dir, "tasks-schema.json")
json_path <- file.path(tmp_dir, "tasks.json")

# DOWNLOAD NECESSARY FILES -----
# Download file and add newline to end
download.file(
    "https://raw.githubusercontent.com/Infectious-Disease-Modeling-Hubs/schemas/add-tasks-docs/tasks-schema.json",
    destfile = schema_path)
write("", file = schema_path, append = TRUE)

# Download json config file (no newline is appended here)
download.file(
    "https://raw.githubusercontent.com/annakrystalli/hub-infrastructure-experiments/json-schema-refs/json-schema/modified-hubmeta-examples/complex-hubmeta-mod.json",
    destfile = json_path)

#  SCHEMA ----
# Create schema to validate and serialise
schema <- jsonvalidate::json_schema$new(
    schema = schema_path,
    engine = "ajv")

# Validate unresolved json. Only 3 errors arising from unresolved pointers
schema$validate(json_path, verbose = TRUE) |>
    attr("errors")
#>                                         instancePath
#> 1 /rounds/0/model_tasks/0/task_ids/location/required
#> 2 /rounds/0/model_tasks/1/task_ids/location/required
#> 3 /rounds/1/model_tasks/0/task_ids/location/required
#>                                                                                                                schemaPath
#> 1 #/properties/rounds/items/properties/model_tasks/items/properties/task_ids/properties/location/properties/required/type
#> 2 #/properties/rounds/items/properties/model_tasks/items/properties/task_ids/properties/location/properties/required/type
#> 3 #/properties/rounds/items/properties/model_tasks/items/properties/task_ids/properties/location/properties/required/type
#>   keyword        type            message      schema
#> 1    type array, null must be array,null array, null
#> 2    type array, null must be array,null array, null
#> 3    type array, null must be array,null array, null
#>                                                                                                                                                                         parentSchema.description
#> 1 Array of location unique identifiers that must be present for submission to be valid. Can be null if no locations are required and all valid locations are specified in the optional property.
#> 2 Array of location unique identifiers that must be present for submission to be valid. Can be null if no locations are required and all valid locations are specified in the optional property.
#> 3 Array of location unique identifiers that must be present for submission to be valid. Can be null if no locations are required and all valid locations are specified in the optional property.
#>   parentSchema.type parentSchema.type                                $ref
#> 1       array, null            string #/$defs/task_ids/location/us_states
#> 2       array, null            string #/$defs/task_ids/location/us_states
#> 3       array, null            string #/$defs/task_ids/location/us_states
#>                                             dataPath
#> 1 /rounds/0/model_tasks/0/task_ids/location/required
#> 2 /rounds/0/model_tasks/1/task_ids/location/required
#> 3 /rounds/1/model_tasks/0/task_ids/location/required

Created on 2022-11-21 with reprex v2.0.2

richfitz commented 1 year ago

Hi @annakrystalli - just a note to say we've not missed this or #65, and will get back to you on them when we have time to work it through. This is typically a fairly busy time of year for us. Hopefully we will update you before the end of the year. If you have implementation ideas for either, of course we'd be happy to discuss or review a PR.

annakrystalli commented 1 year ago

Thanks @richfitz ! No worries at all.