Closed annakrystalli closed 2 months ago
Thanks @zkamvar !
Question: Are round-trips valid? That is, if you read in a valid schema file, modify it in some way (e.g. updating cdf values), and write it out again, would it still be valid?
Unfortunately not. These array/scalar errors will be re-introduced everytime the file is re-written.
Comment: (plz let me know if this has been discussed and invalidated before, I wasn't able to find anything when searching github) This is likely best addressed in a separate issue in the schemas repository, but I'm noticing that one of the problems is that we cannot perform the schema validation until the config is fully formed. I'm wondering if it would be prudent to create sub-schemas for the output types since these seem to cause the most headaches (e.g. your comment about json_serialize not being able to use oneOf clauses).
There is validation going on in the background but admittedly in places duplicating in R what jsonvalidate
validation can perform. That was somewhat because the outputs I wanted in the create_*()
family of fns was different to the output required for the validation errors table.
If I understand them correctly, if we use sub-schemas (as demonstrated in https://github.com/ropensci/jsonvalidate/issues/61 using the Direct Reference keyword), we could apply validation during creation of the individual pieces that we know we can validate. This way, it might be easier for us to apply workarounds since the number of variables that cause an invalid file are smaller.
I'd certainly be up for streamlining such code and sub-schema might be a good option. Here's some concerns I have:
It's probably worth an issue for further discussion somewhere else.
This PR adds a
write_config()
function for writing outconfig
class objects of programmatically created task configurations to JSON files.It's been put off for some time because it's almost impossible to write a valid config file due to inconsistencies between R and JSON data types, in particular the fact that R has no concept of a scalar. As such some properties in the output file will likely not conform to schema expectations. They might be an array when a scalar is required or vice versa.
I tried to make use of the
jsonvalidate::json_serialise()
functionality, which can help serialise R objects in a schema aware way but I gave up because:jsonvalidate
javascript source code.jsonvalidate::json_serialise()
has been for some time now (since last year) only available in the development version ofjsonvalidate
with no indication when it might be pushed to CRAN.json_serialise()
cannot make use ofoneOf
statements which we use in our schema.See code
``` r library(hubAdmin) rounds <- create_rounds( create_round( round_id_from_variable = TRUE, round_id = "origin_date", model_tasks = create_model_tasks( create_model_task( task_ids = create_task_ids( create_task_id("origin_date", required = NULL, optional = c( "2023-01-02", "2023-01-09", "2023-01-16" ) ), create_task_id("location", required = "US", optional = c("01", "02", "04", "05", "06") ), create_task_id("horizon", required = 1L, optional = 2:4 ) ), output_type = create_output_type( create_output_type_mean( is_required = TRUE, value_type = "double", value_minimum = 0L ) ), target_metadata = create_target_metadata( create_target_metadata_item( target_id = "inc hosp", target_name = "Weekly incident influenza hospitalizations", target_units = "rate per 100,000 population", target_keys = NULL, target_type = "discrete", is_step_ahead = TRUE, time_unit = "week" ) ) ) ), submissions_due = list( relative_to = "origin_date", start = -4L, end = 2L ) ) ) schema <- hubAdmin:::download_tasks_schema(format = "json") config <- create_config(rounds) jsonvalidate::json_serialise(config, schema) #> Error in eval(expr, envir, enclos): TypeError: Cannot convert undefined or null to object ```As such, I've added a warning in the docs about this as well as issuing a message on write, advising users of the pitfalls and directing to use
validate_config()
to validate the config files written out and identify any deviations.The PR therefore resolves #3 and completes #7