This PR implements the first Minimum Viable Product of hubValidations functionality.
General functionality consists of higher-level functions containing multiple modular unit checks.
The higher-level validate_* functions return list objects of class hub_validations.
Unit check_* functions return the output of capture_check_cnd(), one of:
<message/check_success> condition class object.
<warning/check_failure> condition class object.
<error/check_error> condition class object.
Which of these is returned depends on whether validation has succeeded and whether the check has been designed to return a warning or an error on failure. Returned objects also inherit from subclass <hub_check>.
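For illustration, the three outcomes described above can be sketched in base R. This is not the implementation of capture_check_cnd(): the helper name new_check_cnd(), its arguments and the exact order of the class vector are assumptions made for this example only.

```r
# Hypothetical stand-in for capture_check_cnd(); base R only, not package code.
new_check_cnd <- function(check, msg, error = FALSE) {
  # Choose the condition type from the check result and the failure severity.
  type <- if (check) "message" else if (error) "error" else "warning"
  subclass <- switch(type,
    message = "check_success",
    warning = "check_failure",
    error   = "check_error"
  )
  # Assumed class order: outcome subclass, hub_check, then the base condition type.
  structure(
    list(message = msg, call = NULL),
    class = c(subclass, "hub_check", type, "condition")
  )
}

ok   <- new_check_cnd(TRUE, "File exists.")
warn <- new_check_cnd(FALSE, "File is in an unexpected location.")
bad  <- new_check_cnd(FALSE, "File does not exist.", error = TRUE)

inherits(ok, "check_success")    # TRUE
inherits(warn, "check_failure")  # TRUE
inherits(bad, "check_error")     # TRUE
inherits(bad, "hub_check")       # TRUE
```

Because every object carries hub_check, callers can test for that class first and then branch on the more specific outcome class.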
Functionality added:
[x] validate_model_file(): high-level function running a number of unit checks on the submitted model output data file.
check_file_exists(): checks that file exists at path specified by file_path. Returns check_error and triggers early return of caller fn.
check_file_name(): checks that file name is correctly formatted and can be parsed for metadata. Returns check_error and triggers early return of caller fn.
check_file_location(): checks that file is being submitted to the correct location.
check_valid_round_id(): checks that round_id is valid. Returns check_error and triggers early return of caller fn.
check_file_format(): checks that submitted file has file extension readable by hubValidations. Returns check_error and triggers early return of caller fn.
[x] validate_model_data(): high-level function running a number of unit checks on the contents of the submitted model output data file. #5
check_file_read(): checks that file can be read. Returns check_error and triggers early return of caller fn.
check_valid_round_id_col(): if applicable, checks that the round_id column can be parsed from the config file and/or is a valid column in tbl.
check_tbl_unique_round_id(): if applicable, checks that tbl contains a single unique round ID in the round ID column. Returns check_error and triggers early return of caller fn.
check_tbl_colnames(): checks that tbl has expected columns according to config. Returns check_error and triggers early return of caller fn.
check_tbl_col_types(): checks that tbl column types match schema expected from config. #12
check_tbl_values(): checks that combinations of values in each row are valid combinations expected from config. #10
check_tbl_rows_unique(): checks that combinations of values in each row are unique.
check_tbl_values_required(): checks that all required combinations of values are present. #17, #10
check_tbl_value_col(): checks that values in the value column conform to expectations of type, min and max (where applicable) according to config. #19
check_tbl_value_col_ascending(): for cdf & quantile output types, checks that values in the value column are non-descending for each unique task ID value combination. #19
check_tbl_value_col_sum1(): for pmf output type, checks that values in value column sum to 1 for each unique task ID value combination. #19
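As a rough sketch of how the pieces above might fit together, the base R example below runs a named list of checks in order, stops at the first check_error, and includes simplified versions of the non-descending and sum-to-1 value checks. All helper names (check_cnd(), check_ascending(), check_sum1(), run_checks()), the location grouping column and the 1e-8 tolerance are assumptions for illustration, not the functions or settings added in this PR.

```r
# Hypothetical sketch in base R; none of these helpers are hubValidations code.

# Minimal condition constructor: success, failure (warning) or error.
check_cnd <- function(ok, what, error = FALSE) {
  subclass <- if (ok) "check_success" else if (error) "check_error" else "check_failure"
  structure(
    list(message = paste(what, if (ok) "passed." else "failed."), call = NULL),
    class = c(subclass, "hub_check", "condition")
  )
}

# Values must be non-descending within each group.
# Assumes rows are already ordered by output type ID within each group.
check_ascending <- function(tbl, group_col) {
  ok <- all(tapply(tbl$value, tbl[[group_col]], function(v) !is.unsorted(v)))
  check_cnd(ok, "Non-descending value check")
}

# pmf values must sum to 1 within each group, up to a small tolerance.
check_sum1 <- function(tbl, group_col) {
  sums <- tapply(tbl$value, tbl[[group_col]], sum)
  check_cnd(all(abs(sums - 1) < 1e-8), "Sum-to-1 value check")
}

# Run checks in order; return early as soon as one yields a check_error.
run_checks <- function(checks) {
  results <- list()
  for (name in names(checks)) {
    results[[name]] <- checks[[name]]()
    if (inherits(results[[name]], "check_error")) break
  }
  structure(results, class = c("hub_validations", "list"))
}

tbl <- data.frame(
  location = c("US", "US", "US", "CA", "CA", "CA"),
  value    = c(0.2, 0.3, 0.5, 0.1, 0.4, 0.5)
)

# All three checks run because none returns a check_error.
res <- run_checks(list(
  file_read = function() check_cnd(TRUE, "File read check", error = TRUE),
  ascending = function() check_ascending(tbl, "location"),
  sum1      = function() check_sum1(tbl, "location")
))

# The first check errors, so the remaining check is skipped.
stopped <- run_checks(list(
  file_read = function() check_cnd(FALSE, "File read check", error = TRUE),
  sum1      = function() check_sum1(tbl, "location")
))
names(stopped)  # "file_read"
```

Wrapping each check in a function delays its evaluation, so checks that depend on an earlier one (for example, reading the file) are never run once that earlier check has errored.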
Still to do:
[x] validate_model_metadata(): high level function running a number of unit checks on model metadata file. (@elray1) #7
[ ] Framework for running custom/optional validation functions/functions that require additional arguments.
[ ] validate_submission_file(): Top level function that wraps validate_model_file(), validate_model_data() and validate_model_metadata().
[ ] validate_pr(): high-level function for running validate_submission_file() on a file submitted through a PR. Functionality to be largely based on https://github.com/covid19-forecast-hub-europe/HubValidations/blob/main/R/validate_pr.R