Closed sbliven closed 3 weeks ago
What does everyone think of the use of JSONPath?
Pro: Simple syntax for simple reference, while keeping enough power for more complex structures like arrays
Pro: Well-known standard (eg used by jq
command)
Con: Adds a dependency
Con: Only checks element presence, not type
A more powerful option would be to validate against a JSON Schema. This would allow type checking. However I think it would be much more verbose.
I think the JSONpath solution is appropriate for now.
However, there is a minor issue in the jobs.controller: performJobCreateAction
now accesses only the jobInstance
of JobClass
as an argument (this also needed to call the other function that performs the actions). performJobCreateAction
should take two arguments, the second being of JobDto
type so that we can call the validate
function. Now in the controller validate
is called on JobClass
instance, which is not how validate
is defined.
Good catch, @sofyalaski. Why does this even compile?
Ah, it compiles because JobClass is structurally compatible with CreateJobDto. Working on a fix now.
This implementation has a couple problems.
First, the JSONPath solution doesn't perform any type checking. It would be nice to do something like
{
"actionType": "validate",
"required": {
"jobParams.datasetIds": "object",
"jobParams.datasetIds[*]": "string"
}
}
This looks fine for simple types (string, number, list of literal values) but could quickly grow in complexity until it reaches a full type system. One solution would be to supply a JSON Schema to validate the DTO (which provides full type support). However including JSON schemas in the config file seems overly complex for operators. Is there a happy medium?
Second, the validate method currently acts against the DTO body. Half of the checks in v3 are actually against database entities referenced in the DTO or Job itself. For instance, "archive" jobs should check that all datasets associated with the job have "datasetLifecycle.archivable". Or "publish" jobs with file listings should check that all files exist, requiring checking associated DataBlocks from the dataset. Should these be added to a generic ValidateAction or be a more specific check? We are leaning towards handling those as special cases in the jobs controller.
I changed how this works somewhat. Now the ValidateAction config looks like:
{
"actionType": "validate",
"request": {
"JSONPath": JSONSchema,
...
}
}
The schema gets validated by avj, so you theoretically could use the full power of a schema. However combining it with the path allows most use cases to be covered by short, easily readable one-liners. Examples:
{
"actionType": "validate",
"request": {
"jobParams.name": { "type": "string"}, // match simple types
"jobParams.answers[*]": { "enum": ["yes","no"]}, // literal values (here applied to an array)
"jobResultObject": {"$ref": "https://json.schemastore.org/schema-org-thing.json"}, // JSON Schema
}
}
I spelled the top-level request
to because it gets applied to the DTO (either the create or update DTO). If it needed to get applied somewhere else (eg to associated datasets) we can add that in the future.
Agreed to merge https://github.com/SciCatProject/scicat-backend-next/pull/1440, then remove checkDatasetState
from the controller and incorporate its logic into the validate
action, which will allow for configuring jobType - datasetLifecycle
combinations and deleting the hardcoded job-types.enum
. These changes will come in a new PR which will also incorporate this: https://github.com/SciCatProject/scicat-backend-next/issues/1120
Description
Add ValidateAction to check Job DTO formatting
Motivation
When creating a job clients can provide an untyped
jobParams
object in the DTO, or ajobResultObject
when updating. These were previously not validated.Fixes:
This PR adds a new
validate
job action which can test for the presence of particular attributes in the DTO. Currently the attributes are specified using a JSONPath and are only verified to be non-null. If needed, future implementations could add JSONSchema support to do deeper validation.For example, the following
jobConfig.json
section would configure thepublic
job to require at least one datasetId:Changes:
validate
actionTests included
Documentation