Closed — anthonytw closed this issue 4 years ago
AFAIU this is not about asking for features in this library but about implementing the standard this library is based on. Also, AFAICS there is already support for "default" values, even in draft 7:
{
    "title": "Feature list",
    "type": "array",
    "items": [
        {
            "title": "Feature A",
            "properties": {
                "enabled": {
                    "$ref": "#/$defs/enabledToggle",
                    "default": true
                }
            }
        },
        {
            "title": "Feature B",
            "properties": {
                "enabled": {
                    "description": "If set to null, Feature B inherits the enabled value from Feature A",
                    "$ref": "#/$defs/enabledToggle"
                }
            }
        }
    ],
    "$defs": {
        "enabledToggle": {
            "title": "Enabled",
            "description": "Whether the feature is enabled (true), disabled (false), or under automatic control (null)",
            "type": ["boolean", "null"],
            "default": null
        }
    }
}
Search for "default" in https://tools.ietf.org/html/draft-handrews-json-schema-02#section-7.7.1.
BTW no idea if this library implements it (it should according to the spec).
Yes, the default field exists in the standard. It is considered an annotation and implies no special behavior for validation.
Creating an instance based on the default values in the schema could be interesting, but it might be out of scope for the validator class.
IMHO default values should imply inserting them into the given JSON object. Otherwise the app has to double-check whether a field exists before using it, which somewhat defeats the purpose of the schema validator by forcing manual checks as if there were no validator.
The validator is not supposed to change the data instance being validated, so it could read the default value from the schema, but then what?
Again, what could be done is to create a new instance based on the schema's default values, if present, but this would be a new feature to be developed, independent of the validation process.
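Such a "new instance from defaults" could be sketched as follows. This is hypothetical Python, not this C++ library's API; the name instance_from_defaults and the MISSING sentinel are invented for illustration, and only "default", "properties", and same-document "$ref"s are handled:

```python
MISSING = object()  # sentinel: "this subschema contributes no value"

def instance_from_defaults(schema, root=None):
    """Recursively create an instance from a schema's "default" values.

    Sketch only: a real implementation needs full $ref resolution,
    array handling, and more.
    """
    root = root if root is not None else schema

    if "$ref" in schema:
        # Resolve a local pointer like "#/$defs/enabledToggle" and let
        # sibling keywords (e.g. an overriding "default") win.
        target = root
        for part in schema["$ref"].lstrip("#/").split("/"):
            target = target[part]
        merged = {**target, **{k: v for k, v in schema.items() if k != "$ref"}}
        return instance_from_defaults(merged, root)

    if "default" in schema:
        return schema["default"]

    if "properties" in schema:
        out = {}
        for name, subschema in schema["properties"].items():
            value = instance_from_defaults(subschema, root)
            if value is not MISSING:
                out[name] = value
        return out

    return MISSING
```

Run against the "Feature A" subschema above (with its "$defs"), this would yield {"enabled": True}, because the sibling "default": true overrides the null default in the referenced enabledToggle.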
Philosophically, it's nice to separate validation from the separate process of filling in an incomplete but valid instance. Practically speaking, though, it makes much more pragmatic sense to fill in the default values as the empty spots are encountered, because otherwise you either have to rewrite the schema walk-through yourself or write additional code to check the optional fields and fill them in with the default values.
Neither case is ideal: I don't want to rewrite the schema walk-through (that's why I'm using this library), and I don't want to write additional code to check the optional fields, because then I not only have to manage the default fields in two locations, I also have to keep the default values synchronized between those two locations.
It would be useful to at least offer some option like an "empty field with default" callback so I can make the change myself, if desired, but ideally there would just be an option (or a separate function call) to fill in the defaults. One could describe this in a way that is still within the philosophical bounds of a validator: an instance is not valid unless all required fields and all optional fields with defaults are filled in. The question then becomes not "is this instance valid" but rather "can a valid instance be generated from this potentially incomplete instance" (meaning one with the defaulted optional fields filled in), in which case it is natural to expect the filled-in instance as a return value (indicating what that valid instance is).
But forget philosophy; practically speaking, the library is much less useful if I still have to write additional code to check the optional fields manually when it could so naturally be handled by the validator. It's still useful, don't get me wrong, but much less so for very large schemas with potentially many (and changing) default fields, especially when multiple schemas can be used with the same application (i.e., not built in, so I would have to write code to re-parse the schema).
Just some suggestions! I would be happy with the callback idea, but I can't think of anything anyone would do with the callback other than fill in the default, so why over-complicate things?
I see your points.
Did you already implement some use cases?
When should a default value be taken into consideration? When a field is missing but required, or even when it is not required but a default value is given in the schema? Should it be replaced when it is present but not valid? Other conditions?
Wouldn't it be enough to create a method in the validator that returns a JSON instance containing all default fields (recursively filled) and then let the user "merge" it?
Or a two-step approach: before calling validate(), a new call populate_with_default() is made, where the validator would only insert missing fields with their defaults?
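A merge of that kind could be sketched like so (hypothetical Python; merge_defaults is an invented name, and `defaults` is assumed to be a plain JSON object of default values already extracted from the schema):

```python
def merge_defaults(instance, defaults):
    """Recursively insert default values for fields missing from `instance`.

    Existing fields in `instance` always win. Sketch only: objects are
    merged; arrays and scalars are left untouched.
    """
    if not isinstance(instance, dict) or not isinstance(defaults, dict):
        return instance
    merged = dict(instance)
    for key, value in defaults.items():
        if key not in merged:
            merged[key] = value          # fill a missing field
        else:
            merged[key] = merge_defaults(merged[key], value)  # recurse
    return merged
```

For example, merging {"a": {"x": 1}} with defaults {"a": {"x": 0, "y": 2}, "b": 3} gives {"a": {"x": 1, "y": 2}, "b": 3}: the user's "x" survives, the missing "y" and "b" are filled in.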
My use cases are always handled by the second (quicker) method I mentioned: I validate the instance, then I generally construct some objects from its contents, manually handling default values. So this is not general at all.
As I would interpret it, any field with a default value should have that value substituted when it's missing in the instance. Required + default, in my mind, isn't meaningful in this context.
To me that's more meaningful than having to mark optional fields as required to have them filled in. However, you could supply three operational modes:
I'm not sure why anyone would ever want 2 though.
I'm personally against the "let the user merge" option. The validator is already recursing through the object, so why rewrite that for a merge? I might as well recurse through the schema and fill in default values directly myself, in which case I'm back to square one.
If it were me, I would offer two functions:
- validate, which takes in a const instance and throws a validation error like it does now (or returns an error list, which would be nice, but that's the subject of a different feature request)
- validate_and_fill, which validates and fills in defaults along the way
You could have a separate fill function instead, but it's probably easier to code if you fill while you validate rather than abstracting the code to either fill or validate. Not sure, I haven't looked at the code.
Thanks for the consideration!
Do you have a concrete use-case as an example for these 4 options?
Two of the cases (without defaults) are already covered by what the library currently does (the instance isn't changed). The use case for the other two (with defaults) is what we've discussed here: fill in missing values with the default. This is exactly what anyone would realistically want to do (currently manually) if they were specifying defaults in the schema. Otherwise, why specify a default?
What I meant with concrete examples is real world examples.
My proposal:
The validator [or maybe another instance] may create a JSON Patch that describes the operations needed to fill up the original JSON with default values. It could be the return value of the validate call.
Then, the user is free to apply that patch or simply inspect it.
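Such a patch could be built as follows (hypothetical Python; default_patch is an invented name, and `defaults` is assumed to be a JSON object of default values derived from the schema). The result is a list of RFC 6902 "add" operations:

```python
def default_patch(instance, defaults, path=""):
    """Build an RFC 6902 JSON Patch that adds missing default values.

    The caller may apply the patch or merely inspect it. Sketch only:
    handles objects, not arrays.
    """
    ops = []
    if isinstance(instance, dict) and isinstance(defaults, dict):
        for key, value in defaults.items():
            # RFC 6901 pointer escaping: "~" first, then "/".
            pointer = f"{path}/{key.replace('~', '~0').replace('/', '~1')}"
            if key not in instance:
                ops.append({"op": "add", "path": pointer, "value": value})
            else:
                ops += default_patch(instance[key], value, pointer)
    return ops
```

For the address schema discussed below, an instance lacking "country" would yield [{"op": "add", "path": "/country", "value": "Canada"}], which the user can apply or ignore.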
In general, my use case for the default values would be that the JSON holds a configuration, and the possibility to get the default values would be an easy way to create a config from scratch.
One more idea:
Adding a kind of default generator that creates a JSON from scratch using the schema with its defaults. After having a default JSON, the user could diff it against a different JSON (in my case, the config). The resulting patch may contain add operations; those add operations are the ones that fill up the JSON (my config) with default values.
Doing it this way would not require changing the interface of validate and only needs the default generator.
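Applying only the "add" operations of such a diff could look like this (hypothetical Python; apply_add_ops is an invented name, and only object paths are handled, not array indices):

```python
import copy

def apply_add_ops(doc, patch):
    """Apply the "add" operations of an RFC 6902 JSON Patch to `doc`.

    Non-"add" operations are skipped, since only fill-ups are wanted;
    existing values are never overwritten. Sketch for object paths only.
    """
    doc = copy.deepcopy(doc)
    for op in patch:
        if op.get("op") != "add":
            continue
        # Unescape RFC 6901 pointer tokens: "~1" -> "/", "~0" -> "~".
        parts = [p.replace("~1", "/").replace("~0", "~")
                 for p in op["path"].lstrip("/").split("/")]
        target = doc
        for part in parts[:-1]:
            target = target.setdefault(part, {})
        target.setdefault(parts[-1], op["value"])
    return doc
```

So the workflow would be: generate the all-defaults JSON from the schema, diff it against the config, and feed the resulting add operations through something like apply_add_ops to obtain the filled-up config.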
Take a look at:
{
    "type": "object",
    "properties": {
        "street_address": {
            "type": "string"
        },
        "country": {
            "enum": ["United States of America", "Canada"],
            "default": "Canada"
        }
    },
    "if": {
        "properties": { "country": { "const": "United States of America" } }
    },
    "then": {
        "properties": { "postal_code": { "pattern": "[0-9]{5}(-[0-9]{4})?" } }
    },
    "else": {
        "properties": { "postal_code": { "pattern": "[A-Z][0-9][A-Z] [0-9][A-Z][0-9]" } }
    }
}
In this schema, the default value has a direct impact on the validation of postal codes. I think, for that reason, the validator should take the default value into account.
Even if it would make sense to me, it is not specified that way.
For the validator, country actually defaults to null.
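The effect can be seen by evaluating the if/then/else by hand. Note a JSON Schema subtlety: when "country" is absent, the "if" subschema is vacuously satisfied ("properties" only constrains members that are present), so the "then" branch (the US pattern) applies. The sketch below hand-codes that logic in Python (valid_address is an invented name, not a real validator), using re.search because "pattern" is unanchored per the spec:

```python
import re

def valid_address(instance):
    """Evaluate the if/then/else schema above by hand (sketch only)."""
    # Absent "country" vacuously satisfies the "if", i.e. it behaves
    # like the United States branch, NOT like the default "Canada".
    if instance.get("country", "United States of America") == "United States of America":
        pattern = r"[0-9]{5}(-[0-9]{4})?"          # "then": US ZIP code
    else:
        pattern = r"[A-Z][0-9][A-Z] [0-9][A-Z][0-9]"  # "else": Canadian code
    code = instance.get("postal_code")
    return code is None or re.search(pattern, code) is not None
```

So {"postal_code": "12345"} validates (no country, then-branch), while the same instance with the default "Canada" actually filled in would be invalid — which is exactly why applying (or not applying) defaults changes the validation outcome.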
I expected this issue to be closed by #89 :thinking:
To have GitHub close an issue automatically, I think the commit's subject needs to contain "Fix #25". As I squashed all of your commits, this info was lost.
Let's close it manually then.
It would be really neat if, in the process of validation, the validator could (optionally) fill in missing values with their defaults if provided in the schema. The logic might look something like this:
This logic validates the element when it is entered or when it's replaced by a default value (unless it's either required or there is no default), which lets you validate your schema's default values as well.
(Alternatively, default-value validation could take place when the schema is added to the validator object, which is probably a good idea, but more work than using the above logic as a stop-gap that lets library users validate their default values by writing a test that simply omits them and expects the validation to pass.)
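The fill-while-validating idea described above could be sketched like this (hypothetical Python, not this library's C++ API; validate_and_fill and check_type are invented names, and check_type stands in for full validation):

```python
def validate_and_fill(schema, instance):
    """Descend schema and instance together, inserting defaults for
    missing fields and failing on missing required fields without one.

    Sketch only: real validation covers far more keywords.
    """
    required = set(schema.get("required", []))
    for name, subschema in schema.get("properties", {}).items():
        if name not in instance:
            if "default" in subschema:
                # The inserted default is validated below too, so schema
                # authors get their default values checked "for free".
                instance[name] = subschema["default"]
            elif name in required:
                raise ValueError(f"missing required property: {name}")
            else:
                continue  # absent optional field without a default: fine
        check_type(subschema, instance[name])
        if isinstance(instance[name], dict):
            validate_and_fill(subschema, instance[name])  # recurse
    return instance

def check_type(subschema, value):
    """Minimal "type" keyword check standing in for full validation."""
    types = subschema.get("type")
    if types is None:
        return
    if isinstance(types, str):
        types = [types]
    mapping = {"object": dict, "array": list, "string": str,
               "boolean": bool, "null": type(None),
               "number": (int, float), "integer": int}
    if not any(isinstance(value, mapping[t]) for t in types if t in mapping):
        raise ValueError(f"type mismatch: {value!r} is not {types}")
```

With a schema declaring an integer "a" defaulting to 1 and a required string "b", calling validate_and_fill on {"b": "x"} would return {"a": 1, "b": "x"}, while {} would raise for the missing required "b".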