Closed afharo closed 5 months ago
Thanks, sorry for the delay.
This change seems to break a construct that I'm using in a lot of places. When I want to allow an object to contain additional properties, but have Joi strip them out, I have:
const Joi = require('joi');
const schema = Joi.object({ foo: Joi.string()} )
.unknown(false)
.options({ stripUnknown:true });
console.log(schema.validate({foo: 'abc', bar: 'def'}));
With this change the validation fails:
{
value: { foo: 'abc', bar: 'def' },
error: [Error [ValidationError]: "bar" is not allowed] {
_original: { foo: 'abc', bar: 'def' },
details: [ [Object] ]
}
}
... whereas before it succeeded and stripped out bar
from the value
:
{ value: { foo: 'abc' } }
I guess the reason why I'm using that construct is that I've liberally added .unknown(false).options({ stripUnknown: true })
to the top-level Joi.object
schemas, because I've run into the problem that this is fixing 😅
I'll adapt my code, but this change probably shouldn't have gone out in a patch.
I feared this would happen, but your schema is a bit self-contradictory, explicitly denying unknowns with the expectation that it would be stripped is in my opinion a bug, as local overrides should always have priority over options. I need to think about it a bit more before reverting (or not), but it really looks like you just got lucky that this bug was in your favor.
Yeah, I agree. I think it ended up like that because (a few years back) I found that it was the only way to get Joi.object
to work the way I needed it to :sweat_smile:
I've adopted the new version in all our projects now, so all good from me.
Hi there
I think the stripUnknown is broken since this fix. I have gone through the commit and I think the tests are wrong.
I have added a few additional tests (see zip attached), they all fail now, but this is how I interprete the manual and parameters. It is also closer to behavior <= 17.13.1
kind regards
Alexander
Just sharing my 2 cents, but I'll leave it to the owners to keep me honest.
stripUnknown: true
option.However, I tend to disagree on 3 and 4. Those objects explicitly allow extra properties to go through. The global option shouldn't affect those explicitly set flags.
A use case I can think of:
A logger requires message
and a meta
object containing any additional context to the logs. Inside the meta
, meta.trace_id
is required to be an alphanum of 16 chars.
You want your schema to define message
as string, and meta
as an object with the property trace_id
and an explicit local .unknown()
to allow extra fields to go through the validation.
I hope it makes sense.
I agree with this, as I previously mentioned, local overrides should always have priority over options, and this is what happens here, but we can indeed add a root property to demonstrate that fully.
Hi all
I disagree, I do not understand the use of stripUnknown if it never strips unknown properties. Neither with unknown(true) or unknown(false). Am I missing something obvious here?
kind regards
Alexander
It does when you don't give specific instructions for that schema, basically the default state. There is currently no way to go back to the default state but I think a unknown(null)
is not super hard to implement if that's your need.
Hi, all.
I believe there is a misunderstanding going on here. And I am probably most confused of all. Would somebody try to set me right?
@afharo says:
However, I tend to disagree on 3 and 4. Those objects explicitly allow extra properties to go through. The global option shouldn't affect those explicitly set flags.
A use case I can think of: A logger requires
message
and ameta
object containing any additional context to the logs. Inside themeta
,meta.trace_id
is required to be an alphanum of 16 chars.You want your schema to define
message
as string, andmeta
as an object with the propertytrace_id
and an explicit local.unknown()
to allow extra fields to go through the validation.
I think we all agree that this is a good use case.
@Marsup ’s point that overrides should always have priority over options that have the same semantics is well taken too.
However, I believe .unknown()
and stripUnknown
have separate semantics, so that this does not apply.
The point is that
const schema2 = schema.unknown(false).options({ stripUnknown: true })
would be weird,
but
const schema3 = schema.unknown(true).options({ stripUnknown: true })
makes sense.
When doing the following, the schema is not changed. @Alexander-Muylaert merely asks the transformation Joi does to strip the unknown properties.
const { error, value } = schema.validate(obj, { stripUnknown: true });
A use case for this is as follows: suppose we are handling incoming JSON bodies in a service. Joi is used for validation and to sanitize the input (e.g., for security reasons). Joi's transformation functionality is used for the latter. schemaV1
has .unknown(true)
set, among other options, to allow for future schema evolution. A future version might use schemaV2
, which could include an additional property. A client might already provide that extra property. We can deploy the new client version before we start using the updated service version. While unknown properties are allowed during validation, we want to strip them during transformation and sanitization, recursively.
There are ample other examples where we might want to do this. E.g., suppose we want to write the JSON body in a DynamoDB table we want to keep clean (no unknown properties), but we do not want to flag unknown properties as errors.
From a security standpoint, we should validate first, and possibly transform after that.
That is not what Joi does in all cases. It wouldn't make sense for Joi.string().trim().options({ convert: true })
to not trim first. The same applies to any other use of convert
.
when
true
, attempts to cast values to the required types (e.g. a string to a number). Defaults totrue
.
any.cast(to)
seems to does its job after validation?any.default([value])
before?any.failover(value)
obviously after.any.strip([enabled])
makes it explicit (after).array.single([enabled])
beforeI would expect unknown properties to be stripped after validation, for example, using object.pattern(pattern, schema, [options])
. I believe that, in any case, for this (admittedly unusual) combination, that is the only meaningful option.
The same applies to .unknown(true)
. First validating, and letting pass unknown properties, and then stripping them, makes sense. A request to strip unknown properties after a validation failed because of .unknown(false)
makes no sense. Stripping unknown properties first makes it irrelevant to validate there existence or not. After stripping, .unknown(false)
will be true, and you can say .unknown(true)
, but it doesn’t matter.
.unknown([allow])
does not mention that the value is supposed to be a 3-state. It says
if
false
, unknown keys are not allowed, otherwise unknown keys are ignored.
Note that object.keys([schema])
says
If
schema
is{}
no keys allowed. Ifschema
isnull
orundefined
, any key allowed. Ifschema
is an object with keys, the keys are added to any previously defined keys (but narrows the selection if all keys previously allowed). Defaults to 'undefined' which allows any child key.
So whether unknown properties are allowed in schema
to start with, depends on how it was defined, but assuming it was defined with an object, unknown keys are not allowed if .unknown
was not invoked.
I read this as, for the validation part:
schema // unknown keys are not allowed
schema.unknown(false) // unknown keys are not allowed
schema.unknown(true) // unknown keys are ignored
schema.unknown() // unknown keys are ignored
(This is a bit weird to start with, because schema.unknown()
is, in JavaScript, equivalent to schema.unknown(undefined)
, which also is a falsy parameter value, but that is water under the bridge. I understand @Alexander-Muylaert ’s confusion).
What is completely opaque to me now is what the effect of the following is supposed to be:
schema.options({ stripUnknown: true }) // meaningless
schema.unknown(false).options({ stripUnknown: true }) // meaningless
schema.unknown(true).options({ stripUnknown: true }) // unknown keys are ignored, but are not stripped?
schema.unknown().options({ stripUnknown: true }) // unknown keys are ignored, but are not stripped?
Is this correct? Probably not, because then .options({ stripUnknown: true })
has no effect in any case.
Some comments above seem to mean that instead unknown keys are stripped in 1 case only, before validation?
schema.options({ stripUnknown: true }) // strip first, then unknown keys are not allowed, because that is the default without `.unknown()`
In that light, I understand @Alexander-Muylaert ’s confusion.
What I would expect, is indeed stripping after validation:
schema.options({ stripUnknown: true }) // meaningless
schema.unknown(false).options({ stripUnknown: true }) // meaningless
schema.unknown(true).options({ stripUnknown: true }) // unknown keys are ignored and stripped
schema.unknown().options({ stripUnknown: true }) // unknown keys are ignored and stripped
@afharo, in your use case, why would we ever activate stripUnknown
? Isn’t the better solution here to explicitly not do that, but just set meta.unknown(true)
?
IMO, there's a bit of confusion between allowUnknown
(and its local short form .unknown()
) and stripUnknown
. The way I understand them based on how it works:
allowUnknown
: allows the object to contain extra fields and returns them after sanitation/transformation.stripUnknown
: allows the object to contain extra fields, but removes them during sanitation.The point is that
const schema2 = schema.unknown(false).options({ stripUnknown: true })
would be weird,
but
const schema3 = schema.unknown(true).options({ stripUnknown: true })
makes sense.
Based on my understanding of both settings as explained above, IMO, both of these are weird because they are contradictory (it doesn't matter if they are in the form of schema.unknown(true|false).options({ stripUnknown: true|false })
or schema.options({ allowUnknown: true|false, stripUnknown: true|false })
, since they are local options vs. validation options.
In that sense,
Code | Description |
---|---|
schema.validate(input, { allowUnknown: true, stripUnknown: true }) |
It's contradictory and should never be used. Both allow extra fields to be provided, but one wants to keep them in the output while the other indicates removing them 🤷 |
schema.validate(input, { allowUnknown: false, stripUnknown: true }) |
Again, contradictory, and its usage is not encouraged: at the validation step, one claims that extra properties are not allowed while the other is fine with them. |
schema.validate(input, { allowUnknown: true, stripUnknown: false }) |
Same as above. |
schema.unknown(true).validate(input, { allowUnknown: true }) |
Setting the same option as local and global. Local override always wins. In this case: unknown properties are allowed and available in the output. |
schema.unknown(true).validate(input, { allowUnknown: false }) |
Same as above. |
schema.unknown(false).validate(input, { allowUnknown: true }) |
Setting the same option as local and global. Local override always wins. In this case: unknown properties will fail validation. |
schema.unknown(false).validate(input, { allowUnknown: false }) |
Same as above. |
schema.unknown(true).validate(input, { stripUnknown: true }) |
Setting different options as local and global. Local option wins, which is: unknown options are allowed and should be kept in the output. |
schema.unknown(true).validate(input, { stripUnknown: false }) |
Same as above. |
schema.unknown(false).validate(input, { stripUnknown: true }) |
Setting different options as local and global. Local option wins, which is: validation should fail if unknown fields are found. |
schema.unknown(false).validate(input, { stripUnknown: false }) |
Same as above. |
schema.options({ stripUnknown: true }).validate(input, { stripUnknown: true }) |
Setting the same option as local and global. Local override always wins. In this case: unknown properties are allowed and should be removed from the output. |
schema.options({ stripUnknown: true }).validate(input, { stripUnknown: false }) |
Same as above. |
schema.options({ stripUnknown: false }).validate(input, {stripUnknown: true }) |
Setting the same option as local and global. Local override always wins. In this case: unknown properties will fail validation. |
schema.options({ stripUnknown: false }).validate(input, {stripUnknown: false }) |
Same as above. |
schema.options({ stripUnknown: true }).validate(input, { allowUnknown: true }) |
Setting different options as local and global. Local option wins, which is: unknown properties are allowed and should be removed from the output. |
schema.options({ stripUnknown: true }).validate(input, { allowUnknown: false }) |
Same as above. |
schema.options({ stripUnknown: false }).validate(input, { allowUnknown: true }) |
Setting different options as local and global. Local option wins, which is: validation should fail if unknown fields are found. |
schema.options({ stripUnknown: false }).validate(input, { allowUnknown: false }) |
Same as above. |
Alternative for @afharo ’s use case
@afharo, in your use case, why would we ever activate
stripUnknown
? Isn’t the better solution here to explicitly not do that, but just setmeta.unknown(true)
?
Yes... sorry that I didn't provide an actual code. The way to implement the use case is:
const logEntry = {
message: "Error: Something went terribly wrong",
meta: {
trace_id: "12345",
extra_information: "https://www.youtube.com/watch?v=t3otBjVZzT0",
},
more_details: "...",
};
const loggerSchema = schema.object({
message: schema.string(),
meta: schema.object({
trace_id: schema.string(),
}).unknown(true),
});
const validLogEntry = loggerSchema.validate(logEntry, { stripUnknown: true });
// validLogEntry = {
// message: "Error: Something went terribly wrong",
// meta: {
// trace_id: "12345",
// extra_information: "https://www.youtube.com/watch?v=t3otBjVZzT0",
// },
// }
You could also implement it as loggerSchema.strip(true)
or loggerSchema.options({ stripUnknown: true })
instead of applying the validate option. The difference is that the validate options apply recursively, while the local options don't, so if your use case includes an error
property with selected properties only (to avoid leaking sensitive information), you'd have to implement it as:
const logEntry = {
message: "Error: Something went terribly wrong",
meta: {
trace_id: "12345",
extra_information: "https://www.youtube.com/watch?v=t3otBjVZzT0",
},
more_details: "...",
error: {
message: "Something went terribly wrong",
code: "ETERRIBLYWRONG",
request: {
authentication: "Basic .....",
},
},
};
const loggerSchema = schema.object({
message: schema.string(),
meta: schema.object({
trace_id: schema.string(),
}).unknown(true),
error: schema.object({
message: schema.string(),
code: schema.string(),
}),
});
const validLogEntry = loggerSchema.validate(logEntry, { stripUnknown: true });
// OR
const loggerSchema = schema.object({
message: schema.string(),
meta: schema.object({
trace_id: schema.string(),
}).unknown(true),
error: schema.object({
message: schema.string(),
code: schema.string(),
}).strip(true),
}).strip(true);
const validLogEntry = loggerSchema.validate(logEntry);
// validLogEntry = {
// message: "Error: Something went terribly wrong",
// meta: {
// trace_id: "12345",
// extra_information: "https://www.youtube.com/watch?v=t3otBjVZzT0",
// },
// error: {
// message: "Something went terribly wrong",
// code: "ETERRIBLYWRONG",
// },
// }
Again... it depends on your use case and code preferences/requirements to use it as a local option or global validate option. The global option avoids code repetition while the local ones are more explicit. In either case, you don't want stripUnknown
to remove the properties from the fields that explicitly allow them in their local options.
When explicitly specifying
.unknown(false)
, running validations withstripUnknown: true
should respect the local override.