Open lance0805 opened 8 months ago
The schema states string
the value passed is clearly an integer. I don't think the validator is wrong here, it's the design of the API that is incorrect. The schema needs to be changed.
I agree with your point, especially from a strict interpretation.
The challenge I'm facing is providing metrics for a gateway, which includes logging request paths. To prevent metrics from becoming overwhelming due to numerous path variables, we convert actual paths to OpenAPI routes.
This means even incorrect request parameters, like in the example /user/123456
, are recorded in a standardized route format /user/{username}
.
The purpose is to ensure even erroneous requests are accurately logged, represented as request_xxx{path="/user/{username}", statusCode="4xx"}
in our metrics.
validating a string is pretty pointless when, from a library perspective, you do not know what it is supposed to contain, as a string can more or less contain anything. For example, if you want to be strict, you'd have to validate against everything that someone can imagine not to be a string. A boolean, a timestamp, someone inserting a json into that string, etc. Usages one cannot validate because the library does not know. This check gives the impression of strictness but what it does is just a basic validation against one if many types which is imo pointless because a string can contain anything that is serializable as a utf8 string. The interesting part is the other way around, an integer cannot be a string, as it has a well defined format.
I agree with your point, especially from a strict interpretation. The challenge I'm facing is providing metrics for a gateway, which includes logging request paths. To prevent metrics from becoming overwhelming due to numerous path variables, we convert actual paths to OpenAPI routes. This means even incorrect request parameters, like in the example
/user/123456
, are recorded in a standardized route format/user/{username}
. The purpose is to ensure even erroneous requests are accurately logged, represented asrequest_xxx{path="/user/{username}", statusCode="4xx"}
in our metrics.
I don't want to take away this strictness from the validator, in my opinion - it's doing the right thing. What you need is more flexibility to relax the rules right?
validating a string is pretty pointless when, from a library perspective, you do not know what it is supposed to contain, as a string can more or less contain anything. For example, if you want to be strict, you'd have to validate against everything that someone can imagine not to be a string. A boolean, a timestamp, someone inserting a json into that string, etc. Usages one cannot validate because the library does not know. This check gives the impression of strictness but what it does is just a basic validation against one if many types which is imo pointless because a string can contain anything that is serializable as a utf8 string. The interesting part is the other way around, an integer cannot be a string, as it has a well defined format.
I don't agree. The job of the validator is to check if what you are doing
is what you said you would do
. It does not care about context.
If the schema states 'string' and you pass in anything other than a number
, bool
, array
or an object
.. you're golden. If the string is of a specific type, then use format
which will give another level of validation.
1234
and 123.4
are not strings, they are numbers. They will always be numbers.
There is always the option to tell the validator to 'pretend they are not numbers for a second', but saying this is 'pointless' isn't accurate at all. The whole purpose of this code is to validate as best it can.
And, remember folks - if you disagree with my opinions :)
my point is: validading that it is not a number does not imply that it is a string, as it could be a boolean or anything else that you do not know.
To me, this makes perfect logical sense.
// check if the param is a number or not
schema := params[p].Schema.Schema()
for t := range schema.Type {
switch schema.Type[t] {
case helpers.String, helpers.Object, helpers.Array:
// should not be a number.
if _, err := strconv.ParseFloat(s, 64); err == nil {
s = helpers.FailSegment
}
case helpers.Number, helpers.Integer:
// should not be a string.
if _, err := strconv.ParseFloat(s, 64); err != nil {
s = helpers.FailSegment
}
// TODO: check for encoded objects and arrays (yikes)
}
}
}
If the schema is defined as a string, an object or an array.. check if it's numeric, if it is.. its no good.
If the schema is defined as numeric, check the input is numeric, if it's not, it is no good.
Why is this pointless?
if you only look at those two cases and if there are and always will be only these two cases: case 1: string, object, array case 2: number, integer
To some degree it makes sense that the (positive) validation logic of the numbers-case is inversed (negative) in the other case.
Imo, it would be better if the first case was also a "positive" validation, meaning it is a string, it is an object, it is an array contrary to it not a number and not whatever else may be added in the future.
That would lead to the positive check: "It is a valid utf8 string" which may contain anything that is valid utf8. I'm only speaking about the string case (ignoring object and array). Inherently a string can contain anything like uuids, boolean values or mathematically precise numbers (interesting in the financial sector).
Maybe my assumption is incorrect what this code is exactly targeting, is it the format of a string type or is it the actual string type itself?
I agree with your point, especially from a strict interpretation. The challenge I'm facing is providing metrics for a gateway, which includes logging request paths. To prevent metrics from becoming overwhelming due to numerous path variables, we convert actual paths to OpenAPI routes. This means even incorrect request parameters, like in the example
/user/123456
, are recorded in a standardized route format/user/{username}
. The purpose is to ensure even erroneous requests are accurately logged, represented asrequest_xxx{path="/user/{username}", statusCode="4xx"}
in our metrics.I don't want to take away this strictness from the validator, in my opinion - it's doing the right thing. What you need is more flexibility to relax the rules right?
Yes, having a function that does not require validation, or one that allows for custom validation logic, would better suit my use case.
For example, in this schema, the username is defined as a string type. If called in this way, it will lead to the inability to find the route:
https://github.com/pb33f/libopenapi-validator/blob/657229e4dc4f54ea6d298dec56fec8b500baa3e7/test_specs/petstorev3.json#L818-L836
is it possible to remove the string validation here since numbers can be safely used as strings?
https://github.com/pb33f/libopenapi-validator/blob/657229e4dc4f54ea6d298dec56fec8b500baa3e7/paths/paths.go#L284-L288