Open IAmTomahawkx opened 1 year ago
Thanks for opening this issue.
Can you explain your use case here? What do you hope to do with this information? What are the types of expected_type/received_type/location here? Strings? Something else?
Note that not all ValidationError messages have an expected_type/received_type, but all of them do have a location.
@IAmTomahawkx any reply on the above?
I did realize after making the issue that not all ValidationErrors have the same messages. Personally, my intended use case would be to return a structured response from an API endpoint instead of an error string.
I suppose this could in theory be solved by making a subclass of ValidationError for type errors (TypeValidationError?), with added attributes. This should keep compatibility with existing codebases using except ValidationError: while allowing for more fine-grained error handling.
As for the types, personally I don't care (as I'd be stringifying them to send out as JSON anyway), but I suppose it would make most sense as actual types instead of strings. How feasible that is, I'm not familiar with the codebase, so that's not my call.
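A rough sketch of what that subclass idea could look like. Everything here is hypothetical: the ValidationError class below is a local stand-in so the snippet runs without msgspec installed, and TypeValidationError with its expected/received/location attributes is only the shape floated above, not an existing msgspec API.

```python
class ValidationError(Exception):
    """Local stand-in for msgspec.ValidationError (assumption, for illustration)."""


class TypeValidationError(ValidationError):
    """Hypothetical subclass carrying structured details about a type mismatch."""

    def __init__(self, expected: str, received: str, location: str) -> None:
        self.expected = expected
        self.received = received
        self.location = location
        # Keep the human-readable message in the same shape as today's errors.
        super().__init__(f"Expected `{expected}`, got `{received}` - at `{location}`")


def describe(exc: ValidationError) -> dict:
    """Return structured details when available, else fall back to the message."""
    if isinstance(exc, TypeValidationError):
        return {
            "expected": exc.expected,
            "received": exc.received,
            "location": exc.location,
        }
    return {"message": str(exc)}


try:
    raise TypeValidationError("str", "bool", "$.files[0].content")
except ValidationError as exc:  # existing `except ValidationError:` handlers still match
    details = describe(exc)
```

Because the subclass still derives from the base error, code that only catches ValidationError would keep working unchanged.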
Personally my intended usecase would be to return a structured response from an API endpoint instead of an error string.
Can you comment with an example of the kind of structured message you'd like to be able to send?
The idea was to match our other validation responses, e.g.:
{
"errType": "InvalidType",
"errZone": "body", // "query", "path", etc
"errLoc": "$.files[0].content",
"received": "integer",
"expected": "string"
}
Whereas currently we simply have a separate type for body errors:
{
"errType": "InvalidBodyType",
"errMsg": "Expected `str`, got `bool` - at `$.files[0].content`"
}
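For illustration, here is a minimal sketch of building the first response shape once the three pieces are available as separate values. The function name build_error_body and the JSON_TYPE_NAMES mapping are assumptions made up for this example; only the response keys come from the payload shown above.

```python
import json

# Assumed mapping from type names as they appear in error messages to the
# JSON-style names used in the response; unknown names pass through unchanged.
JSON_TYPE_NAMES = {
    "str": "string",
    "int": "integer",
    "float": "number",
    "bool": "boolean",
}


def build_error_body(expected: str, received: str, location: str, zone: str = "body") -> dict:
    """Assemble the structured error response from already-separated fields."""
    return {
        "errType": "InvalidType",
        "errZone": zone,
        "errLoc": location,
        "received": JSON_TYPE_NAMES.get(received, received),
        "expected": JSON_TYPE_NAMES.get(expected, expected),
    }


body = build_error_body("str", "int", "$.files[0].content")
payload = json.dumps(body)  # ready to send as the response body
```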
Ok. I think we can support this use case.
To do this we'd only include the stringified received/expected components, since it's hard to backtrack out the actual python type instances from our own internal representation.
Are the current string versions of expected/received/path sufficient for your use case?
For errors that don't have an expected/received type included in the message, we could do one of 3 things:
- [...]
- [...] None when no expected/received type is included in the message
- [...] expected/received are generally of one of two forms:
  - [...] expected and received would be equal)
  - [...] str value, and some parsing issue occurred (in this case expected would be the expected type, and received would be str).
I don't have a strong preference towards any of these options in particular.
Thoughts?
I think the first option would be the best, as it allows the most flexibility for changes to other errors (although location may be able to go on the base ValidationError). It's also quite easy to identify the exact error being presented to you, as opposed to checking the values to determine what's going on. As for the stringified stuff, seems fine to me!
Sounds good to me! Thanks for the helpful conversation here, I think we've arrived at a good solution. Should hopefully be able to get this resolved before the next release.
Hi! Just pitching in to describe my use case since this issue is exactly the one I was looking for :)
I have 2 use cases for structured errors:
As an example, I use a tagged union to deserialize messages and I want to handle the error differently based on whether the tag field is invalid ("unhandled message type") or any other fields have errors ("invalid payload").
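A small sketch of that branching, assuming the error exposes its location as a JSONPath-like string such as "$.type". The classify_error helper and the default tag field name "type" are hypothetical and only meant to show the idea.

```python
def classify_error(location: str, tag_field: str = "type") -> str:
    """Classify a validation failure by the path at which it occurred.

    `location` is assumed to look like "$.type" or "$.payload.id".
    """
    if location == f"$.{tag_field}":
        return "unhandled message type"
    return "invalid payload"


kind = classify_error("$.type")
```

With a structured location attribute this becomes a plain string comparison instead of parsing the error message.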
The solution you have been discussing here works well since all we need is some way to identify the field. A couple of related wishes for your consideration:
if the message was successfully decoded but failed to validate, it would be neat to have access to the decoded message when processing the exception
And since this is my first time interacting with this project: thank you for your work! It's a fantastic library you have here and it looks like it will save me a few headaches :)
This would also allow Litestar to customize our client errors with less work (i.e., not having to rely on regex to parse the error message).
This wouldn't require any more than you have already detailed here.
I think the above discussion covers the issues I've presented in #468.
Also, I think something similar to pydantic's ErrorDetails class would be great. This has an attribute type which specifies whether the field is missing or invalid etc. Having something like that in the exception would be great as well. That is, details regarding whether the validation failed because the field is missing, the field has an incorrect type, or the field has a valid type but the validation logic fails.
One thing I wanted to add, which I only realized now: I think the way that msgspec does validation, it'll stop immediately when the first invalid input is reached, correct? Or is my understanding incorrect? Would it be possible to keep going to be able to catch any other validation errors? This would help when returning an API response with a list of all the fixes to make instead of the user having to make the API request each time and fix errors one by one.
Hi @guacs - I've previously asked a similar question for which @jcrist went to some effort to explain his rationale for the "fail fast" behavior - I'll quote it here as it deserves visibility (ref):
No, this is not something I intend on supporting. For prior art, most typed JSON serialization libraries (across language ecosystems) don't support raising multiple errors. Neither golang's json nor rust's serde support this, and people generally love those tools.
When parsing JSON into a specified object type, there are multiple ways an error can occur:
- The JSON is invalid
- The JSON is valid, but the JSON type is incorrect (expected a number, got an array)
- The JSON type is correct, but the value is invalid (e.g. unknown Enum value, int out of range, invalid datetime str etc...)
- A JSON object is missing a field (or in certain situations has an extra field)
When parsing and validating, a failure in one location could be one error, or several. There's no clearly defined behavior for what should be expected. pydantic might group a few errors together and say "these are the things that are invalid". cattrs might find a different set of errors. But there's no clear definition of what "all the errors" means. Say for example you're parsing the value "2022-03-" into a type datetime | date | None. This string is not a valid datetime or date, and it isn't None. Is that one, two, or three errors?
Further, what if you receive a list of 1000 strings as an input for a field, but expected a list of ints? Is that 1000 errors (one for each item), or one for the list?
Maintaining a running structure of errors like this can get expensive, and make the failure mode in parsing significantly more expensive than the success mode. I'm skeptical of its general utility, and think if providing errors for multiple fields is important, you'd be better served by doing that validation yourself after parsing, so you can ensure the kinds of user-facing errors you wish to provide are uniformly handled, rather than whatever errors the parsing framework happens to collect.
Raising the first error found is easier to make consistent, easier for a user to build intuition around (you can look at a given message and a given type and know for sure what error will be raised), and much much more performant in the case of failures.
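As a rough illustration of the "do that validation yourself after parsing" suggestion, here is a sketch that collects several field errors in one pass. It uses the stdlib json module as a stand-in for the decoding step, and the FieldError dataclass, the validate_user function, and the name/age fields are all made up for the example. It also assumes the decoded value is a JSON object.

```python
import json
from dataclasses import dataclass


@dataclass
class FieldError:
    location: str
    message: str


def validate_user(data: dict) -> list[FieldError]:
    """Run every check and collect all failures instead of stopping at the first."""
    errors = []
    name = data.get("name")
    if not isinstance(name, str) or not name:
        errors.append(FieldError("$.name", "must be a non-empty string"))
    age = data.get("age")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if not isinstance(age, int) or isinstance(age, bool) or age < 0:
        errors.append(FieldError("$.age", "must be a non-negative integer"))
    return errors


raw = '{"name": "", "age": -3}'
errors = validate_user(json.loads(raw))
```

Here the application decides what counts as one error, which is the point made above about keeping user-facing errors uniform.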
@peterschutt thank you for pointing this out. If this is something that's not being planned to be supported, I'm completely fine with that actually. Even though there are a few points I'm not sure I agree with completely, I get the rationale and think it's reasonable.
Sounds good to me! Thanks for the helpful conversation here, I think we've arrived at a good solution. Should hopefully be able to get this resolved before the next release.
@jcrist do you have a timeline in mind for this nowadays?
It would be great if we could get the actual values of validation errors passed to ValidationError as attributes. Currently the best way to do this is to pass a regex over the error string ("Expected `str`, got `bool` - at `$.files[0].content`") to get the 3 values, expected_type, received_type, and location. From my uneducated view, it shouldn't be too hard to pass those through, as they're already available when the error string is assembled.
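For reference, the regex workaround looks roughly like the sketch below. The pattern is written against the message format quoted above and is an assumption about that format, not something msgspec guarantees; the names TYPE_ERROR_RE and parse_type_error are made up here. The location part is treated as optional to be defensive.

```python
import re

# Matches messages shaped like: Expected `str`, got `bool` - at `$.files[0].content`
TYPE_ERROR_RE = re.compile(
    r"^Expected `(?P<expected>[^`]+)`, got `(?P<received>[^`]+)`"
    r"(?: - at `(?P<location>[^`]+)`)?$"
)


def parse_type_error(message: str):
    """Return the three fields as a dict, or None if the message has another shape."""
    match = TYPE_ERROR_RE.match(message)
    if match is None:
        return None
    return match.groupdict()


parsed = parse_type_error("Expected `str`, got `bool` - at `$.files[0].content`")
```

This breaks as soon as the message wording changes, which is why having the values as attributes would be much nicer.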