Closed samuelcolvin closed 2 months ago
Correct.
I think you might be right about strict mode on this context, even with something as apparently obvious as "strict mode" there are edge cases where we do do some coersion.
Options are:
strict=True
- I guess this is my preference (except for the below problem)But like I said, I'm not really a consumer of "strict mode"
Well, no one is right now as it's not released, hence why observations like this are so helpful.
I think your argument about type checking is a valid one.
However there's one difficult edge case - strict mode, as well as being configurable by users, is also used on the first pass for Unions, I suspect that's why I have it the way I do right now.
Example, let's say you have something like this:
class ParentType(BaseModel):
x: int
class ChildType(ParentType):
y: int
ParentOrChild = ParentType | ChildType
The union validator would do roughly the following:
ParentType
in strict mode? if so return itChildType
in strict mode? if so return itParentType
in non-strict mode? if so return itChildType
in non-strict mode? if so return itIn a scenario like this this, I can imagine ChildType(x=1, y=2)
being strictly valid as a ParentType
might loose information.
But I'm not sure about this, it's some time since I wrote this part, might need some trial and error to check.
Thanks for creating this issue.
I think if (raw) data is valid for both ParentType
and ChildType
, it's fine to choose how to parse ParentType | ChildType
just by order (i.e., with this ordering, never parse raw data as ChildType
lol); the main issue I have is if you were to have ParentType
, ChildType
, and SubChildType
, and from a python init call you pass an instance of SubChildType
to a field of type ParentType | ChildType
, ideally it would be treated as being of type ChildType
even if it could be parsed as ParentType
. ... I think...
In general, I think there's a strong argument to be made for strict mode allowing subclasses, if we can work through the other implications.
(I'm still in favor of doing validation/serialization as a function of the schema of the model being parsed/serialized though, rather than supporting inheritance-based overrides for fields; as far as I can tell we are in agreement about this.)
I think we're in agreement on both, just need to work out if it's going to cause ugly edge cases with union validation.
Marking this as closed for now - we've resolved most of these questions by designing the logic for smart
mode union validation, as outlined here: https://docs.pydantic.dev/latest/concepts/unions/#smart-mode
@samuelcolvin It seems to me that the approach you are describing (especially given point 2) is along the lines of what I'm thinking of as the "schema-based" approach (i.e., where a serializer is built for the
Outer
model at the time it is created, which I think is the right way to go). I am in favor of this approach, for the exact reasons you described.There are a few points I'd like to bring up in response to your last comment, though admittedly they are somewhat tangential; hopefully they don't derail this thread, and I'm happy to discuss elsewhere if more appropriate.
In the Rust code you referenced, I noticed that it appears that in strict mode you can't pass subclasses of types into the appropriate fields (since it checks for exact type equality).
__init__
signatures, etc..Inner
, you can't know that it isn't in fact returning aSubclassOfInner
. This would mean that using strict mode would be able to introduce (what are functionally) type errors not detectable by a type checker (bad), whereas non-strict mode would result in there being "correct" code that fails a type checker (less bad).Because validators would be re-executed when passing a subclass into a field, we may want to re-emphasize the potential pitfalls of using non-idempotent validators (maybe this is already sufficiently emphasized in the docs, just pointing it out).
Since under this logic data gets "coerced" (through re-validation) to the annotated class during validation, it seems to me that there may be a bigger risk of issues being introduced by (unvalidated) assignment to an attribute. I haven't thought super hard about the implications; maybe it's not important, and either way I think it's something we can investigate and/or resolve once the appropriate functionality is in place.
At any rate, I'll be interested to dig into your approach to Unions once it's ready!
Originally posted by @dmontagu in https://github.com/pydantic/pydantic-core/issues/327#issuecomment-1376293733