Open sisp opened 1 year ago
So, right now, when we get unparametrized typevars, we handle it in pydantic-core by treating a typevar with constraints as the union of its constraints, but as you have noted, that is not actually what is semantically meant.
I'll note that this actually isn't just an issue with the JSON schema — it's an issue with the actual validation:
```python
from typing import Generic, TypeVar

from pydantic import BaseModel
from pydantic.types import StrictInt, StrictStr

T = TypeVar("T", StrictInt, StrictStr)

class M(BaseModel, Generic[T]):
    x: T
    y: T

print(M(x=1, y='a'))
#> x=1 y='a'
```
(No validation error, even though this is a type error.)
I think supporting this properly would involve some changes to pydantic core to actually support this validation. In particular, you have to have an understanding of the scope of the typevar, which I think we currently don't track in pydantic-core schemas. But if you do know the scope of the typevar, then you can make this work by converting a type with that typevar into a union over all of its constraints, and if we could do that, we'd get a proper JSON schema "for free".
It seems to me that it would be difficult to make this work in general, but I'm not totally sure. At the very least, I think we could make it work for `TypeAliasType` with a constrained typevar. Specifically, if we changed this line: https://github.com/pydantic/pydantic/blob/2acf1aff55a56f5cfe41f286350399007e7e425a/pydantic/_internal/_generate_schema.py#L988 to do something a bit different when the typevar has constraints vs. a bound, I think it shouldn't be too much work. Longer term, there might be a code migration path for pydantic where we try to automatically wrap generic types in a `TypeAliasType` to achieve the same behavior, but I'm perhaps getting a bit ahead of myself here.
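For concreteness, the pattern being suggested might look something like the sketch below. Constructing the alias works today with `typing_extensions.TypeAliasType`, but pydantic does not currently expand the constrained typevar into a union here; `MAlias` is an illustrative name, not an existing API.

```python
from typing import Generic, TypeVar

from typing_extensions import TypeAliasType

from pydantic import BaseModel
from pydantic.types import StrictInt, StrictStr

T = TypeVar("T", StrictInt, StrictStr)

class M(BaseModel, Generic[T]):
    x: T
    y: T

# The alias scopes T, so a schema builder could, in principle, expand
# it to Union[M[StrictInt], M[StrictStr]]. Today this is just a plain
# type alias; pydantic does NOT yet do that expansion.
MAlias = TypeAliasType("MAlias", M[T], type_params=(T,))
```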
I'd be open to a PR to make this work for TypeAliasType following the suggestion above, or could get to it myself eventually.
In the more realistic short term, depending on your usage, there might be some workarounds. One thing you could do is:
```python
ConcreteM = Union[M[StrictInt], M[StrictStr]]
```
which should at least result in the right JSON schema and even correct field validation, but `ConcreteM` is no longer actually a model, and the union won't automatically stay in sync with the constraints on the typevar.
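As a concrete sketch of this workaround, validating through a `TypeAdapter` over the union couples `x` and `y`, since each union member fixes `T` to a single constraint (exact error details from the failing branch may vary):

```python
from typing import Generic, TypeVar, Union

from pydantic import BaseModel, TypeAdapter, ValidationError
from pydantic.types import StrictInt, StrictStr

T = TypeVar("T", StrictInt, StrictStr)

class M(BaseModel, Generic[T]):
    x: T
    y: T

# Each union member fixes T to one constraint, so x and y are coupled.
ConcreteM = Union[M[StrictInt], M[StrictStr]]
adapter = TypeAdapter(ConcreteM)

print(adapter.validate_python({'x': 1, 'y': 2}))
#> x=1 y=2

try:
    # Mixed types fail both M[StrictInt] and M[StrictStr].
    adapter.validate_python({'x': 1, 'y': 'a'})
except ValidationError:
    print('mixed x/y rejected')
```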
Another, trickier option that will generate the right JSON schema for the model (though, sadly, the model won't validate according to it) is:
```python
from typing import Generic, TypeVar, Union

from pydantic import BaseModel, GetJsonSchemaHandler, TypeAdapter
from pydantic.json_schema import JsonSchemaValue
from pydantic.types import StrictInt, StrictStr
from pydantic_core import CoreSchema

T = TypeVar("T", StrictInt, StrictStr)

class M(BaseModel, Generic[T]):
    x: T
    y: T

    @classmethod
    def __get_pydantic_json_schema__(
        cls,
        __core_schema: CoreSchema,
        __handler: GetJsonSchemaHandler,
    ) -> JsonSchemaValue:
        if cls.__mro__[0] == M:  # this prevents the infinite recursion
            type_ = Union[tuple(M[x] for x in T.__constraints__)]
            __core_schema = TypeAdapter(type_).core_schema
        return __handler(__core_schema)

print(M.model_json_schema())
"""
{'$defs': {'M_Annotated_int__Strict_strict_True___': {'properties': {'x': {'title': 'X', 'type': 'integer'}, 'y': {'title': 'Y', 'type': 'integer'}}, 'required': ['x', 'y'], 'title': 'M[Annotated[int, Strict(strict=True)]]', 'type': 'object'}, 'M_Annotated_str__Strict_strict_True___': {'properties': {'x': {'title': 'X', 'type': 'string'}, 'y': {'title': 'Y', 'type': 'string'}}, 'required': ['x', 'y'], 'title': 'M[Annotated[str, Strict(strict=True)]]', 'type': 'object'}}, 'anyOf': [{'$ref': '#/$defs/M_Annotated_int__Strict_strict_True___'}, {'$ref': '#/$defs/M_Annotated_str__Strict_strict_True___'}], 'title': 'M'}
"""
```
which I think is essentially the JSON schema you want, at least ignoring the definitions. (Not sure if that's a problem; if so, maybe we could add better support for automatically inlining definitions where possible.) That said, I hope we can eventually come up with a better pattern for this or support it properly even from pydantic-core.
Thanks for your prompt and detailed reply, @dmontagu! :bow: I hadn't noticed that the validation is also affected, which is much worse than the JSON Schema generation.
I'm not so familiar with Pydantic's internals, and they appear quite non-trivial; it might be a steep learning curve for me to figure out how to make the right changes. Not sure if this makes sense TBH. :confused:
Initial Checks
Description
The generated JSON Schema for a generic model with multiple attributes using the same type parameter is not correct. Consider the example code below which generates the following JSON Schema:
But since `x` and `y` use the same type parameter `T`, their types are coupled, so `anyOf` for each property is not correct. Instead, the schema should be this:

I realize that this may be difficult to generalize to more complex cases, and it's partially JSON Schema's fault because it doesn't have first-class support for type parameters, which results in very verbose schemas. Nevertheless, I felt it was worth sharing this observation.
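To make the two schema shapes concrete, here is a sketch using plain `int`/`str` constraints (titles and `$ref` names in real output will differ), contrasting the per-property `anyOf` pydantic currently emits with the model-level `anyOf` described above:

```python
from typing import Generic, TypeVar, Union

from pydantic import BaseModel, TypeAdapter

T = TypeVar('T', int, str)

class M(BaseModel, Generic[T]):
    x: T
    y: T

# Current behavior: T is expanded to anyOf[int, str] independently
# in each property, so x and y are decoupled in the schema.
print(M.model_json_schema()['properties']['x'])

# Coupled alternative: anyOf over whole-model parametrizations,
# so each branch fixes T for both x and y at once.
print(TypeAdapter(Union[M[int], M[str]]).json_schema())
```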
Example Code
Python, Pydantic & OS Version
Selected Assignee: @samuelcolvin