pydantic / pydantic

Data validation using Python type hints
https://docs.pydantic.dev
MIT License

ambiguous union discrimination when multiple models correspond to the input arguments #9513

Closed MarcBresson closed 3 months ago

MarcBresson commented 3 months ago

Description

Hello,

I'm building an app and stumbled on unreliable behaviour in pydantic. When the input matches more than one model in a union, which model ends up being selected depends on factors that are not obvious. See the example below for more context.

Example Code

The following:

from enum import Enum

from pydantic import BaseModel

class SuperEnum(str, Enum):
    ENUM_1 = "ENUM_1"
    ENUM_2 = "ENUM_2"

class A(BaseModel):
    one: SuperEnum  # enum-typed field

class B(BaseModel):
    two: int
    three: str | None = None

class C(BaseModel):
    zero: A | B

will return

>>> C(**{"zero": {"one": "ENUM_1"}}).zero
A(one=<SuperEnum.ENUM_1: 'ENUM_1'>)
>>> C(**{"zero": {"one": "ENUM_1", "two": 3}}).zero
B(two=3, three=None)

but

from pydantic import BaseModel

class A(BaseModel):
    one: str  # difference here, now it is just a string instead of the enum

class B(BaseModel):
    two: int
    three: str | None = None

class C(BaseModel):
    zero: A | B

will return

>>> C(**{"zero": {"one": "ENUM_1"}}).zero
A(one='ENUM_1')
>>> C(**{"zero": {"one": "ENUM_1", "two": 3}}).zero
A(one='ENUM_1')

Python, Pydantic & OS Version

pydantic version: 2.6.4
        pydantic-core version: 2.16.3
          pydantic-core build: profile=release pgo=true
                 install path: /Users/datategy/Documents/o2_ml/.venv/lib/python3.10/site-packages/pydantic
               python version: 3.10.11 (v3.10.11:7d4cc5aa85, Apr  4 2023, 19:05:19) [Clang 13.0.0 (clang-1300.0.29.30)]
                     platform: macOS-14.4.1-arm64-arm-64bit
             related packages: fastapi-0.110.0 typing_extensions-4.10.0 pydantic-settings-2.2.1
                       commit: unknown

sydney-runkle commented 3 months ago

@MarcBresson,

Thanks for your question. This is a consequence of pydantic's smart union matching logic, which tries to find the best match among the possible union candidates. If you'd prefer to always match A when one is present, you can set union_mode='left_to_right' on the zero field (for example via Field), which ensures that the first member of the union that validates is the one returned.
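As a rough sketch of that suggestion, assuming the models from the first example and a pydantic 2.x release whose Field accepts union_mode, the field could be declared as follows. Union[A, B] is used in place of A | B only so the snippet also runs on older Python versions; the variable names both and only_two are illustrative.

```python
from enum import Enum
from typing import Union

from pydantic import BaseModel, Field


class SuperEnum(str, Enum):
    ENUM_1 = "ENUM_1"
    ENUM_2 = "ENUM_2"


class A(BaseModel):
    one: SuperEnum


class B(BaseModel):
    two: int
    three: Union[str, None] = None


class C(BaseModel):
    # Members are tried in declaration order (A, then B);
    # the first one that validates is returned.
    zero: Union[A, B] = Field(union_mode="left_to_right")


# Both keys present: A is tried first and validates
# (the extra "two" key is ignored under the default model config).
both = C(zero={"one": "ENUM_1", "two": 3}).zero

# Only "two" present: A fails on the missing "one" field, so B is used.
only_two = C(zero={"two": 3}).zero

print(type(both).__name__, type(only_two).__name__)
```

With this declaration the order of the union members decides the outcome, so putting the more specific model first matters. If B should win whenever two is present, the members would need to be declared as Union[B, A] instead.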

Let me know if you have any other questions! :)