pydantic / pydantic

Data validation using Python type hints
https://docs.pydantic.dev
MIT License

support Literal to define constants #561

Closed jasonkuhrt closed 5 years ago

jasonkuhrt commented 5 years ago

I am trying to achieve something like this:

from typing import Union

from pydantic import BaseModel
from typing_extensions import Literal

class A(BaseModel):
  kind: Literal['a']
  foo: str

class B(BaseModel):
  kind: Literal['b']
  bar: str

class C(BaseModel):
  results: Union[A, B]

But encountering error:

...
File "/Users/jasonkuhrt/.pyenv/versions/3.7.2/lib/python3.7/typing.py", line 713, in __subclasscheck__
    return issubclass(cls, self.__origin__)
TypeError: issubclass() arg 1 must be a class

I assume pydantic does not work with typing_extensions. Is it possible to do unions with a discriminant property?

I was able to get the following to work but it requires different schemas:

>>> class A(BaseModel):
...   foo: str
...
>>> class B(BaseModel):
...   bar: str
...
>>> class C(BaseModel):
...   results: Union[A,B]
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in C
NameError: name 'Union' is not defined
>>> from typing import Union
>>> class C(BaseModel):
...   results: Union[A,B]
...
>>> C.parse_obj({"results":{"bar":"foo"}})
<C results=<B bar='foo'>>
>>> C.parse_obj({"results":{"foo":"foo"}})
<C results=<A foo='foo'>>

Without support for a discriminant, as in this example, we can never reach later union members:

>>> class A(BaseModel):
...   foo: str
...
>>> class B(BaseModel):
...   foo: str
...
>>> class C(BaseModel):
...   xs:
  File "<stdin>", line 2
    xs:
       ^
SyntaxError: invalid syntax
>>> from typing import Union
>>> class C(BaseModel):
...   xs: Union[A,B]
...
>>> C.parse_obj({foo:"bar"})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'foo' is not defined
>>> C.parse_obj({"xs": {"foo":"bar"}})
<C xs=<A foo='bar'>>
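The behaviour in the session above can be mimicked in plain Python. This is only a sketch of a "first member that fits wins" rule, not pydantic's actual implementation; `first_match` is a hypothetical helper and the dataclasses stand in for the models:

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Sequence, Type


@dataclass
class A:
    foo: str


@dataclass
class B:
    foo: str


def first_match(members: Sequence[Type[Any]], payload: Dict[str, Any]) -> Any:
    """Return an instance of the first member whose field names all appear in payload."""
    for member in members:
        names = [f.name for f in fields(member)]
        if all(name in payload for name in names):
            return member(**{name: payload[name] for name in names})
    raise ValueError("payload matches no union member")


result = first_match([A, B], {"foo": "bar"})
print(type(result).__name__)  # A: B is structurally identical, so it is never selected
```

Swapping the order to `[B, A]` would make `B` win instead, which is why a structural check alone cannot tell the two apart.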

jasonkuhrt commented 5 years ago

Just to help clarify the use-case, I am using FastAPI and consider this endpoint:

class SegmentClientAccount(BaseModel):
    name: str
    exists: bool

class DialogueClientAccount(BaseModel):
    name: str
    exists: bool
    scopes: List[str] = []

class DesiredState(BaseModel):
    secret_resource_name: str = "app-secrets-by-locksmith"
    accounts: List[Union[DialogueClientAccount, SegmentClientAccount]] = []

class SyncResponse:
    status: str

@app.post("/sync", response_model=DesiredState)
def handle_sync(desired_state: DesiredState):
    return desired_state

When an HTTP client sends:

curl localhost:8000/sync --data '{ "accounts": [{ "name": "foobar", "exists": true }]}'

How is pydantic going to figure out whether that is SegmentClientAccount or DialogueClientAccount? 😕 It cannot. It needs a discriminant, e.g.:

curl localhost:8000/sync --data '{ "accounts": [{ "kind": "dialogue_client", "name": "foobar", "exists": true }]}'
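To illustrate what the "kind" key buys, here is a rough stdlib-only sketch of dispatching on it. It does not use pydantic or FastAPI; `parse_account`, `ACCOUNT_KINDS`, and the "segment_client" value are made up for illustration (only "dialogue_client" appears in the request above):

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Union


@dataclass
class SegmentClientAccount:
    name: str
    exists: bool


@dataclass
class DialogueClientAccount:
    name: str
    exists: bool
    scopes: List[str] = field(default_factory=list)


# Hypothetical mapping from discriminant value to the class it selects.
ACCOUNT_KINDS = {
    "segment_client": SegmentClientAccount,
    "dialogue_client": DialogueClientAccount,
}


def parse_account(payload: Dict[str, Any]) -> Union[SegmentClientAccount, DialogueClientAccount]:
    """Pick the account class from payload["kind"] and build it from the remaining keys."""
    data = dict(payload)  # copy so the caller's dict is not mutated
    kind = data.pop("kind", None)
    if kind not in ACCOUNT_KINDS:
        raise ValueError(f"unknown or missing kind: {kind!r}")
    return ACCOUNT_KINDS[kind](**data)


account = parse_account({"kind": "dialogue_client", "name": "foobar", "exists": True})
print(type(account).__name__)
```

Without the "kind" key the function has nothing to dispatch on and raises, which is the ambiguity described above.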

samuelcolvin commented 5 years ago

I think what you're looking for is Schema(..., const=True).

See #469 and referenced issues for discussion. Let me know if that doesn't fix it.

jasonkuhrt commented 5 years ago

Thanks, that looks promising; I will give it a try. An example showing how to specify the const value would be helpful. I will post back once I find one, so we can maybe add it to the docs.

samuelcolvin commented 5 years ago

I entirely agree an example in docs would be useful here.

You might look at #503 which is talking about the same thing I think.

jasonkuhrt commented 5 years ago

Cool, thanks again, lots of good reading there.

It looks like an area where pydantic might expand/improve in the future? If the Literal type became standard to the typing library I think it would provide an optimal DX/soundness to at least a category of use-cases.

Assuming Literal never did reach the standard library, is there any prospect of pydantic extending support for types within typing_extensions?

samuelcolvin commented 5 years ago

What's the difference between Literal and Final? #502

If there's one agreed way of defining this I'd be happy to support it (and maybe even provide a proxy for it in pydantic if that helps).

But currently I'm not clear what (if any) approach has been anointed as "right" by python.

dmontagu commented 5 years ago

@samuelcolvin My understanding is that Final has more to do with reassignment/mutability/overriding, whereas Literal is just a way to produce an annotation that is only considered the type of one (or more) specific literal value(s).

This is described in depth in the mypy docs for Literal and Final.

Both are potentially useful in pydantic -- Final could be used as a way to annotate that reassignment is not allowed, whereas Literal could be used to indicate that only specific values are allowed.

In particular, given the way pydantic dataclasses work (where assigning a value is actually just setting a default), I would expect the following behavior:

class FinalModel(BaseModel):
    x: Final[float] = 3000

class LiteralModel(BaseModel):
    x: Literal[3000] = 3000

final_model = FinalModel(x=4000)  # okay
final_model.x = 3000 # ValidationError due to Final
literal_model = LiteralModel(x=4000)  # ValidationError due to Literal

However, in the mypy docs for Literal, it currently says:

Literal is an officially supported feature, but is highly experimental and should be considered to be in alpha stage. It is very likely that future releases of mypy will modify the behavior of literal types, either by adding new features or by tuning or removing problematic ones.

So perhaps it may be better for maintenance to just wait until this is more established.

jasonkuhrt commented 5 years ago

@dmontagu thanks for chiming in

whereas Literal could be used to indicate that only specific values are allowed.

if pydantic cannot currently do this, then I don't see how it can support union types diverging over a discriminant (but, I still have to find time to go read those threads @samuelcolvin).

potentially useful in pydantic

If we're in agreement about this, I would appreciate keeping an issue open for transparency and tracking.

dmontagu commented 5 years ago

@jasonkuhrt

if pydantic cannot currently do this, then I don't see how it can support union types diverging over a discriminant

If I understand correctly, the issue comes down to the difference between enforcing a type vs. enforcing a value. I would expect the union type with discriminant to work, e.g., for three subclasses of BaseModel (Union[Model1, Model2, Model3]), but Union[1, 2, 3] just doesn't even make sense since 1, 2, and 3 are not types. But Union[Literal[1], Literal[2], Literal[3]] (which is equivalent to Literal[1, 2, 3]) does make sense. I don't know what pydantic would do with it currently.
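The equivalence mentioned above can be checked at runtime with the standard `typing` helpers. A small sketch, assuming Python 3.8+ where `Literal` lives in `typing` (on older versions it would come from typing_extensions); `allowed_values` is a hypothetical helper, not part of pydantic:

```python
from typing import Any, Literal, Tuple, Union, get_args, get_origin


def allowed_values(annotation: Any) -> Tuple[Any, ...]:
    """Collect the literal values permitted by a Literal or a Union of Literals."""
    origin = get_origin(annotation)
    if origin is Literal:
        return get_args(annotation)
    if origin is Union:
        values: Tuple[Any, ...] = ()
        for member in get_args(annotation):
            values += allowed_values(member)
        return values
    raise TypeError(f"{annotation!r} is not built from Literal")


print(allowed_values(Literal[1, 2, 3]))
print(allowed_values(Union[Literal[1], Literal[2], Literal[3]]))
```

Both calls yield the same three values, whereas `Union[1, 2, 3]` is rejected by `typing` itself because 1, 2, and 3 are not types.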

jasonkuhrt commented 5 years ago

I would expect the union type with discriminant to work, e.g., for three subclasses of BaseModel (Union[Model1, Model2, Model3])

@dmontagu Can you show me a minimal example?

dmontagu commented 5 years ago

@jasonkuhrt Sorry, I think I phrased that confusingly/wrong. I just meant that if you had three classes with incompatible property types, you could use Union and it would work currently (no "discriminant" necessary). But you can't union values (what it looks like Literal is supposed to accomplish).

More explicitly, I was just saying you could do this:

from typing import Union

from pydantic import BaseModel

class A(BaseModel):
    kind: str
    foo: str

class B(BaseModel):
    kind: str
    bar: str

class C(BaseModel):
    results: Union[A, B]

print(C(results={"kind": "a", "bar": "bar"}))  # shows C results=<B kind='a' bar='bar'>
print(C(results={"kind": "b", "foo": "foo"}))  # shows C results=<A kind='b' foo='foo'>

and that support for Literal could be added to support the case where there was a check on kind even if bar was renamed to foo in B (but this doesn't exist yet). The points I was making above were proposals for how to handle Literal vs. Final, rather than an attempt to explain how you can currently use pydantic to accomplish this goal.

dmontagu commented 5 years ago

@jasonkuhrt if you want to be able to use a "kind" parameter to discriminate types, I think you might find the following code to be a useful starting point

from typing import Any, ClassVar, Dict, List, Optional, Union

from pydantic import BaseModel, validator

class KindModel(BaseModel):
    allowed_kinds: ClassVar[Union[None, str, List[str]]] = None
    kind: str

    @validator("kind", check_fields=False)
    def validate_kind(cls, v: Any, *, values: Dict[str, Any], **kwargs: Any) -> Optional[str]:
        if cls.allowed_kinds is None:
            return v
        elif isinstance(cls.allowed_kinds, list):
            if v not in cls.allowed_kinds:
                raise ValueError(f"kind not in {cls.allowed_kinds!r}")
        elif v != cls.allowed_kinds:
            raise ValueError(f"kind != {cls.allowed_kinds!r}")
        return v

class KindAModel(KindModel):
    allowed_kinds: ClassVar[str] = "a"

class KindBModel(KindModel):
    allowed_kinds: ClassVar[str] = "b"

class KindABModel(KindModel):
    allowed_kinds: ClassVar[List[str]] = ["a", "b"]

class ExampleModel(BaseModel):
    cast: Union[KindAModel, KindBModel]
    union: KindABModel
    any_: KindModel

# ##### Demonstration #####

aaa_parent = ExampleModel(cast={"kind": "a"}, union={"kind": "a"}, any_={"kind": "a"})
print(aaa_parent)
# ExampleModel cast=<KindAModel kind='a'> union=<KindABModel kind='a'> any_=<KindModel kind='a'>
print(aaa_parent.dict())
# {'cast': {'kind': 'a'}, 'union': {'kind': 'a'}, 'any_': {'kind': 'a'}}

bbb_parent = ExampleModel(cast={"kind": "b"}, union={"kind": "b"}, any_={"kind": "b"})
print(bbb_parent)
# ExampleModel cast=<KindBModel kind='b'> union=<KindABModel kind='b'> any_=<KindModel kind='b'>

abc_parent = ExampleModel(cast={"kind": "a"}, union={"kind": "b"}, any_={"kind": "c"})
print(abc_parent)
# ExampleModel cast=<KindAModel kind='a'> union=<KindABModel kind='b'> any_=<KindModel kind='c'>

ccc_parent = ExampleModel(cast={"kind": "c"}, union={"kind": "c"}, any_={"kind": "c"})
"""
Error output from previous line: 

pydantic.error_wrappers.ValidationError: 3 validation errors
cast -> kind
  kind != 'a' (type=value_error)
cast -> kind
  kind != 'b' (type=value_error)
union -> kind
  kind not in ['a', 'b'] (type=value_error)
"""

@samuelcolvin I think it would be awesome if the above class declarations could be replaced with this:

class KindModel(BaseModel):
    kind: str

class KindAModel(BaseModel):
    kind: Literal["a"]

class KindBModel(BaseModel):
    kind: Literal["b"]

class KindABModel(BaseModel):
    kind: Literal["a", "b"]

(that's how I'm thinking of using Literal). I'd also be happy with more of a pydantic-specific syntax if we want to avoid using typing_extensions.Literal due to possible implementation changes.

(I'd be happy to put in the effort to implement Literal integration if it was of interest; I just don't want to work on it if it's definitely not going to be accepted for broader reasons.)
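As a rough idea of what such an integration would need to do, the sketch below reads the `Literal` annotation on `kind` and uses it to pick a union member. It is stdlib-only, uses dataclasses in place of BaseModel, and `parse_by_kind` is a hypothetical helper rather than anything pydantic provides; it also assumes Python 3.8+ for `typing.Literal`:

```python
from dataclasses import dataclass
from typing import Any, Dict, Literal, Sequence, Type, get_args, get_type_hints


@dataclass
class KindAModel:
    kind: Literal["a"]


@dataclass
class KindBModel:
    kind: Literal["b"]


@dataclass
class KindABModel:
    kind: Literal["a", "b"]


def parse_by_kind(members: Sequence[Type[Any]], payload: Dict[str, Any]) -> Any:
    """Build the first member whose Literal annotation on `kind` admits payload["kind"]."""
    for member in members:
        allowed = get_args(get_type_hints(member)["kind"])
        if payload.get("kind") in allowed:
            return member(**payload)
    raise ValueError(f"kind {payload.get('kind')!r} is not allowed by any member")


print(type(parse_by_kind([KindAModel, KindBModel], {"kind": "b"})).__name__)
```

Here the value of `kind` decides the member, so two models with otherwise identical fields can still be told apart.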

jasonkuhrt commented 5 years ago

But you can't union values (what it looks like Literal is supposed to accomplish).

@dmontagu right, but you also cannot union this (and this in particular is my current goal):

from typing import Union

from pydantic import BaseModel

class A(BaseModel):
    kind: str
    foo: str = "yolo"

class B(BaseModel):
    kind: str
    bar: str = "rolo"

class C(BaseModel):
    results: Union[A, B]

print(C(results={"kind": "a"}))  # shows C results=<A kind='a' foo='yolo'>
print(C(results={"kind": "b"}))  # shows C results=<A kind='b' foo='yolo'>

But you clearly know that since your next comment shows a workaround to achieve it :D thank you!

Unfortunately that workaround is unacceptable as I'm working in the context of a public-facing API (https://github.com/tiangolo/fastapi).

@samuelcolvin can we please reopen this issue? It doesn't seem resolved to me.

jasonkuhrt commented 5 years ago

@dmontagu agree this seems ideal:

class B(BaseModel):
    kind: Literal["a"]

class A(BaseModel):
    kind: Literal["b"]

class SearchResults(BaseModel):
    items: List[Union[A, B]]

dmontagu commented 5 years ago

@jasonkuhrt What about having a public-facing API conflicts with the workaround I provided? (I'm also using FastAPI heavily these days.) I don't see why the above workaround doesn't solve what you listed as your current goal. Here it is adapted to the code you provided:

from typing import Any, ClassVar, Dict, Optional, Union

from pydantic import BaseModel, validator

class BaseKind(BaseModel):
    required_kind: ClassVar[Optional[str]] = None
    kind: str

    @validator("kind", check_fields=False)
    def validate_kind(cls, v: Any, *, values: Dict[str, Any], **kwargs: Any) -> str:
        if cls.required_kind is None:
            return v
        elif v != cls.required_kind:
            raise ValueError(f"kind != {cls.required_kind!r}")
        return v

class A(BaseKind):
    required_kind: ClassVar[str] = "a"
    foo: str = "yolo"

class B(BaseKind):
    required_kind: ClassVar[str] = "b"
    bar: str = "rolo"

class C(BaseModel):
    results: Union[A, B]

print(C(results={"kind": "a"}))  # shows C results=<A kind='a' foo='yolo'>
print(C(results={"kind": "b"}))  # shows C results=<B kind='b' bar='rolo'>
print(C(results={"kind": "c"}))  # ValidationError

Note that C parses the kind "a" vs kind "b" properly.

dmontagu commented 5 years ago

@jasonkuhrt It occurs to me that by "in the context of a public-facing api" you might mean that you would like it to be auto-documented properly. Is that right? I would be interested to know what specifically is the shortcoming.

samuelcolvin commented 5 years ago

As I said right at the beginning, this is possible right now:

from typing import Union

from pydantic import BaseModel, Schema

class A(BaseModel):
    kind: str
    foo: str = Schema('yolo', const=True)

class B(BaseModel):
    kind: str
    bar: str = Schema('rolo', const=True)

class C(BaseModel):
    results: Union[A, B]

print(C(results={"kind": "a", 'foo': 'yolo'}))
#> C results=<A kind='a' foo='yolo'>
print(C(results={"kind": "b", 'foo': 'rolo'}))  # 'rolo' fails A's const check on foo, so B is tried next
#> C results=<B kind='b' bar='rolo'>

I'll re-open this issue to support Literal and create a new issue to better document const.


Summary - Let's support Literal

Either via an implementation inside pydantic, which can eventually give way to a standard library implementation, or via support for typing_extensions.

jasonkuhrt commented 5 years ago

@samuelcolvin many thanks, missed that sorry 🤦‍♂

benediktbrandt commented 5 years ago

First of all thanks so much for this library and your ongoing work @samuelcolvin. The code sample shared above might have an issue depending on one's use case.

print(C(results={"kind": "this", 'foo': 'does not exist'})) # shows C results=<B kind='this' bar='rolo'>

If one expects an exception to be thrown instead, I would recommend using:

from typing import Union

from pydantic import BaseModel, Schema

class A(BaseModel):
    type: str = Schema('a', const=True)
    metadata: str 

class B(BaseModel):
    type: str = Schema('b', const=True)
    metadata: str 

class C(BaseModel):
    results: Union[A, B]

print(C(results={"type": "a", 'metadata': '1'}))  
print(C(results={"type": "a", 'metadata': '2'}))  
print(C(results={"type": "b", 'metadata': '3'})) 
print(C(results={"type": "b", 'metadata': '4'})) 
print(C(results={"type": "c", 'metadata': '4'}))  # throws an exception