PrefectHQ / prefect

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
https://prefect.io
Apache License 2.0

Conflicts with pydantic and pendulum, depending on import order #14377

Closed (jongbinjung closed this 3 days ago)

jongbinjung commented 4 days ago

Bug summary

When a pydantic V2 model has a field of type pendulum.DateTime, the flow fails at runtime with a

TypeError: validate() takes {x} positional arguments but {x + 1} were given

but only if prefect is imported before the model definition.

Reproduction

import pendulum
import pydantic

import prefect

class MyDate(pydantic.BaseModel):
    model_config = pydantic.ConfigDict(arbitrary_types_allowed=True)
    run_date: pendulum.DateTime

MyDate(run_date=pendulum.now())

Error

TypeError                                 Traceback (most recent call last)
Cell In[1], line 11
      8     model_config = pydantic.ConfigDict(arbitrary_types_allowed=True)
      9     run_date: pendulum.DateTime
---> 11 MyDate(run_date=pendulum.now())

File ~/workspace/medely-prefect/flows/facility_churn_model/.venv/lib/python3.11/site-packages/pydantic/main.py:177, in BaseModel.__init__(self, **data)
    175 # `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks
    176 __tracebackhide__ = True
--> 177 self.__pydantic_validator__.validate_python(data, self_instance=self)

TypeError: validate() takes 2 positional arguments but 3 were given

Versions (prefect version output)

$ prefect version
Version:             2.19.6
API version:         0.8.4
Python version:      3.11.2
Git commit:          9d938fe7
Built:               Mon, Jun 24, 2024 10:23 AM
OS/Arch:             darwin/arm64
Profile:             dev
Server type:         server

Additional context

The reproduction above fails with the TypeError shown under Error. But if the prefect import is moved below the model definition,

- import prefect

class MyDate(pydantic.BaseModel):
    model_config = pydantic.ConfigDict(arbitrary_types_allowed=True)
    run_date: pendulum.DateTime

+ import prefect

it works fine; e.g.,

>>> import pendulum
>>> import pydantic
>>> class MyDate(pydantic.BaseModel):
...     model_config = pydantic.ConfigDict(arbitrary_types_allowed=True)
...     run_date: pendulum.DateTime
...
>>> import prefect
>>> MyDate(run_date=pendulum.now())
MyDate(run_date=DateTime(2024, 6, 27, 10, 0, 54, 542354, tzinfo=Timezone('America/Los_Angeles')))
>>> print(pendulum.__version__)
2.1.2
>>> print(prefect.__version__)
2.19.6
>>> print(pydantic.__version__)
2.7.4

WillRaphaelson commented 3 days ago

Thanks @jongbinjung - that's odd. I'll look into this.

WillRaphaelson commented 3 days ago

Hey @jongbinjung - this is a consequence of some idiosyncrasies we introduced as part of supporting both pydantic 1 and 2 in prefect 2. It will be fixed in prefect 3, which you're welcome to use while it's in RC. For the time being, I don't think we're going to patch this, so you should just use the import order workaround. Thanks for the issue.
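
For readers wondering how import order alone could produce this kind of error, here is a toy, standard-library-only sketch. It is an assumption made for illustration, not Prefect's, pydantic's, or pendulum's actual code: every name in it (`DateTime`, `legacy_hook`, `build_validator`, `import_framework`) is hypothetical. It shows one general pattern that can yield an "N vs N + 1 positional arguments" TypeError: a validator is chosen once, when the model is defined, and an import side effect changes what a later definition sees.

```python
# Toy illustration only. None of these names exist in prefect, pydantic,
# or pendulum; they are hypothetical stand-ins.


class DateTime:
    """Stand-in for a shared third-party type."""


def validate(cls, value):
    """Old-style hook: accepts exactly two positional arguments."""
    return value


def build_validator(field_type):
    """Pick a validation strategy once, at 'model definition' time."""
    hook = getattr(field_type, "legacy_hook", None)
    if hook is None:
        # No hook registered on the type: fall back to an isinstance check.
        def check(value, info):
            if not isinstance(value, field_type):
                raise TypeError(f"expected {field_type.__name__}")
            return value
        return check

    # A hook is present: the new-style caller passes an extra `info` argument.
    def call_hook(value, info):
        return hook(field_type, value, info)
    return call_hook


def import_framework():
    """Stand-in for an import whose side effect patches the shared type."""
    DateTime.legacy_hook = validate


# "Model" defined before the import: the validator sees no hook.
defined_before = build_validator(DateTime)
import_framework()
# "Model" defined after the import: the validator picks up the old-style hook.
defined_after = build_validator(DateTime)

result = defined_before(DateTime(), None)  # validates without error

try:
    defined_after(DateTime(), None)
    message = ""
except TypeError as exc:
    message = str(exc)

print(message)
```

In this sketch only the validator built after the simulated import fails, which mirrors the symptom in the report. Whether the real cause in prefect 2 has this shape is not confirmed by the thread.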