PrefectHQ / prefect

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
https://prefect.io
Apache License 2.0

dataclass validation error with pydantic version >= 2.3 #11360

Open Duncan-Hunter opened 9 months ago

Duncan-Hunter commented 9 months ago

Bug summary

I'm attempting to use a dataclass as an argument to a Prefect flow. In a clean environment (I've been using Poetry), running just poetry add prefect installs Prefect 2.14.9 and the following Pydantic version:

[[package]]
name = "pydantic"
version = "2.5.2"
description = "Data validation using Python type hints"
optional = false
python-versions = ">=3.7"
files = [
    {file = "pydantic-2.5.2-py3-none-any.whl", hash = "sha256:80c50fb8e3dcecfddae1adbcc00ec5822918490c99ab31f6cf6140ca1c1429f0"},
    {file = "pydantic-2.5.2.tar.gz", hash = "sha256:ff177ba64c6faf73d7afa2e8cad38fd456c0dbe01c9954e71038001cd15a6edd"},
]

When I run the test code below, I get an error.

However, if I pin pydantic[email]=2.2.1, the test case runs fine (see Additional context).
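To tell whether an environment falls in the range described here, a small standard-library check can help. This is only a sketch: the helper names are hypothetical, and the 2.3 threshold is taken from this issue's title rather than from a confirmed root cause.

```python
from importlib import metadata


def parse_version(text):
    """Turn '2.5.2' into (2, 5, 2); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def in_reported_range(version_text, first_bad=(2, 3)):
    # first_bad is an assumption based on the issue title (pydantic >= 2.3).
    return parse_version(version_text) >= first_bad


def installed_pydantic_version():
    """Return the installed Pydantic version string, or None if it is absent."""
    try:
        return metadata.version("pydantic")
    except metadata.PackageNotFoundError:
        return None
```

Calling in_reported_range(installed_pydantic_version()) when a version is found gives a quick yes/no before running the reproduction.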

Apologies if: this is a problem with Pydantic; it's not reproducible; I'm not supposed to use prefect like this.

Reproduction

import prefect
from dataclasses import dataclass

@dataclass
class TestClass:
    x: int
    y: int

@prefect.flow(name="multiply list")
def multiply(cls: TestClass):
    return cls.x * cls.y

if __name__ == "__main__":
    test = TestClass(x=1, y=2)
    result = multiply(test)
    print(result)

Error

$ poetry run python test.py
11:14:45.678 | INFO    | prefect.engine - Created flow run 'athletic-narwhal' for flow 'multiply list'
11:14:45.681 | INFO    | Flow run 'athletic-narwhal' - View at http://127.0.0.1:4200/flow-runs/flow-run/31914390-5929-405e-931b-6bd3e2b3f7f8
11:14:45.681 | ERROR   | Flow run 'athletic-narwhal' - Validation of flow parameters failed with error: TypeError: _validate_dataclass() takes 2 positional arguments but 3 were given
11:14:45.684 | INFO    | prefect.engine - Flow run 'athletic-narwhal' received invalid parameters and is marked as failed.
Traceback (most recent call last):
  File "test.py", line 18, in <module>
    multiply(test)
  File "/home/dunchunter9/repos/playground-prefect/.venv/lib/python3.8/site-packages/prefect/flows.py", line 1097, in __call__
    return enter_flow_run_engine_from_flow_call(
  File "/home/dunchunter9/repos/playground-prefect/.venv/lib/python3.8/site-packages/prefect/engine.py", line 283, in enter_flow_run_engine_from_flow_call
    retval = from_sync.wait_for_call_in_loop_thread(
  File "/home/dunchunter9/repos/playground-prefect/.venv/lib/python3.8/site-packages/prefect/_internal/concurrency/api.py", line 243, in wait_for_call_in_loop_thread
    return call.result()
  File "/home/dunchunter9/repos/playground-prefect/.venv/lib/python3.8/site-packages/prefect/_internal/concurrency/calls.py", line 284, in result
    return self.future.result(timeout=timeout)
  File "/home/dunchunter9/repos/playground-prefect/.venv/lib/python3.8/site-packages/prefect/_internal/concurrency/calls.py", line 168, in result
    return self.__get_result()
  File "/home/dunchunter9/.pyenv/versions/3.8.2/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
    raise self._exception
  File "/home/dunchunter9/repos/playground-prefect/.venv/lib/python3.8/site-packages/prefect/_internal/concurrency/calls.py", line 355, in _run_async
    result = await coro
  File "/home/dunchunter9/repos/playground-prefect/.venv/lib/python3.8/site-packages/prefect/client/utilities.py", line 51, in with_injected_client
    return await fn(*args, **kwargs)
  File "/home/dunchunter9/repos/playground-prefect/.venv/lib/python3.8/site-packages/prefect/engine.py", line 386, in create_then_begin_flow_run
    return await state.result(fetch=True)
  File "/home/dunchunter9/repos/playground-prefect/.venv/lib/python3.8/site-packages/prefect/states.py", line 91, in _get_state_result
    raise await get_state_exception(state)
  File "/home/dunchunter9/repos/playground-prefect/.venv/lib/python3.8/site-packages/prefect/engine.py", line 343, in create_then_begin_flow_run
    parameters = flow.validate_parameters(parameters)
  File "/home/dunchunter9/repos/playground-prefect/.venv/lib/python3.8/site-packages/prefect/flows.py", line 512, in validate_parameters
    model = validated_fn.init_model_instance(*args, **kwargs)
  File "/home/dunchunter9/repos/playground-prefect/.venv/lib/python3.8/site-packages/pydantic/v1/decorator.py", line 130, in init_model_instance
    return self.model(**values)
  File "/home/dunchunter9/repos/playground-prefect/.venv/lib/python3.8/site-packages/pydantic/main.py", line 164, in __init__
    __pydantic_self__.__pydantic_validator__.validate_python(data, self_instance=__pydantic_self__)
TypeError: _validate_dataclass() takes 2 positional arguments but 3 were given

Versions

Version:             2.14.9
API version:         0.8.4
Python version:      3.8.2
Git commit:          4fba882c
Built:               Thu, Nov 30, 2023 2:55 PM
OS/Arch:             linux/x86_64
Profile:             default
Server type:         server

Additional context

With Pydantic version 2.2.1:

[[package]]
name = "pydantic"
version = "2.2.1"
description = "Data validation using Python type hints"
optional = false
python-versions = ">=3.7"
files = [
    {file = "pydantic-2.2.1-py3-none-any.whl", hash = "sha256:0c88bd2b63ed7a5109c75ab180d55f58f80a4b559682406812d0684d3f4b9192"},
    {file = "pydantic-2.2.1.tar.gz", hash = "sha256:31b5cada74b2320999fb2577e6df80332a200ff92e7775a52448b6b036fce24a"},
]

$ poetry run python test.py
11:31:42.194 | INFO    | prefect.engine - Created flow run 'uncovered-waxbill' for flow 'multiply list'
11:31:42.196 | INFO    | Flow run 'uncovered-waxbill' - View at http://127.0.0.1:4200/flow-runs/flow-run/f57e0e69-7718-4ef0-b90a-260de45d44af
11:31:43.152 | INFO    | Flow run 'uncovered-waxbill' - Finished in state Completed()
2

serinamarie commented 9 months ago

Hi @Duncan-Hunter, thanks for creating your first issue with Prefect! We're glad to have you in the community.

I was able to reproduce this using Python 3.8.13 and Prefect version 2.14.10.

Upgrading to Python 3.10.4, I was no longer able to reproduce the issue.

I think the solution for now is to upgrade to Python 3.10 or above or, for this particular case, to pass validate_parameters=False to the flow decorator.
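Note that validate_parameters=False turns off the type check altogether, so a flow that takes a dataclass may want a light check of its own. Below is a minimal standard-library sketch of that idea. The helper check_dataclass_fields is hypothetical and not part of Prefect, and it only handles plain class annotations such as int or str.

```python
from dataclasses import dataclass, fields, is_dataclass


@dataclass
class TestClass:
    x: int
    y: int


def check_dataclass_fields(obj):
    """Hypothetical helper: raise TypeError if a field holds the wrong plain type."""
    if not is_dataclass(obj) or isinstance(obj, type):
        raise TypeError(f"expected a dataclass instance, got {type(obj).__name__}")
    for f in fields(obj):
        value = getattr(obj, f.name)
        # Only plain classes (int, str, ...) are checked; generics such as
        # List[int] and string annotations are skipped in this sketch.
        if isinstance(f.type, type) and not isinstance(value, f.type):
            raise TypeError(
                f"{f.name} should be {f.type.__name__}, got {type(value).__name__}"
            )
    return obj


def multiply(cls: TestClass):
    # In the issue this function would be wrapped with
    # @prefect.flow(name="multiply list", validate_parameters=False).
    check_dataclass_fields(cls)
    return cls.x * cls.y
```

The decorator is left out so the snippet runs without a Prefect installation; the manual check is the first statement in the flow body, so it would behave the same once the decorator is applied.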

We should probably add a test for this case.

serinamarie commented 9 months ago

cc @urimandujano

harisonmg commented 8 months ago

Hi @serinamarie,

I was able to reproduce this issue with Python 3.11 and Prefect 2.14.16.

toby-coleman commented 4 months ago

Hi @serinamarie - I can also reproduce the problem in Python 3.11, and it appears to depend on whether the flow parameter is typed as a Pydantic v1 or a Pydantic v2 object. For example:

from pydantic import BaseModel
from pydantic.v1 import BaseModel as BaseModelv1
from prefect import flow

class T1(BaseModelv1):
    s: str

class T2(BaseModel):
    s: str

@flow(name="test1")
def test_flow1(t: T1):
    print(t)

@flow(name="test2")
def test_flow2(t: T2):
    print(t)

@flow(name="test3", validate_parameters=False)
def test_flow3(t: T1):
    print(t)

# Pydantic v1 - this fails with:
#  TypeError: BaseModel.validate() takes 2 positional arguments but 3 were given
test_flow1({"s": "Hello!"})

# Pydantic v2 - this works:
test_flow2({"s": "Hello!"})

# Pydantic v1, validate_parameters=False - this works:
test_flow3({"s": "Hello!"})

My environment:

prefect                   2.19.2
pydantic                  2.7.2