PrefectHQ / prefect

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
https://prefect.io
Apache License 2.0

Slow task execution and lack of fault tolerance for server connection failures #14810

Open zsio opened 4 months ago

zsio commented 4 months ago

Describe the current behavior

Describe the proposed behavior

Example Use

from prefect import task, flow, get_run_logger

@task(name="Say Hello", task_run_name="task run name")
def say_hello():
    logger = get_run_logger()
    logger.info("Hello, world----------!")
    print("hi")

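@flow(name="My First Flow", flow_run_name="flow run name")
def my_first_flow():  # renamed so it does not shadow the imported `flow` decorator
    # Submit the task to the task runner and wait for its result...
    s = say_hello.submit()
    s.result()
    # ...or call the task directly
    say_hello()

my_first_flow()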

This task takes over a second to complete.
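To put a number on "over a second", one option is to time the same function body with and without the Prefect decorators. The helper below is only a sketch using the standard library: `time_call` and `say_hello_plain` are hypothetical names introduced here for illustration, and the plain function merely stands in for the undecorated task body.

```python
import time
from typing import Any, Callable, Tuple


def time_call(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def say_hello_plain() -> str:
    # Hypothetical stand-in for the task body without the @task decorator.
    return "hi"


# Baseline: how long the work itself takes with no orchestration involved.
result, baseline = time_call(say_hello_plain)
print(f"plain call: {baseline:.6f}s")
```

Inside the flow, the same helper could wrap the decorated call, for example `time_call(say_hello)`, so the difference between the two timings approximates the orchestration overhead.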

Additional context

No response

zzstoatzz commented 4 months ago

hi @zsio - thanks for the issue!

can you please share the output of `prefect version` as requested by the issue template? prefect 2.x and 3.x have different engines.

a lot has been done between 2.x and 3.x to improve performance, and we've still been working on making running tasks faster (example), as well as minimizing constant back and forth between clients and the server.

for instance, a task with a single print statement should complete in milliseconds rather than seconds.

that said, in order to provide the functionality we offer, we will have some performance overhead. Happy to hear suggestions on specific places you think we may benefit from improvements

zzstoatzz commented 4 months ago

@zsio to expand on that example I linked above, if you install prefect>=3.0.0rc13 (or main) you can try out client-side orchestration which is significantly faster than previous versions.

prefect config set PREFECT_EXPERIMENTAL_ENABLE_CLIENT_SIDE_TASK_ORCHESTRATION=True

a gist example
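For a one-off script, the same setting can probably be supplied as an environment variable instead of through `prefect config set`. This is a sketch under two assumptions: that Prefect picks up settings from environment variables of the same name, and that the variable is set before Prefect is imported. The setting name is copied from the command above; whether a given Prefect version still honors this experimental flag is not confirmed here.

```python
import os

# Setting name taken verbatim from the `prefect config set` command above.
SETTING = "PREFECT_EXPERIMENTAL_ENABLE_CLIENT_SIDE_TASK_ORCHESTRATION"

# Assumption: this must run before `import prefect` so the value is
# visible when Prefect loads its settings.
os.environ[SETTING] = "True"

print(f"{SETTING}={os.environ[SETTING]}")
```

Any `from prefect import ...` lines would follow this block. The change only affects the current process, which makes it easy to compare runs with and without the flag.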

zsio commented 3 months ago

@zzstoatzz I have followed your suggestions and set the environment variable as instructed. While the running speed has indeed improved, I have encountered a new issue. Now, the task running flowchart is no longer displayed under the flow, and instead, it prompts: "This flow run did not generate any task or subflow runs." I'm hoping you can provide some guidance or a solution to address this problem.



from prefect import flow, task

@task
def say_hello2(name):
    return f"Hello {name}!"

@flow()
def hello_flow():
    # Submit the task and block until its result is available
    one = say_hello2.submit("one")
    results = one.result()

hello_flow()
[image]

lucasdepetrisd commented 2 months ago

@zsio have you found any solution to this? I didn't enable that setting but after updating to the 3.0.0 release it shows the same "This flow run did not generate any task or subflow runs"

zsio commented 2 months ago

> @zsio have you found any solution to this? I didn't enable that setting but after updating to the 3.0.0 release it shows the same "This flow run did not generate any task or subflow runs"

@lucasdepetrisd No, I am currently considering removing prefect.