PrefectHQ / prefect

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
https://prefect.io
Apache License 2.0
15.95k stars 1.57k forks source link

The schedule behaviour when using .deploy() is different from .build_from_flow() #11723

Open rudeb0y opened 8 months ago

rudeb0y commented 8 months ago

First check

Bug summary

I refer to the deploy docs:

schedule: A schedule object defining when to execute runs of this deployment. Used to define additional scheduling options like timezone.

is_schedule_active: Whether or not to set the schedule for this deployment as active. If not provided when creating a deployment, the schedule will be set as active. If not provided when updating a deployment, the schedule's activation will not be changed.

This is not the case.

Reproduction

The repro requires inspection in the ui. A work pool is required.

Given the script:

from prefect import flow
from prefect.deployments import Deployment

@flow
def my_flow():
    pass

agent_deploy = Deployment.build_from_flow(
    flow=my_flow,
    name="via_build_from_flow",
)

if __name__ == "__main__":
    agent_deploy.apply()
    my_flow.deploy(
        name="via_deploy",
        work_pool_name="ecs-default",
        image="my-flow", build=False, push=False
    )

UI after running script, applying a schedule and disabling:

image

and after running the script a 2nd time..

image

Error

the build_from_flow method respected the values set in the UI which is consistent with what the docs state.

The deploy method does not. Schedule was removed and deployment is active.

Versions

Version:             2.14.16
API version:         0.8.4
Python version:      3.11.6
Git commit:          a0c2b69f
Built:               Thu, Jan 18, 2024 5:35 PM
OS/Arch:             darwin/x86_64
Profile:             staging
Server type:         cloud

Additional context

I also observed that between runs of the script, the flow_id, deployment_id and deployment_version are not changing, suggesting that it is updating (as expected) and not re-creating.

desertaxle commented 8 months ago

Thanks for the issue @rudeb0y! It looks like we need to ensure that we aren't setting the schedule or changing its active status when a schedule or is_schedule_active is not provided. Does that align with your expectations?

rudeb0y commented 8 months ago

Indeed that is the expectation. Not supplied = no effect on current settings. Only edge case probably is that on_create then the deployment is "on".

In Prefect 1.x I believe that if a schedule was set in code, the UI would warn (or perhaps not allow modification) if you tried to edit it in the UI. Not required but might be worth considering

ChrisPaul33 commented 8 months ago

The same goes for the parameters for the flow. If I don't provide any overrides for the parameters I would like to ensure the old ones are kept the same. Maybe adding a flag to .deploy for such cases could be a nice solution 😃

rudeb0y commented 7 months ago

@ChrisPaul33 I don't think the build_from_flow() method works that way for parameters... I think that would override intentionally. That is something that should be code-first - you want the settings for a deployment always in code. The single source of truth for "how" the code is run, rather than the "when".

What I am suggesting is for .deploy() behaves the same as build_from_flow() re: scheduling

However I appreciate the point - if not supplied don't change is a good way to think about things..

However if a user changes something that IS set in code.. then it's going to be overriden on next deploy.