PrefectHQ / prefect

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
https://prefect.io
Apache License 2.0

"Dry Run" mode that allows program logic to run, but avoids mutating state #6500

Closed jawnsy closed 1 year ago

jawnsy commented 2 years ago

Prefect Version

2.x

Describe the proposed behavior

During development and automated testing, it is often useful to run code in a "simulation" or "dry run" mode, which avoids mutating external state, prints additional debug information, and mocks out responses for calls that would mutate external state.

I'm flexible on the implementation approach. As a starting point for discussion, I would propose a standardized global variable that defaults to false (meaning that we are running in "real mode") but can be set to true (dry run/simulation mode). By standardizing the variable, we can use it in Blocks as well. For example, a Slack webhook could write the contents of the message to the logs rather than sending it to Slack.
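To make the idea concrete, here is a minimal sketch of such a flag in plain Python. Nothing in it is an existing Prefect API: `DRY_RUN`, `dry_run()` and `notify()` are hypothetical names, and the `send` callable stands in for whatever actually posts to Slack.

```python
import contextvars
import logging
from contextlib import contextmanager

logger = logging.getLogger("dry_run_sketch")

# Hypothetical stand-in for the proposed global flag.
# Defaults to False, i.e. "real mode".
DRY_RUN = contextvars.ContextVar("dry_run", default=False)


@contextmanager
def dry_run(enabled=True):
    """Turn dry run mode on (or off) for the duration of a with-block."""
    token = DRY_RUN.set(enabled)
    try:
        yield
    finally:
        # Restore whatever value was active before the block.
        DRY_RUN.reset(token)


def notify(text, send):
    """Call send(text) unless dry run is active. Returns True if sent."""
    if DRY_RUN.get():
        # Log the message instead of mutating external state.
        logger.info("dry run, would have sent: %s", text)
        return False
    send(text)
    return True
```

A block-like component only needs to read the flag, so callers can wrap any code in `with dry_run():` without passing an extra argument through every layer.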

Describe the current behavior

We do not have a built-in dry run mode, so users need to create one themselves, for example by passing a parameter through their flows. This is inconvenient, because blocks do not respect such a parameter out of the box. A first-class dry run mode would encourage standardization, which improves compatibility between third-party blocks.
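A rough illustration of that workaround, using plain functions rather than real Prefect tasks or blocks. All names here (`send_alert`, `third_party_write`, `pipeline`) are made up for the example; `third_party_write` represents code we do not control and that has no dry run parameter.

```python
def send_alert(message, dry_run=False):
    """Our own code: honors the hand-rolled dry_run parameter."""
    if dry_run:
        return f"would have sent: {message}"
    # A real implementation would call an external service here.
    raise RuntimeError("real send is not implemented in this sketch")


def third_party_write(rows):
    """Stand-in for third-party code that knows nothing about dry_run."""
    return f"wrote {len(rows)} rows"


def pipeline(rows, dry_run=False):
    """The flag has to be threaded through by hand at every call site."""
    results = [send_alert(f"{len(rows)} rows", dry_run=dry_run)]
    # No way to pass the flag along, so this still "mutates" in a dry run.
    results.append(third_party_write(rows))
    return results
```

Calling `pipeline([1, 2], dry_run=True)` skips the alert but still performs the write, which is the compatibility gap described above.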

Example Use

We could have a flag that allows execution of flows in "dry run" mode, which would arrange things so that the running code has a prefect.context.dry_run global flag set to True. Then, users could use it in code like:

# prefect.context.dry_run is the proposed flag; it does not exist today
if prefect.context.dry_run:
    print(f"would have sent: {json}")
else:
    actually_do(json)

This could also be something that we use in CI: when users change a deployment in a pull request, we could run the deployment from that branch in "dry run" mode. @anna-geller proposed the syntax prefect deployment run xxx --dryrun. An alternative approach would be to use different workspaces, including a playground/pre-production workspace that does not have access to real data. The two approaches may be complementary (i.e. we may want a dry run mode as well as support for environment labels of some sort).

Additional context

Additionally, because flows can generate their dependency graphs dynamically, static code analysis cannot produce the graph in the general case. A dry run mode would let us visualize the execution of a flow without actually mutating external state.
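One way this could look, sketched in plain Python under some assumptions: `DryRunRecorder` is a hypothetical helper, the decorated functions stand in for tasks, and the recorded edges are caller-to-callee calls rather than true data dependencies. The point is only that the graph comes from actually running the logic, so dynamic branching and loops are captured.

```python
import functools


class DryRunRecorder:
    """Records which step called which while the program logic runs."""

    def __init__(self):
        self.edges = []          # (caller, callee) pairs in call order
        self._stack = ["<root>"]  # names of the steps currently running

    def step(self, name):
        def decorator(fn):
            @functools.wraps(fn)
            def wrapper(*args, **kwargs):
                self.edges.append((self._stack[-1], name))
                self._stack.append(name)
                try:
                    return fn(*args, **kwargs)
                finally:
                    self._stack.pop()
            return wrapper
        return decorator


recorder = DryRunRecorder()


@recorder.step("extract")
def extract():
    return [1, 2, 3]


@recorder.step("load")
def load(row):
    # In a dry run, the write to external state is skipped.
    return None


@recorder.step("flow")
def run_flow():
    # Which loads happen depends on runtime data, so the shape of the
    # graph cannot be known from static analysis alone.
    for row in extract():
        if row % 2:
            load(row)
```

After `run_flow()` finishes, `recorder.edges` holds the observed calls and could be fed to any graph renderer.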

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 30 days with no activity. To keep this issue open remove stale label or comment.

github-actions[bot] commented 1 year ago

This issue was closed because it has been stale for 14 days with no activity. If this issue is important or you have more to add feel free to re-open it.