dagster-io / dagster

An orchestration platform for the development, production, and observation of data assets.
https://dagster.io
Apache License 2.0

run-level resource cleanup / lifecycle hooks #12707

Open sryza opened 1 year ago

sryza commented 1 year ago

It would be useful to be able to run some cleanup code at the end of a run. Examples:

I think it would likely make sense to tie these to resources.

Relevant requests:

MSigno commented 1 year ago

Hi - any progress on this issue?

eloreaux commented 1 year ago

Any progress on this? I would like to delete a tmp folder after the run is complete.

judahrand commented 1 year ago

Is there any way to do this currently? Any workarounds?

sryza commented 1 year ago

@judahrand the closest that we have is op-level hooks, but they don't provide the full functionality discussed here. https://docs.dagster.io/concepts/ops-jobs-graphs/op-hooks

Here's a way that one integration has hacked around this: https://github.com/dagster-io/dagster/blob/master/python_modules/libraries/dagster-mlflow/dagster_mlflow/hooks.py#L33

lintonye commented 9 months ago

Isn't this what's requested? https://docs.dagster.io/concepts/resources#lifecycle-hooks

When a resource is initialized during a Dagster run, the setup_for_execution method is called. This method is passed an InitResourceContext object, which contains the resource's config and other run information. The resource can use this context to initialize any state it needs for the duration of the run.

Once a resource is no longer needed, the teardown_after_execution method is called. This method is passed the same context object as setup_for_execution. This method can be useful for cleaning up any state that was initialized in setup_for_execution.

setup_for_execution and teardown_after_execution are each called once per run, per process.
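To make the quoted behaviour concrete, here is a minimal sketch of the tmp-folder cleanup case mentioned earlier in the thread, written against the two hook names from the docs excerpt. It is an illustration under assumptions, not Dagster's implementation: in a real project the class would subclass dagster's ConfigurableResource and Dagster itself would call the hooks. TmpDirResource, simulate_run and write_scratch_file are hypothetical names, and simulate_run is only a stand-in for the framework so the example runs without Dagster installed.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Optional


class TmpDirResource:
    """Scratch directory that exists only between setup and teardown.

    Assumption: in Dagster this would subclass ConfigurableResource; it is a
    plain class here so the sketch has no third-party dependencies.
    """

    def __init__(self) -> None:
        self.path: Optional[Path] = None

    def setup_for_execution(self, context) -> None:
        # Create the scratch directory when the resource is initialized.
        self.path = Path(tempfile.mkdtemp(prefix="run_"))

    def teardown_after_execution(self, context) -> None:
        # Remove the scratch directory once the resource is no longer needed.
        if self.path is not None:
            shutil.rmtree(self.path, ignore_errors=True)
            self.path = None


def simulate_run(resource: TmpDirResource, step: Callable, context=None):
    """Hypothetical stand-in for the framework: setup, run the step, teardown."""
    resource.setup_for_execution(context)
    try:
        return step(resource)
    finally:
        # Teardown runs even if the step raises.
        resource.teardown_after_execution(context)


def write_scratch_file(resource: TmpDirResource) -> Path:
    # A step that uses the scratch directory while the resource is alive.
    target = resource.path / "scratch.txt"
    target.write_text("intermediate data")
    return target


resource = TmpDirResource()
written = simulate_run(resource, write_scratch_file)
```

Because the docs say the hooks are called once per run, per process, a teardown like this is scoped to each process rather than to the run as a whole. Whether that covers the run-level cleanup requested in this issue depends on how many processes the run uses.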