spotify / luigi

Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
Apache License 2.0

Add option to remove event handler from luigi.Task #3282

Closed starhel closed 5 months ago

starhel commented 6 months ago

Description

I'd like to introduce a luigi.Task.remove_event_handler method that removes a previously registered callback. Calling it with the event and the callback removes that handler from the internal registry.
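For readers unfamiliar with the registry, here is a minimal sketch of the idea. It uses a simplified stand-in class (`MiniTask`, a hypothetical name) instead of Luigi itself, and it assumes the registry maps class to event to a set of callbacks, as the fixture's private-field access suggests. It is not the code from this PR.

```python
class MiniTask:
    """Simplified stand-in for luigi.Task's event registry (not Luigi's code)."""

    # Assumed layout: class -> event -> set of callbacks.
    _event_callbacks = {}

    @classmethod
    def event_handler(cls, event):
        """Decorator that registers a callback for `event` on this class."""
        def wrapped(callback):
            cls._event_callbacks.setdefault(cls, {}).setdefault(event, set()).add(callback)
            return callback
        return wrapped

    @classmethod
    def remove_event_handler(cls, event, callback):
        """Unregister `callback` for `event`; raises KeyError if it is not registered."""
        cls._event_callbacks[cls][event].remove(callback)


FAILURE = "failure"  # hypothetical event label used only in this sketch


@MiniTask.event_handler(FAILURE)
def on_failure(task, exc):
    print(f"{task} failed: {exc}")


# The proposed call: the same (event, callback) pair that was registered.
MiniTask.remove_event_handler(FAILURE, on_failure)
```

One design question this sketch leaves open is whether removing an unregistered callback should raise or be a silent no-op; here it raises, because `set.remove` does.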

Motivation and Context

I have many tests in which I need to collect all exceptions raised in a task graph. Since luigi.build does not return a list of exceptions, I've found that adding an event handler is the simplest way to achieve this. However, there is currently no way to remove the handler in the teardown. As a workaround, I rely on a hack that reaches into a private field of the Task class:

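import luigi
import pytest


@pytest.fixture
def luigi_exceptions():
    # LuigiExceptionCapture (not shown) collects exceptions and works with multiprocessing
    capture = LuigiExceptionCapture()

    # Register a handler on the base class so failures of every task are captured
    @luigi.Task.event_handler(luigi.Event.FAILURE)
    def failure_handler(task, exc):
        capture.add(task, exc)

    yield capture

    # Teardown workaround: remove the handler through the private registry
    luigi.Task._event_callbacks[luigi.Task][luigi.Event.FAILURE].remove(failure_handler)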
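
With the proposed method, the teardown would no longer need the private field. The sketch below shows the same register-then-remove pattern as a context manager. It assumes `remove_event_handler(event, callback)` exists as described in this PR; `capture_failures` is a hypothetical helper, and the plain list it uses does not work across processes, unlike the capture object in the fixture above.

```python
from contextlib import contextmanager


@contextmanager
def capture_failures(task_cls, failure_event):
    """Collect (task, exception) pairs while the block runs, then unregister.

    Assumes task_cls offers event_handler(event) and the proposed
    remove_event_handler(event, callback).
    """
    captured = []  # plain list: single-process only

    def failure_handler(task, exc):
        captured.append((task, exc))

    # Register without decorator syntax so the same function object can be removed later.
    task_cls.event_handler(failure_event)(failure_handler)
    try:
        yield captured
    finally:
        # Public replacement for the private-field workaround.
        task_cls.remove_event_handler(failure_event, failure_handler)
```

With Luigi this would presumably be called as `capture_failures(luigi.Task, luigi.Event.FAILURE)`, and the `try`/`finally` ensures the handler is removed even if the body raises.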

Have you tested this? If so, how?

The fixture above is used in my projects, and I've also added a unit test. :)

starhel commented 5 months ago

Hi @dlstadther, just wanted to ask if you could spare a moment to review and merge this PR. If you need any additional information or have any questions about the code, please don't hesitate to reach out.