Netflix / metaflow

Open Source Platform for developing, scaling and deploying serious ML, AI, and data science systems
https://metaflow.org
Apache License 2.0

Notification via @notify #443

Closed: talebzeghmi closed this issue 3 months ago

talebzeghmi commented 3 years ago

Use case: be notified when a Flow fails (say, a daily scheduled Flow), or when it succeeds.

A @notify() decorator that has Metaflow send a notification upon success or failure, per Flow or per Step. It could send email or Slack messages, and be extensible to other notification systems (SNS, etc.).

It would be up to the scheduler (local, AWS Step Functions, KFP) to honor the @notify_base (or @notify_flow) decorator.

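from metaflow import FlowSpec, step  # @notify is the proposed decorator; it does not exist yet

@notify(email_address="oncall@foo.com", on="failure")
@notify(email_address="ai@foo.com", on="success")
class MyFlow(FlowSpec):
    @notify(slack_channel="#foo", on="success")
    @step
    def start(self):
        self.next(self.end)

    @step
    def end(self):
        pass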
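
To make the proposal concrete, here is a minimal sketch of how such a decorator could record its configuration for a scheduler to read later. Everything in it (NotifyRule, notify, rules_for, the _notify_rules attribute) is hypothetical and not part of Metaflow; it uses a plain class instead of a real flow so it runs on its own.

```python
# Hypothetical sketch: none of these names are part of Metaflow.
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class NotifyRule:
    on: str                              # "success" or "failure"
    email_address: Optional[str] = None
    slack_channel: Optional[str] = None


def notify(on: str, email_address: Optional[str] = None,
           slack_channel: Optional[str] = None) -> Callable:
    """Attach a NotifyRule to a class or function without changing its behavior."""
    if on not in ("success", "failure"):
        raise ValueError(f"unsupported event: {on!r}")
    rule = NotifyRule(on, email_address, slack_channel)

    def decorate(target):
        # Copy the existing list so a subclass never mutates its parent's rules.
        rules = list(getattr(target, "_notify_rules", []))
        rules.append(rule)
        target._notify_rules = rules
        return target

    return decorate


def rules_for(target, outcome: str) -> List[NotifyRule]:
    """Return the rules a scheduler would honor for the given outcome."""
    return [r for r in getattr(target, "_notify_rules", []) if r.on == outcome]


@notify(email_address="oncall@foo.com", on="failure")
@notify(email_address="ai@foo.com", on="success")
class MyFlow:
    @notify(slack_channel="#foo", on="success")
    def start(self):
        pass
```

The decorator only stores metadata; actually sending the email or Slack message would be left to whichever scheduler runs the flow, which matches the split of responsibility described above.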

savingoyal commented 3 years ago

Another option could be to first begin with

python flow.py step-functions create --notify-on-error foo+failure@bar.com --notify-on-success foo+success@bar.com

This is the pattern that we follow with Meson internally.

Implementing notify-on-failure for local Metaflow execution will be a bit tricky (Ctrl-C, etc.).
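
One way to see why the local case is awkward: a wrapper process has to tell a non-zero exit apart from a user interrupt. The sketch below is only an illustration using the standard library; run_and_notify and the outcome names are made up here and are not Metaflow features.

```python
# Hypothetical wrapper; the function and outcome names are illustrative only.
import subprocess
import sys
from typing import Callable, Sequence


def run_and_notify(cmd: Sequence[str], notifier: Callable[[str, int], None]) -> str:
    """Run cmd, classify the outcome, and report it through notifier."""
    proc = subprocess.Popen(cmd)
    try:
        returncode = proc.wait()
        outcome = "success" if returncode == 0 else "failure"
    except KeyboardInterrupt:
        # On POSIX, Ctrl-C is delivered to the child as well, so wait for it
        # to exit rather than leaving it orphaned. A second Ctrl-C during
        # this wait would propagate and skip the notification.
        returncode = proc.wait()
        outcome = "interrupted"
    notifier(outcome, returncode)
    return outcome


# Example: a child that exits with status 3 is reported as a failure.
run_and_notify(
    [sys.executable, "-c", "raise SystemExit(3)"],
    lambda outcome, code: print(outcome, code),
)
```

Even this small version leaves gaps (a killed wrapper sends nothing), which is part of why delegating notifications to a production scheduler is the simpler starting point.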

crk-codaio commented 3 years ago

How do you propose that @notify run for every task? Isn't the scheduler responsible for that (similar to @notify_flow)? Or do you plan to edit the entry point (say, in @batch) so that the notification runs after metaflow step foo, or do you intend to run the @notify logic in a separate container after the task is over?

+1 to what Savin said, since notify-on-error will be optimistic (what if the container running notify itself fails, even on success?), depending on your answers to the questions above.

jaskaran-virdi-imprivata commented 9 months ago

For a general case: let's say that, after a Metaflow job completes (success or failure), I would like to return the flow of control to the microservice that triggered the job. The Metaflow documentation talks about triggering workflows by external events, but I didn't see anything about completion handlers. How does one implement that?

savingoyal commented 3 months ago

python flow.py argo-workflows create --notify-on-error foo+failure@bar.com --notify-on-success foo+success@bar.com has been available for a while now, so perhaps this issue can be closed.

@jaskaran-virdi-imprivata - One mechanism would be to use the @catch functionality and, in the end step, execute the desired bit of code depending on whether an error was caught. Yet another mechanism would be to use the runner/deployer API to monitor the status of execution and handle the flow of control in a process outside of the flow. Happy to chat further at chat.metaflow.org
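
The second mechanism (monitoring from outside the flow) can be sketched without tying it to any particular Metaflow call. In this rough illustration, get_status is a stand-in for however the caller reads the run's state, on_done is the hand-off back to the triggering microservice, and the status strings are assumptions; none of these names come from Metaflow.

```python
# Hypothetical sketch: status strings and function names are assumptions.
import time
from typing import Callable

TERMINAL = {"successful", "failed"}


def wait_and_call_back(get_status: Callable[[], str],
                       on_done: Callable[[str], None],
                       poll_seconds: float = 5.0,
                       timeout_seconds: float = 3600.0) -> str:
    """Poll get_status until it reports a terminal state, then invoke on_done once."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in TERMINAL:
            break
        if time.monotonic() >= deadline:
            # Give up and report a timeout instead of polling forever.
            status = "timeout"
            break
        time.sleep(poll_seconds)
    on_done(status)
    return status
```

Because the handler runs in a separate process, it still fires when the flow itself crashes, which a handler living in the end step cannot guarantee.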