Nike-Inc / brickflow

Pythonic Programming Framework to orchestrate jobs in Databricks Workflow
https://engineering.nike.com/brickflow/
Apache License 2.0
183 stars 36 forks

[FEATURE] Continuous workflow schedule support #120

Open jeroomvanbever-juvo opened 3 months ago

jeroomvanbever-juvo commented 3 months ago

Is your feature request related to a problem? Please describe. In Databricks, workflows can be scheduled or triggered. In most cases, batch workflows are scheduled with a Quartz cron expression. But Databricks also supports the so-called "continuous" schedule (not to be confused with an every-second cron schedule, which the UI also labels as continuous). This schedule is useful for streaming jobs, such as continuously reading from Kafka. It should also not be confused with the trigger set on the stream writer, which is a Spark streaming feature. When a Databricks workflow has the continuous schedule, Databricks makes sure that a single job run is alive at any time. Suppose you have a streaming workflow and the cluster fails: this setting makes sure the workflow is restarted automatically. If it fails again, restarts follow an exponential backoff until the schedule is paused. Note: this feature is only available for workflows with exactly one task.

At the moment it is not possible to set this special "continuous" schedule using brickflow.

Cloud Information

Describe the solution you'd like An extra boolean property on the Workflow class called "continuous" that is mutually exclusive with schedule_quartz_expression (or, alternatively, accept the string "continuous" as a quartz expression and do the needed translation). This then needs to be translated into the proper bundle property for a workflow.
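
To make the proposal concrete, here is a minimal sketch of how the mutual exclusivity and the translation could look. It is not brickflow's actual implementation: `WorkflowScheduleSketch` and `to_job_resource` are hypothetical names used only for illustration, and the `schedule` branch is simplified (a real job schedule also carries fields such as a timezone).

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class WorkflowScheduleSketch:
    """Hypothetical stand-in for the scheduling fields of a workflow.

    This is NOT brickflow's Workflow class; only `schedule_quartz_expression`
    is a name taken from the issue, `continuous` is the proposed property.
    """

    name: str
    schedule_quartz_expression: Optional[str] = None
    continuous: bool = False  # proposed property (assumption)

    def __post_init__(self) -> None:
        # The two scheduling modes are mutually exclusive.
        if self.continuous and self.schedule_quartz_expression is not None:
            raise ValueError(
                "continuous and schedule_quartz_expression cannot both be set"
            )

    def to_job_resource(self) -> Dict[str, Any]:
        """Build the job entry that would sit under resources.jobs.<workflow_name>."""
        job: Dict[str, Any] = {"name": self.name}
        if self.continuous:
            # Shape assumed from the bundle fragment under "Additional context".
            job["continuous"] = {"pause_status": "UNPAUSED"}
        elif self.schedule_quartz_expression is not None:
            # Simplified: only the cron expression is emitted here.
            job["schedule"] = {
                "quartz_cron_expression": self.schedule_quartz_expression
            }
        return job


if __name__ == "__main__":
    print(WorkflowScheduleSketch(name="kafka_stream", continuous=True).to_job_resource())
```

Raising at construction time, rather than silently preferring one setting, keeps a misconfigured workflow from being deployed with a schedule the author did not intend.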

Describe alternatives you've considered We tried setting the quartz expression to every second * (shown as "continuous" in the Databricks UI), but this fires a new run every second and skips it because one is already running. It also lacks the exponential backoff.

Additional context Assumed look in the bundle.json:

resources:
  jobs:
    <workflow_name>:
      name: ...
      continuous:
        pause_status: UNPAUSED

pariksheet commented 1 month ago

PR - https://github.com/Nike-Inc/brickflow/pull/138