deis / workflow

The open source PaaS for Kubernetes.
https://deis.com/workflow/
MIT License

Workflow Integration for Kubernetes Jobs and Cron Jobs #652

Open felixbuenemann opened 7 years ago

felixbuenemann commented 7 years ago

It would be great if Deis Workflow provided a way to trigger one-off Kubernetes Jobs and scheduled Cron Jobs for batch processing.

There should be a deis job command, similar to deis run, that runs a custom command in the app. Unlike deis run, it should not be interactive (the CLI doesn't wait for completion) and should not be limited by the 20-minute kill timeout. In addition, these tasks should be started with the currently deployed release, but not be killed if a new release is deployed.

If this feature were present, it could then be further extended to Kubernetes Cron Jobs, which would provide a reliable way to schedule recurring jobs. While it is possible to deploy schedulers as their own apps or proc types, this is best left to the cluster.

Cron jobs could then be scheduled using a command like deis cron:add <name> <cron> <command>, e.g. deis cron:add daily_import 30 5 * * * rake import:prices import:inventory. Deis would then create and update K8s Cron Jobs to trigger these commands with the current app release and environment.
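To make the proposal concrete, here is a minimal sketch of how such a command might be translated into a Kubernetes CronJob manifest. Everything in it is an assumption for illustration: `deis cron:add` does not exist, the function name, the image reference, and the one-namespace-per-app layout are hypothetical, and the `batch/v1` API version applies to recent clusters only (older ones exposed CronJobs under alpha or beta API groups).

```python
import shlex


def cron_job_manifest(app, name, schedule, command, image):
    """Build a CronJob manifest (as a plain dict) for a hypothetical `deis cron:add`.

    Assumptions: the app lives in a namespace named after it, and the image's
    entrypoint executes whatever is passed as container args.
    """
    # Kubernetes object names must be DNS-compatible, so underscores are replaced.
    k8s_name = f"{app}-{name}".replace("_", "-").lower()
    return {
        "apiVersion": "batch/v1",  # assumption: older clusters used a different group/version
        "kind": "CronJob",
        "metadata": {"name": k8s_name, "namespace": app, "labels": {"app": app}},
        "spec": {
            "schedule": schedule,  # standard five-field cron expression
            "jobTemplate": {
                "spec": {
                    "template": {
                        "spec": {
                            "restartPolicy": "Never",
                            "containers": [
                                {
                                    "name": k8s_name,
                                    "image": image,  # image of the current release
                                    "args": shlex.split(command),
                                }
                            ],
                        }
                    }
                }
            },
        },
    }


manifest = cron_job_manifest(
    app="myapp",
    name="daily_import",
    schedule="30 5 * * *",
    command="rake import:prices import:inventory",
    image="registry.example/myapp:v7",  # hypothetical image reference
)
```

On each deploy, the controller would have to rewrite the `image` (and environment) of every stored CronJob so that future runs use the new release.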

mmacvicar commented 7 years ago

+1

bacongobbler commented 7 years ago

The current idea is that we want to do a deis run --detach which is essentially running a job without attaching to it/waiting for a status code. That should solve the deis job use case without introducing another command to the CLI.
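As a rough sketch of what a detached run could map to on the Kubernetes side, the snippet below builds a one-off Job manifest pinned to a specific release image. This is not how Workflow implements deis run; the function name, naming scheme, labels, and image reference are hypothetical, and it assumes the image's entrypoint executes the container args.

```python
import shlex
import uuid


def detached_job_manifest(app, release, image, command):
    """Build a one-off Job manifest (plain dict) for a hypothetical detached run."""
    suffix = uuid.uuid4().hex[:6]  # random suffix keeps repeated runs from colliding
    name = f"{app}-{release}-run-{suffix}"
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {
            "name": name,
            "namespace": app,  # assumption: one namespace per app
            "labels": {"app": app, "version": release},
        },
        "spec": {
            "backoffLimit": 0,  # do not retry a failed one-off command
            # activeDeadlineSeconds is deliberately left unset, so Kubernetes
            # imposes no time limit on the job.
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "run",
                            "image": image,  # pinned to the release that started the job
                            "args": shlex.split(command),
                        }
                    ],
                }
            },
        },
    }


job = detached_job_manifest("myapp", "v7", "registry.example/myapp:v7", "rake import:prices")
```

Because the Job carries its own pod template with a fixed image, a later deploy that replaces the app's regular pods would not by itself terminate it.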

> Cron jobs could then be scheduled using a command like deis cron:add <name> <cron> <command> eg. deis cron:add daily_import 30 5 * * * rake import:prices import:inventory. Deis would then create and update K8s Cron Jobs to trigger these commands with the current app release and environment.

I think we've deferred this feature because users handle this with a custom process type that runs a command at certain intervals, i.e.

web: my_process
timer: while true; do run_my_job; sleep 1h; done

And that usually satisfies this use case. But, if you feel like taking a crack at a PR we would be happy to take a look at it.

felixbuenemann commented 7 years ago

Unfortunately, scheduling jobs is usually much more involved than a simple sleep loop.

For one thing you need to express something like "run this job every 30 minutes, but only on work days during work hours".
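With cron syntax that rule is a single expression. The sketch below assumes "work days" means Monday to Friday and "work hours" means 09:00 through 17:59, and uses a deliberately simplified matcher (no month or day names, no `7` for Sunday, and day-of-month and day-of-week are combined with AND rather than cron's usual OR) just to show which timestamps would fire.

```python
from datetime import datetime

# Every 30 minutes, hours 9-17, Monday (1) to Friday (5).
WORK_HOURS = "*/30 9-17 * * 1-5"


def field_matches(field, value, minimum):
    """Return True if one cron field (supports *, lists, ranges, steps) matches value."""
    for part in field.split(","):
        rng, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if rng == "*":
            low, high = minimum, value  # any value from the field's minimum upward
        elif "-" in rng:
            low_text, high_text = rng.split("-")
            low, high = int(low_text), int(high_text)
        else:
            low = high = int(rng)
        if low <= value <= high and (value - low) % step == 0:
            return True
    return False


def cron_matches(expr, when):
    """Simplified check of a five-field cron expression against a datetime."""
    minute, hour, dom, month, dow = expr.split()
    weekday = when.isoweekday() % 7  # cron numbering: 0 = Sunday ... 6 = Saturday
    return (
        field_matches(minute, when.minute, 0)
        and field_matches(hour, when.hour, 0)
        and field_matches(dom, when.day, 1)
        and field_matches(month, when.month, 1)
        and field_matches(dow, weekday, 0)
    )


monday_morning = cron_matches(WORK_HOURS, datetime(2024, 1, 8, 9, 30))   # a Monday
saturday_morning = cron_matches(WORK_HOURS, datetime(2024, 1, 6, 9, 30))  # a Saturday
```

Expressing the same rule in a sleep loop means reimplementing this calendar logic inside the app.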

Next you need to make sure there's only one scheduler running (this is likely to cause trouble during deployments). It's also easy to lose scheduled invocations, because the app was doing a rolling deployment just when it was supposed to schedule a job.

Then it gets more complicated if you need to ensure that only one job of a certain kind is run at a time, and you start taking a distributed lock on the database to ensure that.

Kubernetes Cron Jobs solve all those issues and have a cluster-wide view of what's going on.
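For illustration, two fields of the CronJob spec address the overlap and missed-run problems directly: `concurrencyPolicy` and `startingDeadlineSeconds`. The helper below is a hypothetical sketch (its name and the 300-second default are assumptions) that adds them to an existing CronJob `spec` dict.

```python
def with_scheduling_guards(cron_job_spec, starting_deadline_seconds=300):
    """Return a copy of a CronJob spec with overlap and missed-run handling set."""
    guarded = dict(cron_job_spec)  # shallow copy; the caller's dict is left untouched
    # Forbid: skip a scheduled run while the previous run of this job is still active,
    # which replaces the hand-rolled distributed lock.
    guarded["concurrencyPolicy"] = "Forbid"
    # A run that missed its scheduled time may still be started within this window.
    guarded["startingDeadlineSeconds"] = starting_deadline_seconds
    return guarded


base_spec = {"schedule": "*/30 9-17 * * 1-5"}
guarded_spec = with_scheduling_guards(base_spec)
```

`concurrencyPolicy` also accepts `Allow` (the default) and `Replace`, so the "only one at a time" guarantee is opt-in per job.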

While it's not impossible to build this yourself, it's certainly not easy to do it right.

nrwiersma commented 7 years ago

Empire solves this issue by using an extended Procfile. Perhaps this helps.

bacongobbler commented 7 years ago

This would be a great PR if someone wishes to tackle this.

maccman commented 7 years ago

We'd be happy to sponsor dev

bacongobbler commented 7 years ago

Feel free to hit us up in the community Slack channel if you need help getting started! We can help with the dev workflow better there than over a ticket. :)

maccman commented 7 years ago

Oh, I mean we'd be happy to sponsor dev with 💰 - we unfortunately have no one with Python experience.

Cryptophobia commented 6 years ago

This issue was moved to teamhephy/workflow#45