contribsys / faktory

Language-agnostic persistent background job server
https://contribsys.com/faktory/

[idea] Allow users/clients to configure cron jobs #308

Closed alexmic closed 2 years ago

alexmic commented 4 years ago

Hello, we're considering Faktory Pro for running cron jobs but currently one needs to understand how Faktory is deployed to be able to alter the cron configuration. Having a simple UI for adding/removing cron jobs would be ideal -- like Heroku does with its Scheduler addon.

mperham commented 4 years ago

My take is that cron job configuration is no different from source code: it needs to be part of the deployment. You need to source control that config so you can see versions and changes over time, roll back to earlier versions, etc. Providing an ad hoc text box to update configuration bypasses all of that.

IOW, automated, version controlled deployments are seen as important for quality software engineering. This request goes against that practice in my view. wdyt?

scottrobertson commented 4 years ago

Agreed. Please don't allow a way to edit these via the UI. And if you do, it needs to be something we can disable.

jbielick commented 4 years ago

How do folks feel about clients being able to send the CRON declarations?

IMO there is a real use-case for "modify scheduled jobs without touching server config". Feels pretty similar to sidekiq schedule living in the app code, not the infrastructure definitions.

alexmic commented 4 years ago

I agree with the sentiment of versioning config changes, and it's fair enough to want to avoid editing via the UI (all of that can be audited, but it's extra work). My thinking is very much aligned with @jbielick -- there's configuration needed to operate Faktory and configuration needed to use Faktory.

For example, Faktory can be exposed as an internal service by a platform team so that other feature teams can schedule their jobs. Having to educate a team on how to operate and deploy Faktory in order to add a cron job is overhead IMO. Each team can version their cron definitions in their application code.

mperham commented 4 years ago

I can see the argument that cron jobs change frequently and are versioned with the app code to be deployed, whereas infrastructure configuration (e.g. your statsd config) is usually much more stable: you set it once and it doesn't change for years.

So when you deploy your app code, how would you expect to load the config? If we are pushing this role to the worker, will each worker package need to implement this? Should each worker package provide a hook on boot? I'm trying to think through how this should be implemented... Ideas welcome.

jbielick commented 4 years ago

Throwing out some ideas:

  1. Every "schedule" has a name (unique identifier).

It's difficult to see how scheduled jobs are managed (and removed) without an identifier that can be used to "clear", "add to", etc.

  2. A client/worker can upsert the named schedule after connect and greeting.

Could be:

SCHEDULE myschedname [{...}, {...}]
                        ^ same format as the Config/TOML structure

Storage of scheduled items would need to be keyed off of the schedule name.
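
A minimal sketch of how a client library might build and parse such a command. Everything here is an assumption for illustration: `SCHEDULE` is the verb proposed above and is not an existing Faktory command, the payload is serialized as a JSON array, the line ending follows the `VERB payload\r\n` style of existing commands, and the function names and job fields are hypothetical.

```python
import json


def build_schedule_command(name, entries):
    """Serialize a hypothetical `SCHEDULE <name> <json-array>` command line."""
    if not name or any(ch.isspace() for ch in name):
        raise ValueError("schedule name must be non-empty and contain no whitespace")
    # Compact, key-sorted JSON so the same schedule always yields the same bytes.
    payload = json.dumps(entries, separators=(",", ":"), sort_keys=True)
    return f"SCHEDULE {name} {payload}\r\n"


def parse_schedule_command(line):
    """Server-side counterpart: split the line back into (name, entries)."""
    # maxsplit=2 keeps spaces inside the JSON payload (e.g. cron expressions) intact.
    verb, name, payload = line.rstrip("\r\n").split(" ", 2)
    if verb != "SCHEDULE":
        raise ValueError(f"unexpected verb: {verb}")
    entries = json.loads(payload)
    if not isinstance(entries, list):
        raise ValueError("payload must be a JSON array")
    return name, entries


# Illustrative entries; the field names are assumed to mirror the cron config.
entries = [
    {"schedule": "0 3 * * *", "job": {"type": "DailyCleanup", "queue": "default"}},
    {"schedule": "0 * * * *", "job": {"type": "RefreshCache", "queue": "default"}},
]
line = build_schedule_command("myschedname", entries)
name, parsed = parse_schedule_command(line)
```

The name travels as a separate token so the server can key storage on it without inspecting the payload.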

I think the two things I would prioritize are:

mperham commented 4 years ago

I'm not sure why we would need a name for each schedule? I don't want Faktory to have to treat cron jobs as distinct entities to be managed. As I envision it, any update mechanism would upload the entire TOML. Faktory would reload the set of cron jobs from that data. Remove one? Add one? Upload the new TOML.
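
To make the "upload the whole thing" idea concrete, here is a rough sketch of replace-everything semantics. It assumes the uploaded TOML has already been parsed into a list of entries with `schedule` and `job.type` fields; the `reload_cron` function and the diff it reports are hypothetical, not how Faktory is implemented.

```python
def reload_cron(current, uploaded):
    """Replace the whole cron set; return (active, added, removed).

    Entries are never updated individually. The diff is computed only so the
    change can be logged; the uploaded list always wins.
    """
    def key(entry):
        return (entry["schedule"], entry["job"]["type"])

    old_keys = {key(e) for e in current}
    new_keys = {key(e) for e in uploaded}
    added = sorted(new_keys - old_keys)
    removed = sorted(old_keys - new_keys)
    return list(uploaded), added, removed


cleanup = {"schedule": "0 3 * * *", "job": {"type": "DailyCleanup"}}
cache = {"schedule": "0 * * * *", "job": {"type": "RefreshCache"}}
report = {"schedule": "30 6 * * 1", "job": {"type": "WeeklyReport"}}

# Removing one job and adding another is a single upload of the new full set.
active, added, removed = reload_cron([cleanup, cache], [cache, report])
```

With this model there is no per-job identity to manage: whatever is absent from the upload is gone after the reload.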

jbielick commented 4 years ago

I was imagining a team with multiple applications and one Faktory server or several teams (each with one or more apps) that share a Faktory server. In both cases the Faktory server is just the distributed job queue / message bus. This could be because all apps communicate with each other via Faktory, or because each app is "micro"-sized and they all need a job queue so they all use faktory.

Let's say they have a node app / client, N. And a ruby app / client, R. And a go app / client, G. The apps communicate and share jobs through a single faktory server. Additionally, each app has jobs that it wants to schedule (possibly pertinent to just that app).

So app N wants to schedule a daily cleanup job that N has job code for. App R has an hourly update job to keep a cache fresh, a daily job for emailing, and others. R and G have the code for these jobs.

Team A works on app N. Team B works on apps R and G.

Team A needs to schedule a job and makes a PR in N's repo. Team B does the same thing and makes a PR in R and G's repos.

When the app / client N connects, it should be able to schedule jobs without clobbering R and G. Similarly, when R and G add scheduled jobs, it should not clobber N's.

Essentially, each app / client could declare a "schedule name" or namespace so that it could upsert its isolated schedule. When pushing the schedule, all jobs would be pushed and replace the current within that namespace or name. No individual cron managing or naming. It's the whole schedule at once, declaratively—as if the config files were being reloaded.

If the preferable approach is to have one single schedule and only let one client upsert it, naming / namespacing isn't necessary, but I think that has some awkward tradeoffs.
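
A small sketch of the namespaced upsert described above, assuming an in-memory store keyed by schedule name. The class, method names, and namespace strings are hypothetical; nothing like this exists in Faktory today.

```python
class NamespacedSchedules:
    """Holds one complete schedule per namespace; an upsert replaces only that namespace."""

    def __init__(self):
        self._by_name = {}

    def upsert(self, name, entries):
        # Whole-schedule replacement within the namespace, no per-job management.
        self._by_name[name] = list(entries)

    def all_entries(self):
        # The effective cron set is the union of every namespace, in name order.
        return [e for name in sorted(self._by_name) for e in self._by_name[name]]


cleanup = {"schedule": "0 3 * * *", "job": {"type": "DailyCleanup"}}
cleanup_v2 = {"schedule": "0 4 * * *", "job": {"type": "DailyCleanup"}}
cache = {"schedule": "0 * * * *", "job": {"type": "RefreshCache"}}
email = {"schedule": "0 8 * * *", "job": {"type": "DailyEmail"}}

store = NamespacedSchedules()
store.upsert("app-n", [cleanup])       # Team A deploys N
store.upsert("app-r", [cache, email])  # Team B deploys R
store.upsert("app-n", [cleanup_v2])    # N redeploys; R's schedule is untouched
effective = store.all_entries()
```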

mperham commented 4 years ago

Faktory is not designed to be shared across apps. It is not a message queue. It can be used as such but as you’ve laid out, there are awkward edge cases like this.

jbielick commented 4 years ago

It's not designed to push jobs from one language and run them in another?

mperham commented 4 years ago

It is, but there are cases where that could be judged as two aspects of the same app. I think of one team, one Faktory when building features and don’t try to solve one Faktory, N teams.

jbielick commented 4 years ago

Ok, so maybe it's a simpler issue and there's only one schedule. The monolith or "leader" app always declares it?

mperham commented 4 years ago

Exactly. Shopify had a recent blog post about their giant monolith, which has helper services in different languages where necessary.

alexmic commented 4 years ago

> I think of one team, one Faktory when building features and don’t try to solve one Faktory, N teams.

Perhaps you should make this clear in the wiki because that would make Faktory much more expensive to use in a SOA. I think of Faktory like a database server -- I can place multiple apps on the same server, each one with their own database, but sometimes it's necessary to give one app its own server because its workload demands are very different. I don't allow apps to schedule tasks for each other though, so one app with a helper service in another language would still be maintained by the same team. Is this thinking compatible with how you see Faktory deployed?

> I'm not sure why we would need a name for each schedule? I don't want Faktory to have to treat cron jobs as distinct entities to be managed. As I envision it, any update mechanism would upload the entire TOML. Faktory would reload the set of cron jobs from that data. Remove one? Add one? Upload the new TOML.

I think that's good enough for the majority of cases. Forcing the schedule in one place makes it easier to see a consolidated view of schedule changes.

> So when you deploy your app code, how would you expect to load the config? If we are pushing this role to the worker, will each worker package need to implement this? Should each worker package provide a hook on boot? I'm trying to think through how this should be implemented... Ideas welcome.

I think as a first step scheduling should be added to the Faktory API, so that client libraries can implement it. I don't find it necessary for a worker package to implement this, but it would be a welcome addition. A flag like -s /path/to/schedule.toml could work.

Just having it in the clients allows for many deployment options. For example, instead of having a monolith/leader app, one can extract the schedule into a separate cron app containing just the TOML, which can then be configured to post the schedule after a change has been deployed. That would also satisfy the one Faktory, N teams use case.
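
As a rough illustration of the `-s` flag idea, the sketch below reads the schedule file at boot and hands its contents to a callback. The `main` function and the `post_schedule` callback are hypothetical stand-ins for whatever API call a client library would eventually expose, and the sample TOML is only illustrative.

```python
import argparse
import os
import tempfile


def main(argv, post_schedule):
    """Read the schedule file named by -s and pass its text to `post_schedule`.

    Returns True if a schedule was posted, False if no -s flag was given.
    """
    parser = argparse.ArgumentParser(prog="worker")
    parser.add_argument("-s", "--schedule", help="path to a cron schedule TOML file")
    args = parser.parse_args(argv)
    if args.schedule is None:
        return False
    with open(args.schedule, encoding="utf-8") as fh:
        post_schedule(fh.read())
    return True


SAMPLE = '[[cron]]\n  schedule = "0 3 * * *"\n  [cron.job]\n    type = "DailyCleanup"\n'

# Stand-in for the upload call: collect what would have been sent.
posted = []
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "schedule.toml")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(SAMPLE)
    uploaded = main(["-s", path], posted.append)
```

The same entry point would serve a standalone cron app: its deploy step runs it once with `-s` and exits.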

mperham commented 4 years ago

@alexmic Different teams should not share infrastructure because then it becomes a political and accounting issue. Who pays for it? Who upgrades it? What if one team makes a change which causes Faktory to die? Now the other team is down too...

If a team maintains two apps, those apps can use the same Faktory but will need to account for things like a single cron schedule.

I will ponder how workers can update the schedule and any issues therein...