contribsys / faktory

Language-agnostic persistent background job server
https://contribsys.com/faktory/

Is it possible to run the faktory server in a HA setup #484

Closed jessecollier closed 1 month ago

jessecollier commented 1 month ago

I'm interested in Faktory Enterprise, but the wiki suggests that a remote Redis DB will not support multiple Faktory instances. I also do not see any docs (including the ECS and k8s ones) about running more than one instance of the Faktory server.

Given that, I assume the failover strategy in a production environment relies on rescheduling the Faktory server instance. This also implies downtime for both scheduling new jobs and processing jobs while the Faktory server is being replaced during a failover.
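To make the publisher-side impact of that downtime window concrete, here is a minimal sketch of a producer that retries a push with exponential backoff while the server is being replaced. It is an illustration only, not part of any Faktory client: `push_with_retry`, the `push` callable, and the `attempts` and `base_delay` parameters are all hypothetical names, and the sketch assumes a failed push surfaces as an `OSError` (which includes `ConnectionError`).

```python
import time
from typing import Callable


def push_with_retry(
    push: Callable[[dict], None],   # hypothetical: whatever sends the job to the server
    job: dict,
    attempts: int = 5,              # hypothetical default
    base_delay: float = 0.5,        # seconds before the first retry (hypothetical default)
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try to push a job, doubling the wait after each failed attempt.

    Returns True once a push succeeds, or False if every attempt failed.
    On False the job has NOT been published and the caller still holds it.
    """
    for attempt in range(attempts):
        try:
            push(job)
            return True
        except OSError:
            # Assumption: an unreachable server raises OSError/ConnectionError.
            if attempt == attempts - 1:
                break  # no point sleeping after the final attempt
            sleep(base_delay * (2 ** attempt))
    return False
```

Retrying only covers outages shorter than the total backoff time. When `push_with_retry` returns `False`, the publisher would still have to persist the job somewhere else or reject the request, which is why retries alone do not give publishers availability independent of the server.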

Are there any supported strategies for running multiple instances in a primary/secondary fashion? Or even running multiple Faktory instances in a cluster setup where a quorum is established and only one "primary" is active at any time?

If this is not the right forum for asking this question, I'm happy to contact support.

mperham commented 1 month ago

The only supported HA functionality is the ability to provide a REDIS_URL so Faktory Enterprise can use a managed HA Redis from a SaaS. Faktory itself does not have a clustered or HA mode, and building one would be so disruptive to the protocol that I'm not sure it's possible to do so while maintaining compatibility.

I'm also a believer that for many use cases, a simple SPOF can be more reliable than a complex HA solution with many more moving parts. "KISS" is always good advice.

jessecollier commented 1 month ago

I can understand the challenges of making Faktory HA, but in our case our job publishers need high availability (99.99% SLA) to be able to publish jobs. One of our criteria is to ensure that every job has at least been published, regardless of the state of the backend async processing.

If Faktory decides to add this in the future, we might be able to reconsider.

Thanks for taking the time to respond!