timgit / pg-boss

Queueing jobs in Postgres from Node.js like a boss
MIT License
2.13k stars, 158 forks

pg-boss vs. rabbitmq vs. redis kue, bull for job scheduling #94

Closed: booboothefool closed this issue 6 years ago

booboothefool commented 6 years ago

I've been evaluating these for a job scheduler, and I think pg-boss is the solution, but wanted to get some confirmation.

Node Schedule: I first started with https://github.com/node-schedule/node-schedule. Problem: It doesn't persist jobs, and I'd have to roll my own functionality for multiple Node.js instances.

Agenda: https://github.com/agenda/agenda. Problem: Looks great, but screw MongoDB. My current stack uses PostgreSQL, Redis, and RabbitMQ, so I didn't want to introduce another dependency.

RabbitMQ: Problem: I'm able to schedule things with a plugin, https://github.com/rabbitmq/rabbitmq-delayed-message-exchange, but what bugs me about RabbitMQ is the inability to cancel jobs. Say I have a job scheduled with a jobId like <jobName>-<itemId>, and then something happens so that this job is no longer needed. I can't remove it from the queue by jobId. This is the only thing preventing me from using RabbitMQ, because I am already using it for queueing everything else that doesn't have a time delay.

Redis Kue/Bull: Problem: I've searched and mostly heard bad things about Kue. Apparently Bull is more stable, but I don't know enough about it. It also lacks acknowledgements.

pg-boss stands out to me because it archives completed (and expired/failed?) jobs in a table. If only it had a nice monitoring UI, haha. In addition, I trust PostgreSQL for guarantees more than Redis. Currently, I am only using Redis as a cache. Thoughts?

timgit commented 6 years ago

First of all, as the author of pg-boss and since I still use it in our product, I'm a big advocate for it and you may even accuse me of bias. That disclaimer aside, I have opinions about these items, which I think is the essence of this issue.

booboothefool commented 6 years ago

I really appreciate the response and letting me bounce ideas off ya!

I am using node-schedule similarly, I think. At the moment, it schedules jobs that after some time, do an API call, and depending on the response may or may not publish a message to RabbitMQ to do other stuff, similar to your "pg-boss handles it from there". My concern with node-schedule is I'm not sure how to use it with multiple Node.js instances, assuming there would be issues, and of course, there's the case that the schedule is just lost if the server crashes.

RabbitMQ has been great so far, but I have no idea how one is supposed to 1. delay jobs with it and 2. cancel jobs with it. It seems I can at least schedule jobs with the awkward plugin, but I have cases where thousands of jobs that would make API calls can all of a sudden become irrelevant, so I'd like to cancel them by their ids or something. I may also be using RabbitMQ incorrectly and should use a database instead: https://stackoverflow.com/questions/49611402/delay-message-rabbitmq-on-nodejs
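A minimal sketch of the delay-and-cancel-by-id behaviour described above, assuming a deterministic jobId of the form <jobName>-<itemId>. The `DelayedJobStore` class and its `schedule`, `cancel`, and `takeDue` methods are hypothetical names for illustration only. They are not the API of pg-boss, Bull, or RabbitMQ, and the in-memory Map stands in for whatever persistent store (for example a database table) would hold the jobs in practice.

```javascript
// Hypothetical in-memory model of delayed jobs that can be cancelled by ID.
// NOT the pg-boss, Bull, or RabbitMQ API; a real store would persist jobs.
class DelayedJobStore {
  constructor() {
    this.jobs = new Map(); // jobId -> { runAt, payload }
  }

  // Schedule (or reschedule) a job under a deterministic ID.
  schedule(jobName, itemId, runAt, payload) {
    const jobId = `${jobName}-${itemId}`;
    this.jobs.set(jobId, { runAt, payload });
    return jobId;
  }

  // Cancel a pending job by ID; returns true if a job was removed.
  cancel(jobId) {
    return this.jobs.delete(jobId);
  }

  // Remove and return every job whose run time has passed.
  takeDue(now) {
    const due = [];
    for (const [jobId, job] of this.jobs) {
      if (job.runAt <= now) {
        due.push({ jobId, ...job });
        this.jobs.delete(jobId);
      }
    }
    return due;
  }
}

// Usage: two delayed jobs, one of which becomes irrelevant before it runs.
const store = new DelayedJobStore();
store.schedule('check-status', 42, 1000, { url: '/items/42' });
store.schedule('check-status', 43, 1000, { url: '/items/43' });
const cancelled = store.cancel('check-status-42');
const due = store.takeDue(1500);
```

Because the ID is derived from the job name and item, the code that decides a job is no longer needed can cancel it without having kept a reference from when it was scheduled.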

I think it's between pg-boss and Bull. Kue is unmaintained according to Automattic creators, and Bull seems pretty reliable with "at least once". https://github.com/OptimalBits/bull#important-notes I'm leaning towards pg-boss though because I really only like Redis as a cache, whereas PostgreSQL has never let me down, ever.

And here's a cool discussion: https://news.ycombinator.com/item?id=14730685

PostgreSQL => best database
RabbitMQ => best message queue
Redis => best cache
??? => best (distributed, guaranteed) job scheduler 😛

timgit commented 4 years ago

@booboothefool, thought I'd mention on this old issue that pg-boss now has a distributed scheduler as of version 5. We're no longer using node-schedule in our app, but instead relying on pg-boss scheduling, which uses the cron-parser package.