tjmehta / coworkers

A RabbitMQ Microservice Framework in Node.js
MIT License

Configurable clustering #19

Closed · williamhogman closed this issue 8 years ago

williamhogman commented 8 years ago

Whoops, accidental post.

tjmehta commented 8 years ago

No problem. Did you overlook something specific in the documentation, or is there anything you have a question about?

williamhogman commented 8 years ago

Yeah, sorta, but it's more of a feature request. I have some workloads that really don't need separate processes (back-office stuff), and I would like to be able to run everything in a single Node process. We run all the serious stuff in Elixir, but we want something light-weight for the stuff that doesn't handle a ton of data. One process per queue is probably fine, but it just feels kinda wasteful for that use case.

At first I didn't think you could change the CPU rule and thought I had to run 40 threads for a tiny Slack-posting task, so I started writing something up, left it in a tab, and accidentally hit CTRL+Enter.

tjmehta commented 8 years ago

The reason it is one process per queue is to: 1) isolate the connections/channels used by each queue from each other, and 2) increase throughput.

As a result, when a channel or connection errors, it only kills (and respawns) the single worker consuming that queue.

You're right. I think the best you can do in your situation, to avoid spinning up many processes, is to use the COWORKERS_WORKERS_PER_QUEUE env var (just set it to 1 for the minimum).

Once you limit the workers per queue to one, even if you have a few queues, multi-process coworkers should still have very low overhead (memory should be the only resource "over-consumed").
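
Roughly, that would look like this (just a sketch, assuming the koa-style queue API from the README; the queue name and handler are placeholders):

```js
// start with: COWORKERS_WORKERS_PER_QUEUE=1 node app.js
const coworkers = require('coworkers')
const app = coworkers()

// a lightweight back-office task: one queue, one worker process
app.queue('slack-posts', function * () {
  // ...do the small amount of work here...
  this.ack = true // ack the message once it has been handled
})

app.on('error', function (err) {
  console.error(err)
})

app.connect()
```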

Is your box very limited and shared with many other services?

williamhogman commented 8 years ago

Thanks for the quick response.

I definitely buy reason number two, although the CPUS/QUEUES formula assumes that all queues contain jobs that are more or less equally expensive; the custom mode goes a long way toward addressing that.

With regard to reason number one, I didn't expect this to be a problem: most AMQP clients I've worked with handle errors and reconnects very gracefully, though I don't know if amqp.node is different. As far as application errors go, I think coworkers handles those very well too.
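
To illustrate what I mean, the application-level reconnect handling usually looks roughly like this with amqp.node (just a sketch, not coworkers-specific):

```js
// rough sketch: re-establish the connection whenever it drops
const amqp = require('amqplib')

function connectWithRetry (url) {
  return amqp.connect(url).then(function (conn) {
    conn.on('error', function (err) {
      console.error('AMQP connection error:', err)
    })
    conn.on('close', function () {
      // back off briefly, then reconnect
      setTimeout(function () { connectWithRetry(url) }, 1000)
    })
    return conn
  }).catch(function (err) {
    console.error('connect failed, retrying:', err)
    return new Promise(function (resolve) {
      setTimeout(function () { resolve(connectWithRetry(url)) }, 1000)
    })
  })
}

connectWithRetry('amqp://localhost')
```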

That being said, you're correct that it won't really matter for us; we run our stuff dockerized on dedicated hardware and have more than enough RAM.

When it comes to API design, I think coworkers is by far the best library around for building AMQP workers. However, I think the tight coupling between clustering/isolation and the main framework detracts somewhat from the elegance of the design.

Apologies if I come off as rude or anything. I worked a bit on a project using coworkers this weekend and it was the most fun I've ever had writing an AMQP worker, so my intention here is just to offer some hopefully constructive feedback stemming from a slightly different use case.

tjmehta commented 8 years ago

Nice feedback @williamhogman :+1:

I definitely want to continue the discussion with you as you mentioned you have some good experience w/ RabbitMQ. Feel free to respond openly and honestly; I won't take it personally :)

Connection errors: I don't have extensive experience with RabbitMQ, but my impression is that if there is a connection or channel issue it could leave your application in a bad state. In the case that your application does get into a bad state, I think it is best practice to exit (and allow a scheduler or process manager to replace the process/node).
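
To make the "exit and let the scheduler respawn" idea concrete, the pattern I have in mind is roughly this (a sketch using amqp.node directly, not coworkers' internals):

```js
// sketch: fail fast on connection loss and let docker/systemd/pm2 respawn us
const amqp = require('amqplib')

amqp.connect('amqp://localhost').then(function (conn) {
  conn.on('error', function (err) {
    console.error('AMQP connection error:', err)
  })
  conn.on('close', function () {
    // rather than rebuilding channels/consumers in place,
    // exit and let the process manager start a fresh process
    process.exit(1)
  })
})
```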

Clustering: I actually agree with you that the clustering detracts from the elegance of coworkers. I bundled in the clustering for convenience, but I plan on breaking it out into a separate module in the long term. I considered adding the option to scale each worker independently, but it felt like overkill at this stage. I also figured anyone who wanted that granularity could set up their own process manager. So far, the clustering is a pretty opinionated/complex part of the project, so I am not very inclined to add that level of control yet (maybe after I break it out, in the future).

Thanks!