Import nsqd and nsqlookupd directly into qmd

pkieltyka commented 10 years ago

For simplicity, let's import (aka package) nsqd and nsqlookupd directly into qmd. This means that instead of having nsqd running as a separate process, the qmd project would import "github.com/bitly/nsqd" and start up the daemon itself. The same goes for nsqlookupd. This is now supported by nsqd as of 0.2.28 (see https://github.com/bitly/nsq/blob/master/ChangeLog.md).

Start with go get -u github.com/bitly/nsq...

We will start nsqadmin separately.
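
A minimal sketch of what "start up the daemon itself" could look like from the host process's point of view. It deliberately does not use the real nsqd package: `Daemon`, `NewDaemon`, `Start`, and `Stop` are hypothetical names that only illustrate the lifecycle (run in a goroutine, shut down together with the host).

```go
package main

import (
	"fmt"
	"sync"
)

// Daemon is a hypothetical stand-in for an embeddable service such as nsqd.
// It is NOT the real nsqd API; it only models the start/stop lifecycle.
type Daemon struct {
	name string
	stop chan struct{}
	wg   sync.WaitGroup
}

// NewDaemon creates a daemon that has not been started yet.
func NewDaemon(name string) *Daemon {
	return &Daemon{name: name, stop: make(chan struct{})}
}

// Start runs the daemon loop in a goroutine inside the host process.
func (d *Daemon) Start() {
	d.wg.Add(1)
	go func() {
		defer d.wg.Done()
		<-d.stop // a real daemon would accept connections until told to stop
	}()
}

// Stop signals the loop to exit and waits for it to finish.
func (d *Daemon) Stop() {
	close(d.stop)
	d.wg.Wait()
}

func main() {
	queue := NewDaemon("embedded-queue")
	queue.Start()
	fmt.Println("host started", queue.name)

	// The embedded daemon lives and dies with the host process.
	queue.Stop()
	fmt.Println("host stopped", queue.name)
}
```

The point of the sketch is the coupling: there is no separate process to supervise, but the daemon's lifetime is now exactly the host's lifetime.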

dkua commented 10 years ago

I am not sure this proposal would work very well with our current setup. There are some issues we'd have to think about. I'll start with nsqlookupd, then nsqd.

It is important to note that the nsqd and nsqlookupd daemons are designed to operate independently, without communication or coordination between siblings. -- From http://nsq.io/overview/design.html

dkua commented 10 years ago

nsqlookupd

The whole point of nsqlookupd is to act as a directory for Consumers (workers) to find and connect to Queues (nsqd nodes) that handle the Topics the Consumers are interested in. It lets the Consumers do their job without having to go through the complex process of trying to discover Queues.

Let's say we integrate nsqlookupd and nsqd into qmd. Then every time a qmd node goes online, it will have to try and discover every other qmd node. But through what mechanism? The original mechanism for a qmd node was to register itself with a nsqlookupd node and let that node show the way. But by integrating nsqlookupd into qmd we removed that mechanism and negated the benefits of nsqlookupd.

As a metaphor, it's like taking a phone book, giving every person in it a unique copy that contains only their own number, and telling them that if they want to know someone else's number they must find and ask that person.
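
The phone book argument can be sketched with a toy directory. This is only an illustration of the topology, not nsqlookupd's actual implementation or protocol: `Directory`, `Register`, `Lookup`, the topic name `jobs`, and the `qmd-a:5500` style addresses are all assumptions made up for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// Directory is a toy stand-in for a lookup service: topic -> queue addresses.
type Directory struct {
	producers map[string][]string
}

// NewDirectory returns an empty directory.
func NewDirectory() *Directory {
	return &Directory{producers: make(map[string][]string)}
}

// Register records that the queue at addr carries the given topic.
func (d *Directory) Register(topic, addr string) {
	d.producers[topic] = append(d.producers[topic], addr)
}

// Lookup returns a sorted copy of the queue addresses known for a topic.
func (d *Directory) Lookup(topic string) []string {
	addrs := append([]string(nil), d.producers[topic]...)
	sort.Strings(addrs)
	return addrs
}

func main() {
	nodes := []string{"qmd-a:5500", "qmd-b:5500", "qmd-c:5500"}

	// Shared directory: every node registers in the same place,
	// so a consumer sees all queues for the topic.
	shared := NewDirectory()
	for _, n := range nodes {
		shared.Register("jobs", n)
	}
	fmt.Println("shared directory sees", len(shared.Lookup("jobs")), "queues")

	// Embedded directories: each node registers only with its own copy,
	// so each copy knows about exactly one queue (its own).
	for _, n := range nodes {
		local := NewDirectory()
		local.Register("jobs", n)
		fmt.Println(n, "sees", len(local.Lookup("jobs")), "queue")
	}
}
```

With one directory per node, some other mechanism would still be needed to tell each node where the other directories are, which is the problem the shared directory was meant to solve.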

Integrating nsqlookupd is not something I can support.

dkua commented 10 years ago

nsqd

The recommended way of configuring nsqd is to place an nsqd node next to every Producer, a 1:1 correspondence. For qmd, the Producer is the server side, and the worker side is the Consumer. Every Producer (through an nsqd node) and every Consumer registers itself with an nsqlookupd node. This allows the Consumer and the Producer to discover one another by Topic, an n:n (many-to-many) correspondence.

Since the server side of qmd is the Producer, I can see the benefit of integrating the nsqd node into it. The major benefit being easy deployment/ops. However I have some qualms with integration.

We would essentially be reimplementing https://github.com/bitly/nsq/blob/master/apps/nsqd/nsqd.go inside the server side of qmd. This is mostly fine, except that we would then be required to understand the inner workings of NSQ. If bitly ends up making an API-breaking change, we would have to deal with that in qmd. Right now both the server and worker side depend on the client library (https://github.com/bitly/nsq/blob/master/apps/nsqd/nsqd.go) to communicate with NSQ. Barring API-breaking changes in the client library, we are not affected by most changes in the internal NSQ API. We would only be affected by changes in the NSQ protocol (http://nsq.io/clients/tcp_protocol_spec.html), which is a given and something we have to accept.
The nsqd node deals with everything to do with queueing. If we integrate, we lose some of that functionality. Mainly, whenever the qmd node goes down for whatever reason, so would the embedded nsqd node. All enqueued requests in that node would be lost. There would be no requeueing, no sending to an available worker, etc.

Unlike with nsqlookupd, I am not totally against integrating nsqd. However, I still don't think it's worth the increased technical debt and loss of functionality. Integration makes more sense if:

the server side was a separate entity from the worker, so that nsqd could be integrated with its Producer
we depended on NSQ purely as a message bus and not a task queue

pkieltyka commented 10 years ago

The idea is that since qmd is completely dependent on an nsqd instance, we embed the server layer right into qmd so it doesn't require an external service to operate. The embedded nsq queueing system would be strictly for use by qmd and other qmd servers. So it would also make sense to run the embedded nsqd on a different port. Say the default tcp port for nsqd is 4500 (? I forget), then make ours 5500. This would also remove the QueueAddr configuration parameter, making configuration easier.
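
A small sketch of the configuration idea, assuming the queue address is derived from a fixed port instead of being supplied by the user. `Config`, `ListenHost`, and `embeddedQueuePort` are hypothetical names, and 5500 is simply the example port from the comment, not a claim about nsqd's defaults.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// embeddedQueuePort is the example port from the discussion (5500).
// It is an assumption for illustration, not an nsqd default.
const embeddedQueuePort = 5500

// Config is a hypothetical qmd configuration with no QueueAddr field:
// the address of the embedded queue is derived rather than configured.
type Config struct {
	ListenHost string // optional; falls back to loopback when empty
}

// QueueAddr builds the embedded queue's host:port from the fixed port.
func (c Config) QueueAddr() string {
	host := c.ListenHost
	if host == "" {
		host = "127.0.0.1"
	}
	return net.JoinHostPort(host, strconv.Itoa(embeddedQueuePort))
}

func main() {
	cfg := Config{}
	fmt.Println(cfg.QueueAddr()) // prints "127.0.0.1:5500"
}
```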

The idea of embedding nsqlookupd is to make sure queue discovery still happens for qmd servers. Perhaps this can happen another way, but those are the intentions.

dkua commented 10 years ago

I understand the intentions for embedding these daemons into qmd. However, as I mentioned in my previous comments:

by coupling nsqd and qmd, we would lose the ability for an nsqd node to defer queued jobs to other qmd nodes if its host qmd node crashes
nsqlookupd is supposed to work as a centralized lookup directory for some subset of your nsqd nodes
- by coupling nsqlookupd and qmd, we would be negating the purpose of nsqlookupd
- instead of having 1 nsqlookupd node to k nsqd nodes (for k <= n, where n is the total number of nsqd nodes), we would have a 1:1 relation between nsqlookupd and nsqd/qmd
- this would force us to come up with another mechanism to find other nsqd nodes
- nsqd can only connect directly with one other node, either nsqlookupd or another nsqd

pkieltyka commented 10 years ago

Can nsqlookupd connect to other nsqlookupd nodes to join as a cluster? I'm pretty sure it can because I recall reading something about partitioning.

They should probably get rid of nsqlookupd and use consul instead which does distributed service discovery.

But anyway, for starters, just embed nsqd. We can always run nsqlookupd separately and that's just fine. It can run on the same server as nsqadmin.

In reply to David Kua's comment of Jun 18, 2014, 8:42 AM.

dkua commented 10 years ago

No, nsqlookupd is intended to be a hub for nodes to connect and not a connector itself. There aren't any exposed methods for registering itself with another nsqlookupd node: http://nsq.io/components/nsqlookupd.html. nsqlookupd is a satisfactory solution for something that existed two years before consul 0.1.0.

I don't think we should be trying to merge QMD and nsqd right now; maybe in a future version/iteration. That's not just because it introduces more code complexity and single points of failure, but because QMD works as it stands. I think we should be testing it, trying to find issues such as #5, and releasing it to production.

pkieltyka commented 10 years ago

I appreciate the feedback on what we should be doing "right now". So what have you been working on over the last two days?

In reply to David Kua's comment of Jun 18, 2014, 1:10 PM.

pkieltyka commented 10 years ago

Ah issue #5 .. just saw that.


dkua commented 10 years ago

Also closed issues #2 and #4

pkieltyka commented 10 years ago

@dkua Check this out.. https://github.com/bitly/nsq/issues/388

pressly / qmd

Import nsqd and nsqlookupd directly into qmd #3
