cascadium / OpenMAMA-zmq

OpenMAMA ZeroMQ Bridge

single endpoint for pub/sub #14

Closed: WallStProg closed this issue 7 years ago

WallStProg commented 7 years ago

Curious about the configurations for pgm/epgm -- specifically, why not use the same endpoint for both pub and sub?

I've done some testing locally with the following configuration in mama.properties, and it all seems to work fine:

mama.zmq.transport.pubsub_epgm.outgoing_url_0=epgm://127.0.0.1;239.192.1.1:5657
mama.zmq.transport.pubsub_epgm.incoming_url_0=epgm://127.0.0.1;239.192.1.1:5657

Then passing -tport pubsub_epgm to both publisher and subscriber.

I'm able to start multiple publishers and subscribers, and all subscribers get all messages from all publishers.

That setup seems much simpler, esp. for applications which are both publishers and subscribers.

So, I'm wondering if I'm missing something here? Are there reasons why one would not want to do this?

TIA...

fquinner commented 7 years ago

It's to separate subscription streams (typically low traffic) from publish streams (typically high traffic). Otherwise the publisher does an IGMP join on its own data (and on other publishers' data if the multicast group is shared), which ZeroMQ then has to chew through only to discard, hurting performance and queueing up subscriptions trying to get through. Both will work fine, though -- I've just always thought the separation was cleaner.
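
For illustration, the split described above might look something like this in mama.properties -- the transport names, multicast groups and ports are invented for the example rather than taken from the project docs. The idea is that each side only joins the group it actually needs to read from, so a publisher never receives its own data stream:

# publisher-side transport: data goes out on one group, low-traffic
# subscription/control traffic comes in on another
mama.zmq.transport.pub_epgm.outgoing_url_0=epgm://127.0.0.1;239.192.1.1:5657
mama.zmq.transport.pub_epgm.incoming_url_0=epgm://127.0.0.1;239.192.1.2:5658

# subscriber-side transport: the mirror image of the above
mama.zmq.transport.sub_epgm.outgoing_url_0=epgm://127.0.0.1;239.192.1.2:5658
mama.zmq.transport.sub_epgm.incoming_url_0=epgm://127.0.0.1;239.192.1.1:5657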

WallStProg commented 7 years ago

Makes perfect sense for typical market-data applications.

Apparently there used to be an option in 0MQ to disable the behavior where the sender receives its own messages, but it has been removed as of version 3 (https://raw.githubusercontent.com/zeromq/zeromq3-x/master/NEWS). In any event, that option only operated at the host level with PGM, so it would not really have helped -- it would need to work at the interface:port level to be potentially useful.

It also looks like 0MQ's PGM implementation has some issues with multiple senders on a single host, whether or not that option was enabled (https://lists.zeromq.org/pipermail/zeromq-dev/2012-August/017598.html).

Having said all that, the ability to support direction-less protocols (similar to RV) might come in handy for applications that are both producers and consumers of data.

Any additional hints or suggestions on OpenMAMA/0MQ with PGM would be much appreciated -- thanks!

fquinner commented 7 years ago

If you're actively consuming from the same multicast group, though, zmq is going to receive everything published onto that group (e.g. from other apps, or if the switch decides to bounce your egress messages back at you).

If volumes are low and occasional loss is acceptable, then yes, pub and sub sharing the same multicast group should be absolutely fine.

This might seem like a bit of an off-the-wall suggestion, but for most use cases I would give serious consideration to using zmq proxies. They're extremely lightweight.
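
To make that concrete, a standalone forwarder of the kind being suggested is only a few lines of libzmq C. A minimal sketch (the port numbers are made up, and this is not something shipped with the bridge): publishers connect PUB sockets to the XSUB side, subscribers connect SUB sockets to the XPUB side, and zmq_proxy() shuttles data and subscription requests between them.

/* Minimal sketch of a standalone XSUB/XPUB forwarder with tcp on both
   sides. The port numbers are placeholders chosen for this example.   */
#include <stdio.h>
#include <zmq.h>

int main(void)
{
    void *ctx      = zmq_ctx_new();
    void *frontend = zmq_socket(ctx, ZMQ_XSUB);  /* publishers connect their PUB sockets here  */
    void *backend  = zmq_socket(ctx, ZMQ_XPUB);  /* subscribers connect their SUB sockets here */

    if (zmq_bind(frontend, "tcp://*:5657") != 0 ||
        zmq_bind(backend,  "tcp://*:5658") != 0) {
        fprintf(stderr, "bind failed: %s\n", zmq_strerror(zmq_errno()));
        return 1;
    }

    /* forwards data and subscription requests in both directions;
       blocks until the context is terminated                       */
    zmq_proxy(frontend, backend, NULL);

    zmq_close(frontend);
    zmq_close(backend);
    zmq_ctx_destroy(ctx);
    return 0;
}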

Also, have you seen this? Lists out a few options.

http://fquinner.github.io/2015/12/05/openmama-and-zeromq-fanout/

I'm not sure pgm is actually that useful unless you have an extremely large number of clients, some of which might be badly behaved. In off-the-cuff testing I've done in the past, contrary to my expectations, it actually didn't perform as well as tcp at higher rates. That was on a laptop with loopback multicast enabled, though -- maybe proper kit would do better.

fquinner commented 7 years ago

Also, there are newer configuration-style options for direct fanout if you'd rather avoid an intermediary:

http://fquinner.github.io/2016/09/02/zeromq-1.0-rundown/

WallStProg commented 7 years ago

Yup, the proxy suggestion makes sense, and I had already thought of that. I'm guessing that RV does something similar with the rvd -- one side is IPC/TCP, the other multicast, so there is only (at most) one multicast sender/receiver on each host.

Even with a proxy architecture you'd need to deal with intra-host messages somehow -- either by receiving them via loopback, or internally routing messages to connected subscribers.

WallStProg commented 7 years ago

And yes, thanks -- I've been reading the blog too. Good stuff, and it has helped quite a bit in working through the different options for connecting publishers and subscribers.

The post you mention shows 1<->many and many<->1, but I'm trying to come up with a many<->many solution. The multicast approach is obviously simpler, but as you point out it has its own set of drawbacks. As much as possible I'd like to avoid any kind of proxy/broker in the middle because of the extra hop, but that's likely not practical.

And I'm not at all surprised that TCP performs better, at least until fan-out gets rather high. Of course, with TCP you have the problem of discovery, unless your configuration is relatively simple and/or static.

fquinner commented 7 years ago

When I mentioned a proxy I actually meant the zmq proxy, with tcp north and south. It's actually pretty effective for fanout because in zmq tcp the filtering happens on the publish side (these days), which keeps things light on the subscriber. On the fan-in side, the publishing application only publishes each message once, and only for requested topics (even though it's tcp), so it's light on the publisher too. The proxy obviously does a lot of work, but it's highly specialised and so should be highly cache-efficient.
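
Hooking OpenMAMA applications up to a proxy like that is then a matter of pointing both transport URLs at it over tcp. A rough sketch, assuming the bridge connects (rather than binds) plain tcp:// URLs by default; the host name and ports are placeholders matching the forwarder sketch earlier in the thread:

# every app, publisher or subscriber, uses the same transport definition:
# outgoing traffic goes to the proxy's publisher-facing (XSUB) port and
# incoming traffic comes from its subscriber-facing (XPUB) port
mama.zmq.transport.proxied_tcp.outgoing_url_0=tcp://proxy-host:5657
mama.zmq.transport.proxied_tcp.incoming_url_0=tcp://proxy-host:5658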

If you want to go down the pgm route, you should be able to specify as many pub and sub multicast groups as you like, so for a many-to-many multicast configuration you could have one group that is only for intra-host traffic, another for a particular venue split, and so on. As you said, though, it can become difficult to configure as you fan out, since zeromq has no built-in topic resolution / discovery. It should be able to do it, though.
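
As a sketch of that idea -- the indexed _0/_1 suffixes follow the pattern of the configuration earlier in the thread and assume the bridge picks up additional entries, and the groups, interface and ports are made up -- a transport could publish onto an intra-host group while consuming from both that group and a per-venue group:

mama.zmq.transport.manygroups.outgoing_url_0=epgm://127.0.0.1;239.192.1.10:5660
mama.zmq.transport.manygroups.incoming_url_0=epgm://127.0.0.1;239.192.1.10:5660
mama.zmq.transport.manygroups.incoming_url_1=epgm://eth0;239.192.2.20:5661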

WallStProg commented 7 years ago

Well, the proxy approach would certainly simplify things, but I see a couple of problems with it. First is the extra hop, which is bad for latency. Throughput is limited as well, assuming that the proxy is single-threaded (and if it's not, how does it maintain ordering?). A bigger problem is that the proxy becomes a single point of failure.

There are ways to mitigate both problems by sharding (which increases complexity), but that only mitigates them -- it doesn't eliminate them.

fquinner commented 7 years ago

Yeah, it's hard to say which is best without fully understanding your use case, message rates, consumer types, latency requirements and so on, but the good news is that the zmq bridge gives you many options :).

fquinner commented 7 years ago

Closing this off for tidy-up -- I think this is as far as we can go on this; it really comes down to the options ZMQ itself can offer, cross-referenced with the use case.