Problem : zproto codec+server doesn't handle multiframe routing_ids

jemc commented 10 years ago

Currently, the codec and server code appear to be written on the assumption that routing uses only a single frame:
https://github.com/zeromq/zproto/blob/master/src/zproto_codec_c.gsl#L874
https://github.com/zeromq/zproto/blob/master/src/zproto_codec_c.gsl#L317
https://github.com/zeromq/zproto/blob/master/src/zproto_server_c.gsl#L257

This means that clients must be connected directly to the server, and can't be behind any multiplexing proxy layers. Is this an acceptable limitation, or do we need to revise our strategy to account for this usage?

If we do not accept this limitation, I can see two broad categories of approach for solutions:

1. Rewrite the server and codec structures to store the routing info as a zlist of zframes instead of a single zframe.
2. Rewrite the recv end of the codec to serialize multiple routing zframes into a single zframe for storage, and rewrite the send end of the codec to deserialize that single zframe back into multiple zframes to use as the routing for the outgoing message.
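
A minimal sketch of the serialization in the second option, written in plain C without CZMQ so that it stands alone. The `frame_t` type and the `routing_pack` / `routing_unpack` names are hypothetical, and the one-byte length prefix assumes each routing frame is between 1 and 255 bytes long. This is not the zproto codec, only one possible encoding:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

//  Hypothetical stand-in for a routing frame: a borrowed byte range.
typedef struct {
    const uint8_t *data;
    size_t size;
} frame_t;

//  Pack `count` routing frames into `out` as [len][bytes] records.
//  Returns the number of bytes written, or -1 if `out` is too small.
int
routing_pack (const frame_t *frames, size_t count, uint8_t *out, size_t out_size)
{
    size_t used = 0;
    for (size_t i = 0; i < count; i++) {
        //  Assumption: every routing frame is 1..255 bytes long
        assert (frames [i].size > 0 && frames [i].size <= 255);
        if (used + 1 + frames [i].size > out_size)
            return -1;
        out [used++] = (uint8_t) frames [i].size;
        memcpy (out + used, frames [i].data, frames [i].size);
        used += frames [i].size;
    }
    return (int) used;
}

//  Unpack a buffer produced by routing_pack into at most `max` frames
//  that point into `in`. Returns the frame count, or -1 if malformed.
int
routing_unpack (const uint8_t *in, size_t in_size, frame_t *frames, size_t max)
{
    size_t pos = 0;
    size_t count = 0;
    while (pos < in_size) {
        size_t len = in [pos++];
        if (len == 0 || len > in_size - pos || count == max)
            return -1;
        frames [count].data = in + pos;
        frames [count].size = len;
        pos += len;
        count++;
    }
    return (int) count;
}
```

The packed buffer could then be stored wherever the codec currently keeps its single routing frame, and unpacked again on the send path.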

If there is another approach that is more appropriate, let's discuss it!

hintjens commented 10 years ago

First, I'd have to ask what the problem is that you're solving using multiplexing proxies.

The protocols that we implement have zero frames of routing. They are pure client/server protocols. Look e.g. at https://github.com/Malamute/malamute-core/blob/master/src/mlm_msg.bnf.

So mainly, yes, clients talk directly to servers and there is no protocol-level concept of proxying. Thus, no implementation of that in the generated code either.

In any real design there is proxying happening, however it's internal to the server and hidden from the clients. This is also my preference: that a protocol act as a single contract for a single layer. I do not like, and do not use, the multi level routing envelopes from REQ-REP for that reason (they are layer violators).

Anyhow, please provide me the problem statement, and we can discuss solutions. Thanks.

jemc commented 10 years ago

@hintjens

Sure, I'm happy to discuss it. To be clear, I'm also open to accepting the answer that my use case is outside the scope of those targeted by zproto. If this is true, I am happy to write my own modifications to the zproto codec gsl template to essentially take the second option I cited above so that I can make a codec API compatible with the existing single-frame server template and proceed. However, my first inclination if I'm going to do this extra work is to try to figure out if I can share it back to zproto for others to benefit from and contribute to; that is my motivation for moving in this way rather than privately working toward a solution.

To get to the point regarding my usage of proxies:

We have an architecture where many client actors live within a single application on a single machine, and there are one or more of these machines in a system. There are also one or more servers that any given application-bearing machine might connect to. On each of these machines, it is in our interest to be able to change the server that the application is connected to, while making sure that the client actors in a client application are all connected to the same server at any given time (or no server at all, when they are in the middle of a change). It is easier to maintain certain guarantees about consistency this way.

We determined in our current design that the easiest way was to do this through a single proxy per machine, which handles the disconnect and reconnect to a different server. We designed all of our layers to handle arbitrary levels of proxying, looking for the delimiter frame to place the routing frames instead of assuming a certain number of them. We were/are under the impression that this was along the lines of ZMQ best practices.
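
As a rough illustration of "looking for the delimiter frame", the sketch below scans the frame sizes of a multipart message for the first empty frame. It assumes the peers put an empty delimiter frame between routing and body, as REQ sockets do; `find_delimiter` is a hypothetical helper, not part of CZMQ or zproto:

```c
#include <assert.h>
#include <stddef.h>

//  Given the sizes of the frames in a multipart message, return the index
//  of the first empty (delimiter) frame, or -1 if there is none.
//  Frames before that index are routing; frames after it are the body.
int
find_delimiter (const size_t *frame_sizes, size_t count)
{
    assert (frame_sizes != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        if (frame_sizes [i] == 0)
            return (int) i;
    return -1;
}
```

With frame sizes {5, 5, 0, 12}, for example, the delimiter is at index 2, so the message carries two routing frames and one body frame.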

There may be other ways to do the switchover for all clients, like sending a synchronous command to each actor to disconnect and waiting for all of those to complete before sending the command to each actor to connect to the new choice of servers, but even then, some actors may have received data that others did not receive before disconnecting. By making the disconnect a single operation, we avoid those kinds of problems.

Again, I am content to rewrite the codec generator for our own project to account for this use, but if there is a way to generalize my solution as part of zproto so others can benefit and contribute, I'd like to do so.

hintjens commented 10 years ago

OK, thanks for explaining the problem.

Here's my proposal for a solution:

You want to add a routing envelope that can hold zero or more frames. This could go at the start or end or middle of your commands, it doesn't really matter. For consistency, the start is easiest.

And you want to encode/decode this envelope as we do for other data types. There is no need to use zeromq message frames, rather you can use a list of frames and serialize as we do other objects, into the current frame.

Then the proxies can read the message, add/remove addresses, and route the message.

Does that sound right?

Basically, add a new type, "envelope", and decide on an efficient binary encoding that matches the general style of the protocols we make.
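
One way such an envelope could behave, sketched under assumptions: addresses are stored as [len][bytes] records inside a single field, a proxy pushes its peer's address on the way in and pops it on the way back. The `envelope_t` type, its fixed capacity, and the function names are all hypothetical; zproto defines no such type:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ENVELOPE_MAX 1024

//  Hypothetical envelope field: a stack of addresses stored as
//  [len][bytes] records, most recently pushed address first.
typedef struct {
    uint8_t bytes [ENVELOPE_MAX];
    size_t size;
} envelope_t;

//  Push an address onto the front of the envelope. Returns 0 on success,
//  or -1 if the address is empty, longer than 255 bytes, or does not fit.
int
envelope_push (envelope_t *self, const uint8_t *addr, size_t len)
{
    assert (self && addr);
    if (len == 0 || len > 255 || self->size + 1 + len > ENVELOPE_MAX)
        return -1;
    //  Shift the existing records back to make room at the front
    memmove (self->bytes + 1 + len, self->bytes, self->size);
    self->bytes [0] = (uint8_t) len;
    memcpy (self->bytes + 1, addr, len);
    self->size += 1 + len;
    return 0;
}

//  Pop the front address into `addr`, which must hold at least 255 bytes.
//  Returns the address length, or -1 if the envelope is empty or malformed.
int
envelope_pop (envelope_t *self, uint8_t *addr)
{
    assert (self && addr);
    if (self->size == 0)
        return -1;
    size_t len = self->bytes [0];
    if (len == 0 || 1 + len > self->size)
        return -1;
    memcpy (addr, self->bytes + 1, len);
    self->size -= 1 + len;
    memmove (self->bytes, self->bytes + 1 + len, self->size);
    return (int) len;
}
```

The trade-off is that every proxy has to decode and re-encode the message, which a plain frame-forwarding proxy does not do.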

-Pieter


jemc commented 10 years ago

I think I'd prefer to keep using "dumb" proxy layers (simple zmq_proxy) if I can, rather than have the proxies unpack and repack or tack on bytes to an envelope frame. It sounds like this is what you're suggesting.

Also, under that regime - if envelope is just the type of a user field specified in the model - then it sounds like it won't be used by the generated server to differentiate clients that come from the same near layer of proxy to spin up a separate s_client_t for each. And if this is the case, I lose a lot of the benefit of the generated server because I have to manage my own client state machines at that point.
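
To make the concern concrete, here is a small sketch that assumes the server looks up per-client state by a string key derived from the routing bytes. If the key covers only the nearest hop, every client behind the same proxy collapses into one entry; if it covers the whole packed path, each client gets its own key. The `routing_key` helper and the [len][bytes] path layout are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

//  Write the uppercase hex form of `size` routing bytes into `key`,
//  which must have room for 2 * size + 1 characters.
void
routing_key (const uint8_t *packed, size_t size, char *key)
{
    static const char hex [] = "0123456789ABCDEF";
    assert (packed && key);
    for (size_t i = 0; i < size; i++) {
        key [2 * i] = hex [packed [i] >> 4];
        key [2 * i + 1] = hex [packed [i] & 15];
    }
    key [2 * size] = '\0';
}
```

For two paths that share the proxy hop 0xAA but differ in the far hop, packed as {1, 0xAA, 1, 0x01} and {1, 0xAA, 1, 0x02}, the full-path keys differ in their last character, while a key built from the first two bytes alone would be identical for both.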

Please correct me if I misunderstood your proposal.

I think I'm better off hacking the logic that pulls the my_msg_t::routing_id out of the incoming zmsg_t, so that it takes all of the frames up to the delimiter and binary-encodes them into a single zframe to use as the routing_id. If you're not comfortable with this kind of a change, then I think I'll just do it on the project level. As long as I keep the API for my_msg_routing_id () and my_msg_set_routing_id () intact, I should still be able to use the generated zproto server to do my bidding, and this is still a net win for me.

Either way, thanks for the dialogue, and thanks for all your work on the zeromq family of projects. Especially czmq, whose clean interfaces and succinct usage have me excited to program in C again!

hintjens commented 10 years ago

Glad you enjoy CZMQ. It is fun to write C like this...

For the routing, it seems to me that you're mixing up different layers and particularly what goes on the wire vs. what the ROUTER socket gives you (that routing id is not part of the wire protocol).

So, my advice is to simply fork and hack zproto_codec and experiment until you get working models. You can also fork the server codec as you like. The GSL code is very easy to change.

And once you have a clear design, we can see whether it's generally useful or not. There's a little cost to maintaining your own generators, though really not that much. In fact I'd recommend this as an exercise anyhow; it'll teach you how GSL works and you'll enjoy using that in other cases.


jemc commented 10 years ago

I'll go ahead and close this until I've got it working how I want it in my private project and want to present something back to the group. Thanks for the dialogue.

hintjens commented 10 years ago

OK, anytime!

