jpaulm / jsfbp

FBP implementation written using JavaScript and node-fibers
MIT License

Back pressure #25

Open kenhkan opened 9 years ago

kenhkan commented 9 years ago

@tlrobinson @ComFreek

I've been trying to tackle back pressure in my fork, and I must say I'm hitting a dead-end and it'd be nice to hear your thoughts on this.

Take a look at this gist. Replace the content of recvr.js on @ComFreek's fork and run fbptest01.js. You should see something similar to this. The timing doesn't matter, but (correct me if I'm wrong please @jpaulm) the copier should stop after the 5th copy is sent, because of the capacity of 5 set on the connection between copier and recvr.

I know that @ComFreek might not have gotten to that point in his fork. Me neither. But the issue is that it seems to be rather difficult in JS (at least with ES5 as that was my thesis).

With generators, the control flow is inverted here, so there doesn't seem to be an obvious way to have the callee of yield suspend, or continue to suspend, the caller (in this case, the FBP process). In this particular case, yield is handed a bluebird promise, but Promise.coroutine() resumes the generator as soon as receive settles, which it must do. I haven't put too much thought into this problem in a streams-based solution, but I suspect that it's a similar situation.

Any idea on this problem?

ComFreek commented 9 years ago

Streams do support backpressure (with highWaterMark) — although I didn't get it to work (= the data just flows) and I currently don't have much time to investigate further. For this to work we have to introduce some kind of function whose call blocks the component. Using Promise.coroutine() and yield would look as follows:

// a simple copier
var ip;
while ((ip = yield inport.receive()) !== null) {
  yield outport.send(ip);
}

My implementation of OutputPort#sendBlocking looks as follows:

OutputPort.prototype.sendBlocking = function(ip) {
  var self = this;

  function write(_ip, resolve) {
    var ok = self.conn.write(_ip);
    if (ok) {
      resolve();
    }
    else {
      self.conn.once('drain', function () {
        write(_ip, resolve);
      });
    }
  }

  return new Promise(function (resolve, reject) {
    write(ip, resolve);
  });
};

The call to self.conn.write() tries to write data to the ProcessConnection. Ideally, the ProcessConnection returns false in case the limit of unread IPs (given by the so-called 'capacity') is reached. The ProcessConnection should also send a 'drain' event in case an IP has been read, so that an IP can be pushed.
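A minimal sketch of a connection that honours the write()/'drain' contract described above. The class shape, the `queue` field and the `read()` method are assumptions for illustration, not the actual jsfbp code. Note that, unlike Node's Writable streams (where write() returning false still buffers the chunk), this write() rejects the IP when full, which is what the retry in sendBlocking relies on.

```javascript
'use strict';
const { EventEmitter } = require('events');

// Hypothetical bounded connection; names are illustrative only.
class ProcessConnection extends EventEmitter {
  constructor(capacity) {
    super();
    this.capacity = capacity; // maximum number of unread IPs
    this.queue = [];          // IPs written but not yet read
  }

  // Accepts the IP and returns true if there is room. Otherwise the IP is
  // NOT stored and false is returned, so the caller must retry on 'drain'.
  write(ip) {
    if (this.queue.length >= this.capacity) {
      return false;
    }
    this.queue.push(ip);
    return true;
  }

  // Removes and returns the oldest IP (null if empty), then signals that
  // one slot has been freed.
  read() {
    if (this.queue.length === 0) {
      return null;
    }
    const ip = this.queue.shift();
    this.emit('drain');
    return ip;
  }
}
```

With a capacity of 5, the sixth write() returns false, and the sender's 'drain' listener fires only once the receiver has read an IP.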

The problem is that we introduce yet another ugly 'yield' statement. 'yield' makes perfect sense when reading, but I don't like its appearance when sending.

tlrobinson commented 9 years ago

In cFBP is back-pressure considered "advisory" (i.e. to prevent processes from being overwhelmed with IPs), or is strict/immediate back-pressure feedback necessary for correctness?

I ask because my implementation using Node streams (https://github.com/tlrobinson/sbp discussed in #14) currently supports back-pressure, but suffers from the problem you mentioned in another issue where it takes ~N writes for back pressure to be communicated up an N-deep pipeline of streams (https://github.com/dominictarr/pull-stream#transparent-backpressure--laziness). If a few extra IPs aren't a problem this isn't a big deal, otherwise it is.

tlrobinson commented 9 years ago

Also, I don't consider requiring yield in front of sends to be a big deal. In fact it's kind of a nice symmetry. Though it would be good if we could somehow detect if a component accidentally leaves off a yield in front of a send.
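One possible way to detect a forgotten yield, sketched under assumptions: the send returns a thenable, and the coroutine runner subscribes to it through `then` when the component yields it. `trackYield` and `wasObserved` are hypothetical names, and the check is only a timing heuristic.

```javascript
'use strict';

// Hypothetical helper: wraps the promise returned by a send and records
// whether anything subscribed to it via then().
function trackYield(promise, onMissing) {
  let observed = false;
  const tracked = {
    then(onFulfilled, onRejected) {
      observed = true;
      return promise.then(onFulfilled, onRejected);
    },
    wasObserved() {
      return observed;
    }
  };
  // setImmediate callbacks run after the current synchronous code and any
  // pending promise jobs, so a runner that was handed this thenable has
  // normally subscribed by the time this check runs.
  setImmediate(function () {
    if (!observed) {
      onMissing();
    }
  });
  return tracked;
}
```

A component that stores the result and yields it in a later turn would be flagged incorrectly, so this is better suited to a development-time warning than to a hard error.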

kenhkan commented 9 years ago

@ComFreek Your implementation makes a lot of sense! Thanks for the pointers. I was trying to get more ideas as to how this could possibly be done. And I agree with @tlrobinson that the yield isn't out of place at all, but yea, I also think that an uninitiated component designer might leave it off and have no idea what's going on.

@tlrobinson To my knowledge back-pressure is pretty important and not advisory. It's one of the main issues @jpaulm has with noflo. There are scenarios where you need strict back-pressure. An example would be a loop-type network, where you have interdependence between two or more processes. Not sure how a stream would behave in that setup though as I haven't tried using streams that way.

tlrobinson commented 9 years ago

@kenhkan If that's the case then "pull" type streams are the only way to connect ports, right?

How are processes distributed across OS processes or physical machines/networks? Does a "receive" essentially send a blocking request (RPC?) for the next IP to the upstream process?

Are there any example/documented network protocols for cFBP implementations? I think that would help clarify it a bit for me.

kenhkan commented 9 years ago

@tlrobinson I'm not sure actually, as I'm in the exploratory phase as well. All I know is that what Paul envisions as "push" is like a process "pushing" control flow to its neighbors when IPs are moved, be it upstream or downstream. Say, A -> B -> C, the connection between A and B is full, and the connection between B and C is empty. When B has processed an IP, A would unblock, so it's "pull" here. Though, the reverse is also true, C would block because of an empty upstream; once B sends the IP forward, C would unblock. So now it's "push". I think though the two terms are heavily overloaded. Is this what you were asking?

Distributed FBP, I believe, would be some kind of TCP-like setup. Matt, who worked on a Tcl implementation of FBP, runs each process as a separate OS-level process, communicating via TCP sockets. Blocking is achieved with SYNs and ACKs. The upside of his setup is of course instant distributability, given that it's using the internet protocol already.

For documentation on any pre-existing network protocol, @jpaulm would be the best person to answer.

tlrobinson commented 9 years ago

By "pull" I mean the downstream processes need to explicitly tell the upstream processes whenever they can accept another IP, e.g. "ok now you can give me a single IP" vs. "ok now you can give me IPs until I tell you to stop".

Or perhaps if a connection has a capacity >1 it can say "ok now you can give me up to X number of IPs, until I tell you that you can give me more" (similar to TCP windows).

Either way would require explicit acks of some kind, to continually tell upstream processes how many IPs they're allowed to send. That's why I was curious about any existing FBP network protocols.
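A rough sketch of the window idea, assuming a single sender and receiver: the receiver grants credits and the sender spends one per IP. `CreditLink` and its methods are made-up names, and the network transport is left out entirely.

```javascript
'use strict';

// Hypothetical credit-based link; in a real protocol grant() would be an
// explicit message travelling upstream.
class CreditLink {
  constructor() {
    this.credits = 0;   // IPs the sender is still allowed to send
    this.inFlight = []; // IPs sent but not yet consumed downstream
  }

  // Receiver side: "you can give me up to n more IPs".
  grant(n) {
    this.credits += n;
  }

  // Sender side: sends nothing and returns false when out of credit.
  trySend(ip) {
    if (this.credits === 0) {
      return false;
    }
    this.credits -= 1;
    this.inFlight.push(ip);
    return true;
  }

  // Receiver side: consume one IP and hand one credit back upstream.
  receive() {
    if (this.inFlight.length === 0) {
      return null;
    }
    const ip = this.inFlight.shift();
    this.grant(1);
    return ip;
  }
}
```

Granting a single credit at a time gives the "one IP per request" variant, while an initial grant equal to the connection capacity gives the window variant.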

kenhkan commented 9 years ago

Got it. That makes sense. It sounds like your second approach is more fundamentally sound. And yes, I'm pretty sure for FBP semantics to work you'd need explicit acks.

@jpaulm Any insight on a network protocol for FBP?

jpaulm commented 9 years ago

FWIW, almost 50 years ago, we discovered that acks and naks are not adequate for full duplex lines! I guess that's well-known now!

My Socket implementations (not Web Sockets) on JavaFBP and C#FBP are basically one-way, except that they wait for a response after every 20 messages. Would this help?
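To make the "response after every 20 messages" idea concrete, here is a small sketch of a sender that stops after a batch until a response arrives. It is not taken from JavaFBP or C#FBP; `BatchAckSender`, `transmit` and `onAck` are hypothetical names, and only the interval of 20 comes from the comment above.

```javascript
'use strict';

// Hypothetical sender that allows at most `interval` unacknowledged messages.
class BatchAckSender {
  constructor(transmit, interval = 20) {
    this.transmit = transmit; // function that puts one message on the wire
    this.interval = interval; // messages allowed between responses
    this.unacked = 0;         // messages sent since the last response
  }

  // Sends the message and returns true, or returns false without sending
  // if a response is still outstanding for the current batch.
  send(msg) {
    if (this.unacked >= this.interval) {
      return false;
    }
    this.transmit(msg);
    this.unacked += 1;
    return true;
  }

  // Called when the peer's response for the batch arrives.
  onAck() {
    this.unacked = 0;
  }
}
```

The traffic stays one-way except for one response per batch, at the cost of letting up to 20 messages run ahead of the receiver.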

Lastly I believe @jonnor has done the most work on protocols...

Regards,

Paul

kenhkan commented 9 years ago

Is FBP duplex at all? When you say A -> B, B really has no way and no need to communicate with A except to tell A that it can no longer accept more IPs (and of course to tell it that the port is already taken, but that's at initiation time). So wouldn't acks/naks work?

jpaulm commented 9 years ago

No, I was thinking of a network where some of the links are running on full duplex long distance lines. Of course the links themselves are one-directional.

Also I'm not totally comfortable with the terminology "B tells A": B doesn't know A exists, or vice versa!