spdy-http2 / node-spdy

SPDY server on Node.js
2.81k stars 196 forks source link

Bug in spdy-transport/agent code causes writing of frames after receiving GOAWAY #291

Closed anandsuresh closed 7 years ago

anandsuresh commented 7 years ago

We're seeing a more than a few of the following exceptions when using node-spdy with Apple's APNS service over HTTP/2.

Cannot read property 'writeQueueSize' of null

TypeError: Cannot read property 'writeQueueSize' of null
    at Socket._writeGeneric (net.js:703:32)
    at Socket._write (net.js:716:8)
    at doWrite (_stream_writable.js:300:12)
    at clearBuffer (_stream_writable.js:407:7)
    at Socket.Writable.uncork (_stream_writable.js:237:7)
    at ClientRequest._flushOutput (_http_outgoing.js:673:10)
    at ClientRequest.OutgoingMessage._flush (_http_outgoing.js:647:16)
    at _http_client.js:156:10
    at callSocketMethod (_http_client.js:552:7)
    at ClientRequest.onSocket (_http_client.js:557:7)

I have a feeling this is related to the remote server sending back a GOAWAY frame, and the local server attempts to create a new stream/request on the same agent and gets hosed. I'm looking into it now, but wanted to get some input from the community if anyone else is seeing this issue?

anandsuresh commented 7 years ago

Digging into the code, I came across something that might be responsible for this crash. This case is a little hard to reproduce with a test and it only manifests at scale, so I'm going to explain the sequence of events that might lead to the error described in this issue.

Please bear with me... this is a little long, but I've included links to code in the description to make it a little easier to follow.

Agent.prototype._createSocket() calls Handle.prototype.assignClientRequest() (here), which monkey-patches OutgoingMessage.prototype._send() (here) to emit needStream (here), which is being listened for in Agent.prototype._createSocket() (here), which in-turn creates a new stream and passes it to Handle.prototype.setStream() (here), which emits the newly-created stream as an argument to the stream event (here).

However, just before Handle.prototype.assignClientRequest() monkey-patches the OutgoingMessage.prototype._send() method, it has the following code:

    // It is very important to leave this here, otherwise it will be executed
    // on a next tick, after `_send` will perform write
    self.getStream(function(stream) {
      stream.send();
    });

If I'm reading the code correctly, then it seems that the intent is to have the above event-handler call stream.send() to write out a HEADERS frame for the new outgoing request, before the original req._send() method is restored and executed (here). Furthermore, the call to the monkey-patched req._send() from node-core code in net.js (here) causes the above sequence to execute synchronously on the same tick, which I expect is intended by design.

To recap for clarity, when net.js calls the monkey-patched req._send() method, it causes the emission of the needStream event, whose handler in Agent.prototype._createSocket() creates a new stream and passes it to handle.setStream(), which in-turn emits the newly created stream as an argument to the stream event on the Handle, whose handler (described in the code above) is setup through Handle.prototype.getStream() and awaiting that event to fire, in order to write out the HEADERS frame for the outgoing request.

While this works fine in the normal case, consider the case when the remote server has sent a GOAWAY frame, that the local Connection instance has processed and set state.goaway to the frame id specified by the GOAWAY frame.

With the GOAWAY state now setup to end the connection in the near future, a call to the needStream handler in Agent.prototype._createSocket() would end up calling the Connection.prototype.reserveStream() method (here), which queues up an error to fire off on the next tick, but still returns the newly created stream to maintain the API expectation. This newly created stream will propagate as described above and will cause the Stream to execute stream.send(), which will write out a HEADERS frame for the outgoing request!

Stream.prototype.send = function send(callback) {
  var state = this._spdyState;

  if (state.sent) {
    var err = new Error('Stream was already sent');
    process.nextTick(function() {
      if (callback)
        callback(err);
    });
    return;
  }

  // ==============================================
  // CHECK IF THE REMOTE SERVER HAS TOLD US TO GOAWAY!!!
  if (this.connection._isGoaway(this.id))
    return callback();
  // ==============================================

  state.sent = true;
  state.timeout.reset();

I am considering adding the above check to the Stream.prototype.send() method, but I'm not sure if that would be sufficient to prevent the susequent call to req._send() here.

Perhaps you have a better idea on how to deal with this.

avaer commented 7 years ago

This started happening with me on recent versions of Chrome/Blink. Originally I thought it was a Chromium bug, but it might be down to node-spdy.

anandsuresh commented 7 years ago

@modulesio Can you test with https://github.com/anandsuresh/node-spdy/tree/fix/agentGoAway to see if the problem gets fixed?

avaer commented 7 years ago

☝️ Yes, that seems to do the trick 👏.

On that PR I get correct loading of the page on the latest Chrome (Version 54.0.2840.87 (64-bit)); without it loading just completely aborts.

anandsuresh commented 7 years ago

@modulesio Excellent!