nodejs / CTC

Node.js Core Technical Committee & Collaborators

HTTP/2 support in Core #6

Closed jasnell closed 8 years ago

jasnell commented 8 years ago

We've had an on-again, off-again discussion for some time about getting http/2 support into core. While http/2 has its challenges, use is growing and there are real benefits to using it. I think it's worth moving forward on.

In a brief conversation with @indutny regarding node-spdy (his spdy and http2 implementation), he indicated that node-spdy is not yet fully compliant with the spec or sufficiently optimized, and would need additional work. It could still give us a good place to start in figuring out what would need to be done to get http/2 support in core.

There are a few key concerns that we would need to look at (here are a few off the top of my head):

  1. API: while http/2 maintains the existing protocol semantics of http/1 in terms of requests, responses, headers, etc, there are a number of key differences that make it challenging as far as API support goes: specifically, multiplexed streams and push streams would need to be figured out.
  2. Stateful header compression. http/1 is fully stateless. http/2, unfortunately, is not. There is a stateful header compression algorithm (hpack) that is used to compact headers. This compression state is held and kept synchronized by both sides of the connection (separate state for each direction). The recipient determines how much data can be stored while the sender determines what gets stored. The key challenge for Node.js, of course, is the additional memory and processing requirement this introduces for each connection. Specifically, full processing of all header frames and compression state is required, even if the stream is being rejected or ignored. Failure to do so would put the connection into an inconsistent state and would require that the connection be torn down.
  3. http/2 requires that, as much as possible, peers multiplex requests over a single connection. Currently in node, when we receive a single malformed request, we simply drop the connection without further processing (which is fairly unfriendly behavior on our part and likely should be fixed). With http/2, requests arrive in a stream, which means we would RST_STREAM a rejected request rather than shutting down the connection. For well-behaved clients, this is all fine and good, but does present some challenges when dealing with denial-of-service attack scenarios. We'll have to investigate how some of the other server implementations deal with this and will need to come up with a strategy for detecting and dealing with misbehaving clients. Tearing down the connection is still an option but could be a bit more expensive.
  4. While http/2 does not strictly require the use of TLS, Chrome, Firefox and other browsers require TLS when using http/2. ALPN is used as the primary upgrade/negotiation mechanism for determining that http/2 will be used. This presents a bit of a challenge given that, despite having https support in core, Node.js never has been the most performant way of doing TLS termination. We'll have to pay close attention to the performance impact of this and find ways of making it faster.
  5. http/2 uses binary frames to implement its multiplexed communication. These are extensible. As part of the API considerations, we will need to determine if we want to support the ability for userland to implement extension frames or if we want to lock that down so that only core can introduce extensions.
  6. http/2 includes its own flow control and stream priority mechanisms that are mandatory to implement. These are independent of TCP's own flow control and priority mechanisms. This presents a challenge only because it is additional work and state that Node.js would be required to maintain per-connection and per-stream within a connection. For priority, we would need to determine the API for specifying the priority of a group of streams.
  7. Despite being a fundamentally different kind of protocol than http/1, http/2 uses ports 80 and 443 by default and relies on either HTTP Upgrade or ALPN for protocol negotiation. When establishing a TLS connection, ALPN can be used to indicate that http/2 will be used for communication; this is the ideal bootstrap path but brings along its own challenges. When not using TLS, a fake HTTP/1.1 request is sent with an Upgrade header that requests switching to http/2. This is similar to the mechanism used to upgrade a connection to Web Sockets. The intent is that a connection can be either http/2 or http/1 depending on what the client requests (e.g. we could have a single server that speaks either on a per-connection basis). We would need to decide if we want to support the ability to create a single http.Server instance that supports both http/1 and http/2, or if we want to restrict it to one or the other.
  8. On the client side, we would need to determine the API for things like pushed streams. A server can only push a stream in response to an initial client request, but the end result is that there can be N responses for any single request. These responses take the form of separate streams (they are essentially responses to assumed requests), so on the client side they would need to be handled just like regular responses, except that there would be no corresponding request already. I imagine a new 'push' event on a ClientRequest object would make the most sense here, but we would need to work through the API details.
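
To make the header compression concern in point 2 more concrete, here is a minimal sketch of an hpack-style dynamic table. It is not taken from node-spdy or any other implementation mentioned in this thread; the `DynamicTable` class and the `x-token` / `x-trace` headers are hypothetical names used only for illustration. The entry size rule (name length + value length + 32) and the first dynamic index (62) follow the hpack spec (RFC 7541).

```javascript
'use strict';

// Per RFC 7541, an entry's size is name length + value length + 32 octets.
const ENTRY_OVERHEAD = 32;
// The static table occupies indices 1..61, so dynamic entries start at 62.
const FIRST_DYNAMIC_INDEX = 62;

// Hypothetical illustration class, not a real Node.js or nghttp2 API.
class DynamicTable {
  constructor(maxSize) {
    this.maxSize = maxSize; // upper bound chosen by the receiving peer
    this.size = 0;
    this.entries = [];      // newest entry first
  }

  // Applied for every header the sender chooses to index.
  add(name, value) {
    const entrySize =
      Buffer.byteLength(name) + Buffer.byteLength(value) + ENTRY_OVERHEAD;
    // Evict oldest entries until the new entry fits (or the table is empty).
    this.evictUntil(this.maxSize - entrySize);
    if (entrySize <= this.maxSize) {
      this.entries.unshift({ name, value, size: entrySize });
      this.size += entrySize;
    }
  }

  evictUntil(limit) {
    while (this.size > limit && this.entries.length > 0) {
      this.size -= this.entries.pop().size;
    }
  }

  get(index) {
    return this.entries[index - FIRST_DYNAMIC_INDEX];
  }
}

const sender = new DynamicTable(4096);
const receiver = new DynamicTable(4096);

// Request 1: both sides process the header block.
sender.add('x-token', 'abc');
receiver.add('x-token', 'abc');

// Request 2: the sender indexes another header, but suppose the receiver
// rejected the stream and skipped decoding its header block.
sender.add('x-trace', '123');

// Index 62 now refers to different headers on each side.
console.log(sender.get(62).name, receiver.get(62).name);
```

Once the two tables diverge like this, every later indexed reference decodes to the wrong header, which is why a header block has to be fully processed even for a stream that is being rejected.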

Ultimately, I believe this is something that needs to be implemented and I plan to start work on this in the very near future (matter of weeks), beginning with a rough outline of the new API. What I hope to do is begin introducing experimental support for http/2 in the same basic way that we've handled async_wrap -- that is, slowly introducing the elements into core as unsupported experimental bits. It will take some time to get right.


Implementations: https://github.com/http2/http2-spec/wiki/Implementations

benjamingr commented 8 years ago

@jasnell

Ultimately, I believe this is something that needs to be implemented and I plan to start work on this in the very near future (matter of weeks), beginning with a rough outline of the new API.

First of all - thank you. This sounds like a huge undertaking. I think it would be great if you built it in a way that lets others work on it simultaneously and review the code. A single "whole shebang" pull request could be really risky; a more iterative model would be nicer.

jasnell commented 8 years ago

@benjamingr ... all work I do here will be done openly and I would gladly accept help from any and all :-)

jasnell commented 8 years ago

Btw, fwiw, the most likely path that I'll be exploring first on this will be to integrate nghttp2 (https://nghttp2.org/documentation/index.html) into Node.js (https://nghttp2.org, https://github.com/nghttp2/nghttp2, https://github.com/nghttp2/nghttp2/blob/master/COPYING). It is written in C, uses a callback model, does not insist on doing its own I/O (so it can be integrated with libuv easily), and is MIT licensed. It is also one of the most compliant library implementations available.
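
As a rough illustration of what "callback model, no I/O of its own" means in practice, here is a small JavaScript sketch. It is not nghttp2's API: `FrameParser`, `feed`, and `onFrame` are hypothetical names. The only protocol fact it relies on is the fixed 9-octet http/2 frame header (24-bit length, 8-bit type, 8-bit flags, one reserved bit, 31-bit stream identifier) from RFC 7540.

```javascript
'use strict';

const FRAME_HEADER_LENGTH = 9; // fixed by the http/2 spec (RFC 7540, section 4.1)

// Hypothetical parser: it never touches a socket. The caller feeds it bytes
// and receives complete frames through a callback.
class FrameParser {
  constructor(onFrame) {
    this.onFrame = onFrame;
    this.buffered = Buffer.alloc(0);
  }

  feed(chunk) {
    this.buffered = Buffer.concat([this.buffered, chunk]);
    while (this.buffered.length >= FRAME_HEADER_LENGTH) {
      const length = this.buffered.readUIntBE(0, 3); // 24-bit payload length
      const total = FRAME_HEADER_LENGTH + length;
      if (this.buffered.length < total) break;       // wait for more bytes
      this.onFrame({
        length,
        type: this.buffered.readUInt8(3),
        flags: this.buffered.readUInt8(4),
        // Mask off the reserved high bit to get the 31-bit stream identifier.
        streamId: this.buffered.readUInt32BE(5) & 0x7fffffff,
        payload: this.buffered.subarray(FRAME_HEADER_LENGTH, total),
      });
      this.buffered = this.buffered.subarray(total);
    }
  }
}

const frames = [];
const parser = new FrameParser((frame) => frames.push(frame));

// A PING frame (type 0x6) on stream 0 with an 8-octet payload,
// delivered in two chunks as it might arrive from a socket.
const ping = Buffer.from([0, 0, 8, 6, 0, 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8]);
parser.feed(ping.subarray(0, 5));
parser.feed(ping.subarray(5));

console.log(frames.length, frames[0].type, frames[0].streamId);
```

Because the parser only sees buffers, whoever owns the socket (libuv in core, or a `socket.on('data', (chunk) => parser.feed(chunk))` handler in JavaScript) stays in charge of reads and writes.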

Fishrock123 commented 8 years ago

I'm curious what the API would look like, but I'm less convinced that it should be in core than I used to be, simply because it will be a massive overhead on an already difficult to maintain core module.

The HTTP module should probably be re-written before/alongside of this to keep it from being a huge future disaster...

jasnell commented 8 years ago

How do you figure the overhead would be "massive" or the implementation a "disaster"? Have you taken the time to look into what is required for an implementation?

The API can be quite similar but the internals would share very little code. In fact, after having spent the past week working on an impl that works in core, the http/2 impl should be far less complicated using a vendored lib like nghttp2. It actually works quite well, though there are quite a few little details to work through. I've yet to find something that doesn't work.

indutny commented 8 years ago

uv_link_t!! 😉

We should just use it. It is:

I guess we may even have a common C interface for both HTTP/1.1 and HTTP/2.0!

jasnell commented 8 years ago

Quick update on what I've been able to get done this week on this...

I elected to start with the excellent and very well tested nghttp2 library to provide the bulk of the http/2 implementation details. This library provides a straightforward C API for handling all of the http/2 details. The library does not do its own I/O, which is a good thing. I have the start of a src/node_http2.cc that exposes a process.binding('http2'). Currently, this is essentially a thin wrapper for the nghttp2 API but that could evolve.

In a new internal module (internal/http2), I define a Http2Server class that extends from tls.Server. ALPN and NPN are set to request h2 and hc (two commonly used http/2 identifiers). Once the TLS connection is established, the data received on the tls.Socket is passed into the underlying nghttp2 session (which is represented in js-land by a Http2Session object exposed by the process.binding('http2') binding). nghttp2 does all of the heavy lifting via a number of callbacks. The model is not unlike the way the existing http-parser and http module work together. Data is passed back from the nghttp2 session to the tls.Socket via callbacks.
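
For readers unfamiliar with ALPN-based selection, the sketch below shows the general idea using the public tls module rather than the internal prototype code. It assumes the `ALPNProtocols` option of `tls.createServer` and the `alpnProtocol` property of `tls.TLSSocket` as they exist in current Node.js; `selectHandler`, `createDualServer`, and the `handlers` object are hypothetical names, and the handler functions are placeholders for whatever would drive an http/2 session or an http/1 parser.

```javascript
'use strict';
const tls = require('tls');

// Protocol identifiers offered during the TLS handshake, in preference order.
const OFFERED = ['h2', 'http/1.1'];

// Pure helper: pick the handler that should own a connection, given the
// protocol negotiated via ALPN (false when no protocol was selected).
function selectHandler(alpnProtocol, handlers) {
  if (alpnProtocol === 'h2') return handlers.http2;
  return handlers.http1; // 'http/1.1', or no ALPN at all
}

// Hypothetical wiring: one TLS server that dispatches per connection.
// tlsOptions is expected to carry the usual key and cert.
function createDualServer(tlsOptions, handlers) {
  return tls.createServer(
    { ...tlsOptions, ALPNProtocols: OFFERED },
    (socket) => {
      // The listener runs after the handshake, so alpnProtocol is settled.
      selectHandler(socket.alpnProtocol, handlers)(socket);
    }
  );
}
```

Keeping the selection in a pure function makes the per-connection choice easy to test without a certificate or a live handshake.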

I am in the process of implementing Http2Request and Http2Response classes that implement the same basic API as the existing http module but all of the internals are different. In fact, so far, the http2 module shares only one bit of code with http, and that's simply to validate that the status code is valid. There will be some differences in the API based on the fundamental differences between http/1 and http/2, but for the most part the APIs will be very similar.

While I have only just scratched the surface of the implementation, I do have the following simple test case working using both Chrome and nghttp2's own command line client:

'use strict';
const http2 = require('http2');
const fs = require('fs');

// TLS key and certificate for the server
const options = {
  key: fs.readFileSync('/Users/james/tmp/http2-key.pem'),
  cert: fs.readFileSync('/Users/james/tmp/http2-cert.pem')
};

const server = http2.createServer(options, (req, res) => {
  // echo the request data
  res.statusCode = 200;
  // only mirror the content-type when the client actually sent one
  if (req.headers['content-type']) {
    res.setHeader('content-type', req.headers['content-type']);
  }
  req.pipe(res);
});

server.listen(8000);

As you can see, the basic API is essentially identical to that of the existing http module.

Once I'm a bit further along in the prototype implementation, I will be opening a node-eps that proposes adding the nghttp2 implementation behind a compile-time flag, with the implementation being marked as experimental in v7.

So far, the implementation is actually quite a bit simpler than the existing http module and we would be able to refactor the existing http module without having any impact on the http2 implementation -- in fact, it would be best to simply treat them as unrelated modules, particularly given the fundamental differences between the two protocols.

The current implementation work is quite rough as my goal right now is to simply have something that works then iterate towards something better. But if you'd like to follow along with what I've been doing, the dev branch is here: https://github.com/jasnell/node/tree/http2

jasnell commented 8 years ago

@dougwilson @nodejs/http ... I would love to get input / perspective from the ecosystem of HTTP-related module / framework developers on what would be required (if anything) from Core with regards to HTTP/2 support.

Trott commented 8 years ago

I think this probably doesn't need to be in the CTC-specific repo anymore. There's an issue in the NG repo and @jasnell is working on an implementation. Seems like further discussion could probably happen in the main node repo if necessary. Closing. Feel free to re-open if you disagree.