nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org

Adding Websocket support to core #19308

Closed MylesBorins closed 4 months ago

MylesBorins commented 6 years ago

The original thread where adding this was discussed, #1010, was closed with a decision by the io.js TC to instead implement lower-level buffer methods, but that work was abandoned.

There is an open EPS to add the feature, but we have since abandoned the process.

Some of the people who originally were -1 changed their opinions in #1010 more recently. In fact, we already ship a partial implementation of ws in the inspector.

I think it might be worth us revisiting adding WS to core.

/cc @eugeneo @rauchg

eugeneo commented 6 years ago

Re: Inspector.

  1. The inspector WS implementation is not complete; e.g. there is no support for binary frames.
  2. The inspector would still need a C++ implementation that can run on a separate thread, since both JS execution and the main libuv loop are suspended when the application hits a breakpoint.
devsnek commented 6 years ago

i'm not wholly against this but i would like to factor in how much stuff we put into all our release binaries. if we can come up with more creative ways for shipping stuff like this i'm totally a +1 (#19307)

mscdex commented 6 years ago

I am still -1 on this.

devsnek commented 6 years ago

@mscdex can you be explicit in your reasoning?

mscdex commented 6 years ago

@devsnek for the same reasons I gave in the linked issue.

lpinca commented 6 years ago

As discussed in #1010 and as a maintainer of ws I'm +1 on adding WebSocket to core.

MylesBorins commented 6 years ago

@mscdex in #1010

-1 WebSockets is still something better suited for userland IMHO. Sure it's something that many people do use, but there are also many other widely used standardized protocols/APIs that could be argued as "core" to the "web" that are not currently included in node/io.js core. For example: multipart parsing/generation, Server-sent events, HTTP/2, ICMP, SSH/SFTP, FTP, SOCKS, VNC/RFB, SMTP/IMAP/POP3, SOAP, Web Workers (as an API), XHR/XHR2 (as an API), etc.

Since that original post we've added http2. While you listed a bunch of protocols, not all of them are supported natively by the browser.

https://developer.mozilla.org/en-US/docs/Web/API/WebSockets_API

https://caniuse.com/#search=Websockets

mscdex commented 6 years ago

@MylesBorins I don't think node should aim to become a (DOM-less) browser. Anyway, my "vote" and reasoning still stands.

targos commented 6 years ago

It's more about communicating with browsers, not becoming one.

bnoordhuis commented 6 years ago

http2 is an argument against being too eager to absorb protocols into core. Neither its API nor its implementation are all that great; it would have benefited from iterating outside core for a while.

I guess you could construe that as an argument in favor of websockets: third-party modules have existed for years and their APIs and implementations have pretty much crystallized by now.

brianleroux commented 6 years ago

This is a great idea. While Node is already an indispensable tool for web developers, it does not enjoy a full seat at the table of web browser tech advancement, despite being held completely captive by it. Few agree on controversial edicts like Promises and ES modules, but we can all agree these transition moments could have been handled better with Node as a fully active participant instead of a recipient of these challenges.

Node has a big opportunity to become a full-fledged user agent (web browser), and first-class support for web features will be a part of that. +1!

jasnell commented 6 years ago

http2 is an argument against being too eager to absorb protocols into core. Neither its API nor its implementation are all that great;

PRs welcome.

Re: websockets

I'm still -1 for the time being. This is something that has been done quite well by userland and there are still unanswered open standards questions about http2+ws that require more thought and experimentation.

watson commented 6 years ago

When http2 was added, I think one of the considerations was that we could do a lot of low-level stuff in C++ land that was hard or even impossible(?) to do efficiently in userland. Is there a similar reason for wanting to bring WS into core, or is it just to have more features?

MylesBorins commented 6 years ago

@jasnell can you point me towards the open standards discussion regarding h2 + ws?

jasnell commented 6 years ago

The IETF httpbis working group is working on this: https://tools.ietf.org/html/draft-ietf-httpbis-h2-websockets-00

It's still super early in the process, though there is some early implementation happening.

kof commented 6 years ago

Ideally there should be a clear general framework for deciding whether something belongs in core or not. I would consider points like:

  1. is it hard to be done right?
  2. is it widely used?
  3. does it have a clear spec?
  4. is it likely to become obsolete in the near future?
  5. does it need optimizations from the core?
  6. will the entire community benefit from it being part of the core?

addaleax commented 6 years ago

is it hard to be done right?

@kof I fully agree with all your points except this one – there is no reason to believe that code in Node core is better-written or better-maintained than userland code in general.

I think a better question in its place would be “Is it hard to do efficiently without native addons?”.

kof commented 6 years ago

@addaleax The idea behind that point is: if it is in userland, it is likely there will be multiple competing implementations, which will result in attention not being fully focused on a single code base and, as a result, lower quality, just because fewer people are working on it or using it. I have no data to back this up.

Userland is good when a feature needs experimentation.

lpinca commented 6 years ago

@kof

is it hard to be done right?

No.

is it widely used?

Yes it is.

does it have a clear spec?

Yes.

is it likely to become obsolete in the near future?

No.

does it need optimizations from the core?

Not necessarily, but optimizations in core may help. In ws we have two optional binary addons: one for the masking/unmasking of frames (bufferutil) and one for UTF-8 validation (utf-8-validate). (A sketch of what bufferutil does follows at the end of this comment.)

will the entire community benefit from it being part of the core?

Yes, I think the community will benefit from WebSocket in core.

There are a lot of userland implementations. The most popular are ws, faye-websocket, and websocket. Combining the best parts of each of them to create a single module in core would be great, imho.
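As context for the bufferutil point above, here is a minimal sketch of RFC 6455 frame masking/unmasking in plain JavaScript; this is the operation bufferutil implements natively:

```js
// Masking and unmasking are the same operation: XOR every payload byte
// with the 4-byte masking key, cycling through the key.
function mask(payload, maskingKey) {
  const out = Buffer.allocUnsafe(payload.length);
  for (let i = 0; i < payload.length; i++) {
    out[i] = payload[i] ^ maskingKey[i & 3];
  }
  return out;
}
```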

mcollina commented 6 years ago

+1, mainly because @lpinca has been doing a great job in maintaining ws the last few years.

MylesBorins commented 6 years ago

pinging @jcoglan and @theturtle32

watson commented 6 years ago

So far I haven't seen any concrete comments on why implementing WS in core would be better than having it in userland. I'm sure @lpinca has done a great job maintaining ws, but that hardly qualifies as a reason to bring it into core, right?

I just think there need to be concrete examples of how core can provide a better WS implementation than userland 😃

devsnek commented 6 years ago

i don't think there's anyone saying we would do a better job. i think the argument is better integration, and since we can ship it with a native backing, users will get a perf boost without having to install anything.

watson commented 6 years ago

@devsnek Could you provide an example of how the integration would be better if WS was in core?

Regarding native backing, N-API and pre-compiled binaries should remove the requirement for users to compile stuff (if I understood your argument correctly).

TimothyGu commented 6 years ago

It seems the main argument for adding ws to Core is to reduce binary dependencies. Since both native dependencies are buffer-based, I think they may be good candidates for adoption of WebAssembly. @lpinca has that been seen as a possibility?

MylesBorins commented 6 years ago

For an API, I'd like to propose the WHATWG WebSocket API, implementing as much of the standard as we can (I know there will be some inconsistencies around events).

If we run into any serious problems with the API design we can attempt to work with the WHATWG to update the standard. It seems like @TimothyGu has participated in this document along with @domenic.
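For reference, the browser-style API being proposed looks roughly like this; how Node would surface it, and the exact event semantics, are precisely what would need discussion:

```js
const ws = new WebSocket('wss://example.com/socket');

ws.addEventListener('open', () => {
  ws.send('hello');
});

ws.addEventListener('message', (event) => {
  console.log(event.data); // string or binary data, depending on ws.binaryType
});

ws.addEventListener('close', (event) => {
  console.log(event.code, event.reason);
});
```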

re: http/2 + ws, if we get an implementation going we can participate in trying to solve this problem!

will the entire community benefit from it being part of the core?

I think implementing standards in core, especially ones implemented in the web platform, can help the ecosystem focus on creating better abstractions on top of a protocol like WS, and avoid having to implement the base protocol themselves. If the various ecosystem modules share a kernel, it allows collaboration across projects with a shared interest, and potentially brings more people to help maintain core itself.

lpinca commented 6 years ago

@watson userland implementations work, and I'm a big supporter of a small core, but HTTP(S) and many other modules are in core because they are so popular that it made sense to create a core module for them. The same is valid for WebSocket in my opinion. Quoting @domenic from https://github.com/nodejs/NG/issues/10#issue-59860208:

I think the conclusion of this line of thought leads us to a process wherein we say "yes, core is interested in supporting feature X." Then, someone---maybe an io.js collaborator, or maybe not!---goes off and builds a prototype of feature X as a user-land module. Along the way, they might need to ask for more low-level APIs from core, and those will get rolled in. But over time, this external implementation of feature X matures, and io.js collaborators start commenting on it, and the ecosystem starts using it, until we decide: yes, we should roll this in to core, and start shipping with it.

Is core interested in supporting WebSocket? That's an open question. I think it should.

@TimothyGu we didn't explore that possibility but that's definitely a good idea. We already have prebuilt and N-API based binaries for native addons. Removing binary dependencies is not the reason why I would like to see WebSocket in core, though. The question is: why are HTTP and HTTP/2 in core and WebSocket isn't? They are all in the same league, and they could all live in userland if wanted. On top of that, a good part of the WebSocket implementation is already in core, as the upgrade mechanism is already available and supported.

@MylesBorins If we ever decide to experiment with WebSocket in core, the API needs to be discussed. I think we want to support piping data around, so a websocket should be implemented as a Duplex stream, like faye-websocket does. I guess we also want to support fragmented messages and permessage-deflate. In that case we should augment the standard send() with the ability to specify whether a message is the last one in a fragmented sequence and whether it should be compressed.
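To illustrate (the option names below are hypothetical, not an existing API), an augmented send() could accept per-message options along these lines:

```js
// Hypothetical sketch only: per-message options for fragmentation and compression.
ws.send(firstChunk, { fin: false, compress: true }); // not the final fragment
ws.send(lastChunk, { fin: true, compress: true });   // final fragment of the message
```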

M3kH commented 6 years ago

I can't cast a vote on this, although it would be nice to have.

I've found that WebSocket implementations aren't that solid in the browser either.

It is CPU-intensive, and so far I haven't seen it deployed at a really big scale where I would have guessed it would be used; e.g. Facebook, Twitter, and Gmail rather rely on polling.

I guess this is mainly down to the cost created on the server side, it being really hard to scale (e.g. load balancing). I'm not sure Node wants to commit its capacity to supporting it.

lpinca commented 6 years ago

I would argue that it's actually easier to scale WebSocket than HTTP polling, and it's way more CPU efficient. The protocol was designed to solve the HTTP polling problems, see https://tools.ietf.org/html/rfc6455#section-1.1. With WebSocket you have a persistent TCP connection, no HTTP header overhead, and seamless integration with the cluster module (no need for sticky sessions).
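A minimal sketch of that cluster point, assuming the userland ws package: each accepted connection stays on the worker that accepted it for its whole lifetime, so no sticky-session logic is needed.

```js
const cluster = require('cluster');
const http = require('http');
const os = require('os');
const WebSocket = require('ws');

if (cluster.isMaster) {
  for (let i = 0; i < os.cpus().length; i++) cluster.fork();
} else {
  const server = http.createServer();
  const wss = new WebSocket.Server({ server });

  wss.on('connection', (socket) => {
    // Every frame from this client arrives on this worker for the life of the TCP connection.
    socket.on('message', (msg) => socket.send(`echo from ${process.pid}: ${msg}`));
  });

  server.listen(8080); // the cluster module distributes incoming connections across workers
}
```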

kof commented 6 years ago

@lpinca what happens on frequent reconnects if you don't have sticky sessions?

lpinca commented 6 years ago

@kof with WebSocket it doesn't matter: as long as the TCP connection is established, all messages are sent over the same connection, so they all hit the same server. Sticky sessions are required for HTTP polling.

kof commented 6 years ago

@lpinca By frequent reconnects I meant reconnects at the TCP layer (when the connection is dead for real).

lpinca commented 6 years ago

@kof I'm not sure I understand. The websocket is closed along with the underlying TCP connection, and a new one is created.

kof commented 6 years ago

Isn't it similar to long polling from a performance perspective, on bad physical connections at scale?

lpinca commented 6 years ago

Yes, but ideally the majority of TCP connections are stable no? And even in the worst case scenario it's still better than long polling.

jasnell commented 6 years ago

Given the discussion I'm coming around on this, but what I'd like to see is a process similar to what we followed with N-API and http2... specifically, work on the new module being done and proven out in a separate fork repo before work being committed here in the main repo. Doing so allows greater flexibility in experimentation and implementation without disrupting anything else happening in core. It also allows implementation progress to be made in advance of a full commitment to "landing" it as a feature.

theturtle32 commented 6 years ago

I'm mostly indifferent to whether WebSockets should be in core at this point, but I lean toward the opinion that it should not. I don't feel it's anywhere near as widely used or fundamental as HTTP is, and it would end up just being more for the core team to have to maintain. That being said, my WebSocket implementation hasn't required much maintenance since becoming stable.

Honestly, I kinda feel like it doesn't really need to move into core. As it stands, the community benefits from multiple competing stable implementations, without there being so many that it's difficult to pick one. Each implementation has its own particular API that fits better for different coding styles or use cases. If WebSocket were implemented in core, it would instantly become the One True Implementation and the One True API, and I'm not sure that's especially beneficial. Alternatively, it may try to be all things to all people by supporting multiple kinds of APIs, and that would create more surface area to maintain as well as more planning and design work up front. It may end up being less good than what we already have now, at least until enough work has been put into it to evolve it into something stable. And all that work would essentially be just re-inventing the wheel.

Core optimizations to support WebSocket implementations might be nice. Or a new optimized C++ implementation that does the majority of its processing work off the main thread might be interesting (WebSocket lends itself perfectly to an entirely async API anyway), but I'm not sure even that kind of project would need to live in core.

lpinca commented 6 years ago

And all that work would essentially be just re-inventing the wheel.

The same is true for all of us working on competing userland implementations. We are writing the same parser, the same extension parser, the same frame builder, the same extensions, the same < insert WebSocket detail here > instead of solving the same problems together.

daynin commented 6 years ago

I totally agree with @lpinca. Node is already a kind of core for the libraries that provide WS, so it should bring the maintainers of these libs together.

Each implementation has its own particular API that fits better for different coding styles or use cases

Fragmentation of implementations is not good for any standard. Look at browsers: they all have different implementations of JS, CSS, etc., and it's a pain for developers and users.

goloroden commented 6 years ago

-1 for the very same reasons as given by @mscdex. I think that having added http2 support was already wrong in this respect.

Apart from that, it's hard to get the feature set right. Should it support reconnection management? wss? Middleware?

This quickly becomes too much, but if you don't do it, people will still stick to userland solutions.

lpinca commented 6 years ago

Should it support reconnection management?

No, not a core business.

wss?

Yes, the only difference here is using the https module instead of http. This is transparent to the implementation; it's exactly the same code (see the sketch at the end of this comment).

Middleware?

No, this is not a framework. It's not Socket.IO, SockJS, or Primus. The aim would be to add a building block for those frameworks: a low-level module, like http, which is basically used by all userland modules that need to work with that protocol.
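A sketch of the wss point above: only the server construction changes (the certificate paths here are hypothetical); the upgrade/WebSocket handling is identical.

```js
const https = require('https');
const fs = require('fs');

// Swap http.createServer() for https.createServer() with TLS options; nothing else changes.
const server = https.createServer({
  key: fs.readFileSync('key.pem'),   // hypothetical paths
  cert: fs.readFileSync('cert.pem')
});

// server.on('upgrade', ...): same handler you would attach to a plain http.Server
server.listen(443);
```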

goloroden commented 6 years ago

Exactly. That's my point: if you need the userland modules anyway, then why introduce it in core?

lpinca commented 6 years ago

I'll answer with another question: if you need express or hapi, why have http in core? 😄

goloroden commented 6 years ago

TBH, I don't know (and seeing that projects such as https://github.com/mafintosh/turbo-http are being started shows that it doesn't even need to be in core).

My understanding was always that core should only include things that are so vital to Node.js itself that they can't be done in userland.

lpinca commented 6 years ago

So you basically agree with https://github.com/nodejs/node/issues/1010#issuecomment-85207951. Yes, that's a valid point.

goloroden commented 6 years ago

Right.

watson commented 6 years ago

As I see it, there are two ways to look at whether something should be in core:

  1. It can be done better/more performant in core than outside of core
  2. It's a widely used module/technology

Some people in this thread use point no. 1 as a criterion while others use point no. 2. As long as we don't agree on the premise of when something should be included in core, we won't have a constructive debate.

I think this is a very important distinction and something that isn't related to WS at all. Before we deal with whether or not WS should be in core, we need to come to a consensus on this topic first.

jcoglan commented 6 years ago

Going to drop some thoughts here since @MylesBorins tagged me in. For context, I maintain the Faye collection of packages: faye, faye-websocket, websocket-driver, websocket-extensions and permessage-deflate. The Faye project dates back to Node v0.1. I also maintain Ruby versions of these packages (websocket-driver is the WebSocket implementation in Rails), which is an interesting source of comparison and has influenced the design of the Node versions. I don't really have an opinion on whether WebSocket should be in core, because I don't maintain core, but I do have some thoughts that might be useful if you do decide to include it.

I can talk about what's good and bad about working with WS in Node and then wrap up with some thoughts on API. First, good things.

A single HTTP server. Ruby has a dozen competing I/O and concurrency frameworks and associated web servers, which makes targeting any networking abstraction at a large user base difficult. This is the reason websocket-driver exists; it separates the WebSocket protocol from the I/O implementation. Node putting a good HTTP server/client in core means it has not generated the same competitive ecosystem in this space. This has ups and downs, but one up is definitely that it enables more stuff to be built on top of it and be widely useful; like Promises, it's a case of a standard being in core enabling interop in the rest of the ecosystem.

HTTPS/TLS. The TLS abstractions in Node are much easier to use than what I've seen in other ecosystems, and their API is similar enough to their plaintext counterparts that deciding whether you want a plaintext or encrypted connection nicely separates from any other concerns, at least in terms of API surfaces.

Streams. The abstraction that allows websocket-driver and faye-websocket to be piped into other streams without having to know how the destination works is really valuable. In order to contend with all the I/O implementations in Ruby we had to essentially invent a read/write API for ourselves, and it's not without problems. I've had conversations with maintainers of Ruby servers about how we should handle back-pressure and I essentially told them to do what Node does.

Buffers. Having a data type that specifically represents a block of bytes that is distinct from strings is a huge ergonomic win.

Performance. The Node implementation of websocket-driver performs well enough that I've not been tempted to use native code. In Ruby, masking is enough overhead that I wrote that in C. I've also tried porting the entire parser to C but the performance win doesn't seem worth the huge added security risk.

Now, things that could be better. I'm totally prepared to admit some of this is down to my own lack of knowledge; I don't have a huge amount of time to dedicate to this work so I might be holding a few things wrong. Also a bunch of this was developed years ago when I knew less.

HTTP parsing. It's really weird to me that there's an HTTP parser in core but it's private and the advice is to use a third-party module that copies its API. If we put protocols in core, being able to use their parsers but bring our own I/O control is a huge win. websocket-driver used the core HTTP parser until recently and I'm not sure why it couldn't be made public. Maybe there's something in the http module I could use to send a request and be given a parsed response object and the still-open TCP connection. But in any case there's other complexity to contend with once you add proxies to the mix, and I was more comfortable setting up a TCP socket myself and then doing all the HTTP on top of that.

UTF-8. Validating UTF-8 is a bit tricky to do; I'm resorting to converting a buffer to a binary string and using a regex. It would be nice if there was a streaming UTF-8 validator/decoder in core. Perhaps there is already a way of doing this with encoded streams? The thing is, implementing UTF-8 is not actually hard (if all you want is "is this a valid UTF-8 byte sequence", ignoring all sorts of unicode semantics), I've done it in other places, but it didn't seem worth the effort here. The regex probably performs better than anything I could do in pure JS anyway. Ruby has a baked-in method for this; String#valid_encoding?.
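A sketch of the regex approach described above (the classic byte-pattern check: it only verifies that the bytes form well-formed UTF-8 sequences, with no further Unicode semantics):

```js
// latin1/'binary' decoding maps each byte to one character, so the regex sees raw bytes.
const VALID_UTF8 = /^(?:[\x00-\x7F]|[\xC2-\xDF][\x80-\xBF]|\xE0[\xA0-\xBF][\x80-\xBF]|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}|\xED[\x80-\x9F][\x80-\xBF]|\xF0[\x90-\xBF][\x80-\xBF]{2}|[\xF1-\xF3][\x80-\xBF]{3}|\xF4[\x80-\x8F][\x80-\xBF]{2})*$/;

function isValidUTF8(buffer) {
  return VALID_UTF8.test(buffer.toString('binary'));
}
```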

Streams2. I never adopted the Node v0.10 streams API. At first this was because I was supporting older releases, but now it's because I never figured out how to make streams2 do what I want. In particular, it didn't seem as easy using the basic read()/write() APIs to maintain message boundaries (i.e. separate messages should show up as distinct data events), and to emit strings and buffers (which we use for text and binary frames) on the same stream. I've tried to figure this out a few times and given up. The current implementation uses old-style streams. It's extremely basic and probably wrong, but nobody has ever filed an issue about it.

Concurrent processing. There is a whole abstraction inside websocket-extensions that deals with asynchronous processing in extensions, i.e. passing messages through transforms that might be async, in two directions, maintaining message order and closing extensions when all pending messages have passed through them. This was a huge headache to build and it's effectively a single reduce() call in Ruby. I'm not sure what abstractions should exist to make this easier; it's possible that Promise and async functions might ease the pain. This is also more of a JS problem than a Node one, but building this made me keenly aware that JS still has a way to go to make concurrent processing easier.
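A deliberately simplified sketch of how Promises might ease that ordering problem: chain each message's (possibly async) transform onto the previous one, so results are delivered in arrival order.

```js
let tail = Promise.resolve();

// transform(message) may be synchronous or return a Promise. Chaining onto `tail`
// guarantees deliver() runs in arrival order; as a simplification, transforms here
// run one at a time rather than concurrently.
function process(message, transform, deliver) {
  tail = tail.then(() => transform(message)).then(deliver);
  return tail;
}
```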

The upgrade event. Separating some HTTP requests off like this is annoying and makes it harder to build composable middleware or add WebSocket support into existing stacks. It's usually easier to stand up a separate server that knows how to do WebSocket and have that be different from your app server. This isn't a massive problem because there are also operational reasons for doing this, but having the option of making composition easier would be nice.
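For readers unfamiliar with that hand-off, its usual shape (roughly following the faye-websocket README) is:

```js
const http = require('http');
const WebSocket = require('faye-websocket');

const server = http.createServer((request, response) => {
  response.end('normal HTTP requests land here');
});

// WebSocket handshakes bypass the request listener and arrive on 'upgrade' instead,
// which is what makes composing this with ordinary middleware stacks awkward.
server.on('upgrade', (request, socket, body) => {
  if (WebSocket.isWebSocket(request)) {
    let ws = new WebSocket(request, socket, body);
    ws.on('message', (event) => ws.send(event.data)); // echo
    ws.on('close', () => { ws = null; });
  }
});

server.listen(8080);
```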

Those are things that could be improved to make implementing WebSocket easier. If WebSocket were to go in core, here's a few things I think are worth bearing in mind. I'm not arguing you should adopt my code or API here, just listing some user needs/benefits that an implementation should support.

First, the API. Starting with the WhatWG spec is good. In Faye, the integration code for WebSocket is almost identical on the client and server because faye-websocket implements the same API as the browser. However, it's not enough. On the server, you want control of what extensions are in use, how they are configured, which protocol versions you support, sending ping/pong frames, additional handshake headers, using cookies with the client, using proxies (tunnel-agent helps here), and other things. The API needs to be expanded to allow all that.

Second, it would be good to separate the protocol from the I/O layer, as websocket-driver has done. I initially thought that design would be more relevant in Ruby due to the divergence in I/O libraries, but many modules in npm depend directly on websocket-driver and not on a higher-level package. I think this will be especially useful as HTTP/2 becomes more important, letting people pipe the WebSocket protocol into whatever stream they can construct over an H/2 connection.

It is tricky to say where that separation should lie. I've recently been asked to ship something that avoids doing the HTTP handshake on the client; the user wants to do that themselves and then hand the TCP socket over to websocket-driver for it to take over. However, the driver needs information from the handshake phase such as which version was negotiated, which extensions are enabled with what parameters, etc, and the thing making the handshake request needs to know what extension settings the driver supports. Some versions of WebSocket make the handshake leak into the request body as well, which makes things harder to separate. But if you're OK with letting websocket-driver send the HTTP handshake, you can pipe it into any connection you like.

That brings me to my final point, which is extensions. In websocket-extensions, I attempted to build something that would allow WebSocket libraries to interoperate with extensions in a pluggable way. Most users of websocket-driver use it indirectly, via faye-websocket, faye, or one of the hundreds of modules built on top of them. Rather than every one of those intermediate packages providing an interface for configuring extensions, websocket-extensions means extensions can be passed into a WebSocket stack as first-class objects that carry their own config. To configure permessage-deflate, you only need to know the interface to that module, and then you can pass an instance of it into anything built on top of websocket-extensions. This means other extensions could be built and added to existing stacks without every dependency between your app and the WebSocket library having to add support for it. They only need an interface for passing extensions in and they don't need to care what the extension is or how it works.
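Roughly, following the permessage-deflate and faye-websocket READMEs (the exact constructor signature is from memory, so treat it as approximate), the pattern looks like this:

```js
const http = require('http');
const WebSocket = require('faye-websocket');
const deflate = require('permessage-deflate');

const server = http.createServer();

// The extension carries its own configuration and is passed in as a first-class object;
// nothing between the app and the WebSocket library needs to know what it is.
server.on('upgrade', (request, socket, body) => {
  if (WebSocket.isWebSocket(request)) {
    const ws = new WebSocket(request, socket, body, [], {
      extensions: [deflate.configure({ level: 7 })]
    });
    ws.on('message', (event) => ws.send(event.data));
  }
});

server.listen(8081);
```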

This also has an important effect on the dep graph. Something that uses faye-websocket and permessage-deflate has this graph:

.
├─┬ faye-websocket
│ └─┬ websocket-driver
│   ├── http-parser-js
│   └── websocket-extensions
└── permessage-deflate

When we had the problem with zlib causing a DoS, users were able to cope with the situation just by removing permessage-deflate from their setup, and they still had a working WebSocket stack. They were able to update just the permessage-deflate module without waiting for intervening deps to publish updates, as often happens with modules with tightly pinned dep versions. This is possible because if you're using the extension, you have it as a direct dependency, rather than getting it implicitly via a chain of other packages.

This design is not something I've often seen in standard-library implementations of things. Such implementations are usually more of a black box and don't provide API contracts that can act as extension points. Such contracts allow the module not to be a bottleneck on users extending something, and much of Faye is designed with this in mind.

Ultimately I think the benefit of putting things in core is that they enable ecosystem interop as I mentioned for HTTP earlier. Whether this is a big deal for WebSocket I can't say; I think most web frameworks just embed a WS library they like and if they do expose it to the user they put their own abstraction on top. I'm not aware of much demand for people going "I like Framework X but can I swap in my own WS API", or wanting to compose things atop WebSocket such that it acts as a common interface. I might be wrong though.

addaleax commented 6 years ago

@jcoglan I can't speak much to the websocket-specific stuff, but this was a very helpful comment to me, and I would be interested to hear more about two things you mentioned:

It's really weird to me that there's an HTTP parser in core but it's private and the advice is to use a third-party module that copies its API. If we put protocols in core, being able to use their parsers but bring our own I/O control is a huge win.

Yes, that was weird to me too, so I opened https://github.com/nodejs/node/pull/16267 to allow the built-in http module to work with generic Duplex streams as the underlying resource. If you have thoughts or feedback on that, or more details on what you think would be helpful to have in core to support this, I would love that.

Validating UTF-8 is a bit tricky to do

It should be easy to get this into core if we want to. I don't know about previous efforts, but if you have suggestions for what the API should look like, maybe just open an issue on this tracker?

lpinca commented 6 years ago

As mentioned above, we are using https://github.com/websockets/utf-8-validate for UTF-8 validation. It works directly on a buffer, no regex needed, and it has a pure JavaScript fallback. It doesn't work in a streaming fashion, though.