nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org
Other
107.51k stars 29.56k forks source link

Ship websockets in core #1010

Closed piscisaureus closed 9 years ago

piscisaureus commented 9 years ago

Websockets is a standardized protocol that is core to the web. I think it should ship in core.

Discussion taking place here: https://github.com/joyent/node/issues/8005

mscdex commented 9 years ago

-1 WebSockets is still something better suited for userland IMHO. Sure it's something that many people do use, but there are also many other widely used standardized protocols/APIs that could be argued as "core" to the "web" that are not currently included in node/io.js core. For example: multipart parsing/generation, Server-sent events, HTTP/2, ICMP, SSH/SFTP, FTP, SOCKS, VNC/RFB, SMTP/IMAP/POP3, SOAP, Web Workers (as an API), XHR/XHR2 (as an API), etc.

The main complaint in joyent/node#8005 had to do with an issue in node-gyp and even then the node-gyp issue was not something that affected the installation and use of the third-party module in question.

There was also a different discussion in there about the need for easily distributing compiled addons for Windows users, but as someone already noted there, node-pre-gyp already provides a mechanism for solving that problem. Maybe npm should incorporate something like that, but that's a separate topic.

jbergstroem commented 9 years ago

My problem with issue joyent/node#8005 is that it grew from ws to bundling compilers. Would the crowd in that issue be happy with "just" supporting ws in core? Also, supporting ws but for instance not http2 is a pretty tough decision that probably is better to avoid unless there's a clear path forward every time the decision about what belongs in core decides to show up.

Qard commented 9 years ago

FYI: The ws module is pure JS now. The last bits of native stuff have been pulled out to optional dependencies. So that Windows issue isn't really relevant anymore, if people just update their dependencies.

benjamingr commented 9 years ago

What advantages would putting this in core offer?

yosuke-furukawa commented 9 years ago

+1 : websocket is implemented by core members. -1 : websocket is in core API.

I think core module would better to keep simple. ws are a little complicated.

IMO, websocket is not core API, but websocket should be implemented by core members like readable-stream.

Golang team chooses similar strategy. They has official but not core API like these. websocket is listed in the link. spdy is also listed.

And I have a tiny concerns about einaros/ws module, the module does not have main active committers.... So, I have highly recommend core members implemented websocket as a SUB-STANDARD module.

Qard commented 9 years ago

I've seen the topic of "official" userland modules come up a couple times now. Perhaps it could be a topic for a future TC meeting?

benjamingr commented 9 years ago

I'm definitely in favour of userland official supported modules - decoupling shipments from core and modularizing code sounds like an excellent idea for io and will allow better interop and more modular code and deployment.

tellnes commented 9 years ago

When you are asking for official supported userland modules, then what you really is asking for is a repo under the iojs org and a WG for that project? Maybe the possibility to run project tests on jenkins?

The problem with calling those officially is that it implies some guaranty. If a official module is abandoned by its WG, what happens then?

cjihrig commented 9 years ago

+1 for a blessed module outside of core.

benjamingr commented 9 years ago

@tellnes yes - this is exactly what the point is - under the iojs org and WG. The whole point is indeed that guarantee - that it will not be abandoned by the WG and that "grown up"s give it the same LTS support as shipments get.

Fishrock123 commented 9 years ago

I'm not sure about adding this to core. http is different. Since we already have http and have to maintain it for the long run, adding http/2 support is more reasonable as it is only maintaining an existing core module.

piscisaureus commented 9 years ago

What advantages would putting this in core offer?

There would be no need for compiling a native addon in order to use websockets.

benjamingr commented 9 years ago

@piscisaureus Does this have to be written as a native addon?

Sorry for the native question - but why can't this be implemented in (low level, closure free) JavaScript?

piscisaureus commented 9 years ago

BTW I'm not saying that we should put "ws" into core, although it'd be an obvious starting point for the exploration. But from an API perspective, the websockets implementation should integrate well with the built-in http(s) server, and be as simple as possible.

piscisaureus commented 9 years ago

@piscisaureus Does this have to be written as a native addon?

Sorry for the native question - but why can't this be implemented in (low level, closure free) JavaScript?

Many websocket consumers (e.g. socket.io) opt to use a parser written in c++, for performance reasons. There is a much slower fallback implementation in pure javascript, used only when the c++ parser fails to build.

Imagine what a novice node user has to deal with when installing socket.io. On 'npm install' they'll see a bunch of errors and obscure warnings, which in most cases will leave the application in a nonfunctional state. Except that when it's ws that fails to compile, they'll see a bunch of errors, but the app will still work. Websockets however have become slow, which the user may not realize at all, and if they do they may not realize it has anything to do with that failed compilation step. It's really a mess.

mscdex commented 9 years ago

You do not need to compile a native addon to use WebSockets though. The ws addon dependencies were merely optional helper functions, but as of websockets/ws#c84c880cbc they are explicitly set as optionalDependencies so it's even less of a problem now.

mscdex commented 9 years ago

Many websocket consumers (e.g. socket.io) opt to use a parser written in c++, for performance reasons.

socket.io uses ws which uses a js parser. The C++ stuff was just for performing miscellaneous tasks like masking and such.

piscisaureus commented 9 years ago

You do not need to compile a native addon to use WebSockets though. The ws addon dependencies were merely optional helper functions, but as of websockets/ws#c84c880cbc they are explicitly set as optionalDependencies so it's even less of a problem now.

I know - the point I was trying to make is that the UX of using an optional dependency that fails to compile is abysmal.

socket.io uses ws which uses a js parser. The C++ stuff was just for performing miscellaneous tasks like masking and such.

Ok, so parts of the parser are written in c++ but not entirely. That's beside the point; clearly there are good reasons to have c++ bits. It's not that different from how http works in core (the parser is written in c, but a large proportion of logic is written in javascript).

benjamingr commented 9 years ago

Many websocket consumers (e.g. socket.io) opt to use a parser written in c++, for performance reasons. There is a much slower fallback implementation in pure javascript, used only when the c++ parser fails to build.

Well:

Obviously, you've spent a lot more time than me researching this - I'm just playing 10th man here to make sure we're not ruling a pure JS implementation that would be a portability win.

vkurchatkin commented 9 years ago

With v8 now supporting asmjs directly

Why do you think it does?

benjamingr commented 9 years ago

@vkurchatkin my bad, it's assigned - it appears like it's waiting for TF to get actual support. Still shouldn't change the first point. That said earlier issues in node-forward don't look too promising on that front https://github.com/node-forward/discussions/issues/16 .

llafuente commented 9 years ago

websockets should be in the core only if are implemented at c level.

Performance matter that's why ws has a compiled version, also compatibility that why ws has a JS counterpart.

but: soon we will start about discussing BSON and other modules that gain a lot to be compiled... at the what we need it's NPM to compile those module for us in a decent /easy to use way.

chrisdickinson commented 9 years ago

With regards to bufferutil and utf-8-validate, would it be generally useful to add {un,}mask, merge, and utf8validate to the Buffer API? If so, we can avoid vendoring a fairly complex chunk of functionality.

rvagg commented 9 years ago

FTR, from the TC meeting last week:

See #1123, removing tc-agenda label

sam-github commented 9 years ago

My take-away from the TC meeting was that even @piscisaureus doesn't want ws in core, ATM, but that the buffer operations ws uses, which are what it usually uses the compiled addon for, should be in core: https://github.com/iojs/io.js/pull/1202.

Bert, maybe you should close this?

3rd-Eden commented 9 years ago

As stated by many others the discussion in joyent/node#8005 transformed to be more about distributing binaries. The node-pre-gyp can be seen as somewhat of a solution for this. The problem for this is that:

  1. Module authors need to take care compilation and uploading of binaries.
  2. Provide the hosting for the binaries.

This is all nice and all for small modules that nobody uses but you really don't want start paying large sums hosting costs just in order to have people install binary add-ons. In my opinion this all something that npm should solve. When you publish a package, they should just pre-build the binaries on their infrastructure so when you install a module they can just fetch the correct one from their servers and make sure that the module works as intended. But that is out of the scope of this discussion.

@quard The ws module has always been pure js. The binary parts have always been meant to be optional which is also why it had this horrible install script hack in the package.json so it could let the installation fail. So even when the compilation failed it would still work. But this approach was obviously flawed which is why I extracted and moved everything to separate projects and used the optionalDependencies field.

As for the binary parts of ws they are used for two things:

  1. Performance improvements for working with masking/buffers. This yields major performance improvements.
  2. Workaround missing UTF-8 validation in JS which is required by the protocol.

Anyways the original discussion is about getting WebSockets in core. I know that a few years ago a lot of people expressed their opinions in to getting WebSocket support in to Node. As one of the maintainers of ws I also don't think it makes sense to just bluntly put ws in the core of io.js as the current eco has shown, there are many ways of implementing a WebSocket API. With ws we decided to follow the W3C specification so it's easier to use with client side code while others follow Stream based approaches as this is was more "node" like. But what these projects all share is that they need to handle the WebSocket framing and extensions. We currently all wrote our own implementations for this, which is quite foolish of all of us.

So what I would like to propose is instead of getting WebSocket server like support in, I would suggest getting support the protocol parsing/encoding and masking in the core. So all these modules can benefit from the same work, allows the core to track for potential performance regressions if work on buffers or stream internals are changed and would allow us to optimize WebSockets directly in core which will benefit all module authors. Protocol parsing makes sense as this is used in both the client and server part of WebSocket communication and can be implemented as a Stream without being to biased about API.

Also, it might be wise to actually highlight authors of the various WebSocket modules so they can all contribute to this discussion. Instead of finding out through twitter..

cc @jcoglan @miksago @einaros @theturtle32 @andrewrk @rauchg @brycekahle @glasser @lpinca Opinions?

rauchg commented 9 years ago

as the current eco has shown, there are many ways of implementing a WebSocket API.

This is especially true for HTTP as well, and HTTP is in core.

That said, I wouldn't assign bundling a WebSocket server a high priority. I don't think there are any problems right now with userland implementations that would be solved by supporting it in core. We have cross platform compatibility, ease of use and good performance in ws already. We also don't have issues with people finding it.

Fishrock123 commented 9 years ago

For posterity: The current TC suggestion was to implement the required buffer methods, ala #1202

lpinca commented 9 years ago

The current TC suggestion was to implement the required buffer methods, ala #1202

This is very good IMHO, it's a win for everyone. If I'm not wrong the TC also talked about an "official userland module" for web sockets and I basically agree with what @domenic has written here https://github.com/iojs/NG/issues/10#issue-59860208.

andrewrk commented 9 years ago

Incomplete list of things that belong outside of core:

Some things that are not in core but should be:

theturtle32 commented 9 years ago

I agree with @3rd-Eden that we shouldn't just put a particular WebSocket implementation in core. But I'm not really too excited about the idea of just putting pieces of it into core either. Why should core have and maintain (and document as part of the public API) only the frame parser and/or xor masking components? It seems odd to me for those elements to live on their own, elevated to the level of a core platform API. And to what end? To alleviate the frustration with building native binary modules on Windows, but only solve it for WebSocket libraries?

It seems to me the focus be directed toward finding a holistic way to build/bundle/package/distribute _all_ compiled modules for Windows users. It's tough because there are more iojs/node users on *nix-like platforms where we're used to having extremely easy access to C++ compilers, so it's easy to forget about the Windows users and end up leaving them behind.

But at the end of the day, if we just put the pieces we need for optimizing WebSockets directly into core, then there will be substantially less demand for fixing the _real_ underlying platform issue.

On the other hand, this is not a problem that is unique to the npm ecosystem. Ruby, Python, Perl, etc have all had to deal with this as well. I don't know how or even whether they have solved it, since I'm not a Windows user, but it can't be unique to io.js/node.

chrisdickinson commented 9 years ago

@3rd-Eden I'd love to get your thoughts on the PR that adds mask. Also, is unmask necessary in the face of buf.mask(mask, buf)?

@theturtle32 Masking, unmasking, and utf-8 validation are all useful operations on their own, outside of websockets. Bringing them into the core API is net win, and is fairly low-effort for the amount of utility it provides to existing users.

It seems to me [that] the focus [should] be directed toward finding[...]

While there is a larger problem to be solved, tackling this smaller, easier-to-agree-on problem benefits Windows users in the short term and in no way distracts from the efforts of others to tackle the larger issue.

theturtle32 commented 9 years ago

@chrisdickinson Yes, UTF-8 validation is broadly applicable. One might be able to make a case for the masking/unmasking, though IMO the masking algorithm used for WebSockets still relatively niche, using a 32-bit integer as a repeating mask key. Personally, think it seems kind of weird to put a purpose-built, specialized, buffer-based XOR masking function into core.

But I definitely don't think we should put a WebSocket protocol parser/encoder into core, as was mentioned by others, without adding a full WebSocket implementation to go with it, and I don't think we should put WebSockets in core.

I still think that doing something cheap and easy here to benefit Windows users in the short term is a double-edged sword. Yes it will help for now, it will absolutely alleviate frustration for Windows users that install WebSocket-based modules, but that very reduction in frustration will, I believe, dramatically reduce the demand for creating a real solution to the problem of distributing native modules, which will lead to less interest and fewer resources being focused on solving what is essentially the real problem.

Also, both ws and my websocket module will work fine even without the native modules, just somewhat more slowly and less strictly, in that it will accept malformed UTF-8. So to me, it doesn't seem like dire enough of a situation to warrant putting all these things into core. Especially if we can make some improvements to the module installation experience in situations when a C++ compiler isn't available. Novice users can still tinker with WebSockets, and advanced users who are prepared to work with MSVC++ can still deploy the optimized native-code versions.

piscisaureus commented 9 years ago

@andrewrk

Incomplete of the list of things that belong outside of core: «snip»

That's nice and all that, but your list lacks any form of motivation. To demonstrate.

Worst artists of all time (incomplete list):

Best artists of all time:

See what I mean? :)

piscisaureus commented 9 years ago

During the TC meeting at March 11th (iirc) it was decided that we'd just add the buffer operations that are required to build a fast websocket implementation to core. So we won't be adding full websocket support to core.

Therefore I'm closing this issue. The discussion about the implementation continues in #1202.

3rd-Eden commented 9 years ago

If the decision is made to not add WebSocket support in core then you might as well remove the buffer mask as well as I don't really see a broad enough use case here that would justify the addition of it in to the core as there isn't even a demand expressed by the community for it. Adding API's that is only going to be implemented by 10~ people is just stupid. Either ship something workable or ship nothing.

petkaantonov commented 9 years ago

Sorry for off-topic but I am dying to know what problem there is for implementing UTF-8 validation in javascript?

jcoglan commented 9 years ago

@petkaantonov I'm not sure; https://github.com/faye/websocket-driver-node has always done UTF-8 validation in JavaScript and it's worked since Node 0.8 -- see http://faye.jcoglan.com/autobahn/clients/

I think that in 0.8 some fixes were made so that new Buffer(buffer.toString('utf8'), 'utf8') would correctly round-trip astral codepoints, which were failing on 0.6. I've had no issues with UTF-8 since.

theturtle32 commented 9 years ago

@petkaantonov there's nothing preventing anyone from doing UTF-8 validation in JavaScript. It would just be slower than doing it in C++ and the only reason we do it is to be able to strictly follow the specification and immediately terminate the connection with an error if invalid UTF-8 data is received. If we don't have the native extensions built, that level of strictness probably isn't worth the speed tradeoff. Because the worst that could happen if invalid UTF-8 data is received is that some characters could get mangled.

jcoglan commented 9 years ago

@theturtle32 I would consider accepting invalid UTF-8 to be a serious problem. Mangling human-written text is not a trivial problem, and admitting byte sequences you cannot validate into your backend can trigger security problems.

theturtle32 commented 9 years ago

@jcoglan - The reason I'm not terribly worried about it is that none of the major browsers transmit invalid UTF-8 data in the first place. While there are ways to get browsers and http servers to disagree about character encodings and end up with, for example, utf-8 data being interpreted as windows-1252 or iso-8859-1, there's literally nothing you can do in JavaScript to get a browser to transmit a utf-8 websocket message that contains invalid utf-8 data, and websocket servers will always interpret all received text messages as utf-8, because no other character encoding is allowed. So the same possibility of getting incorrectly interpreted characters from your users doesn't exist in practical situations with WebSockets, regardless of whether or not we validate the utf-8 data.

The only time you would ever receive malformed utf-8 data is if a malicious or buggy non-browser client connected to your service. If that's happening to you, then malformed utf-8 still isn't your primary issue.

theturtle32 commented 9 years ago

@piscisaureus @3rd-Eden - While I'm not crazy about adding WebSocket-specific masking to core, I _would_ be in favor of adding something like this for UTF-8 validation:

buffer.toString('utf8', 0, buffer.length, { strict: true }); // Throws on invalid UTF-8
jcoglan commented 9 years ago

@theturtle32 This presupposes that browsers are free of bugs both in their WebSocket code, and in all other network interaction code such that JavaScript cannot spoof headers and fake a WebSocket connection. Such things are rare but they have happened. And just this week I have dealt with our server receiving invalid unicode (application/x-www-form-urlencoded) via a form on our site, for reasons unknown. Clients and proxies can and do mess this stuff up.

Also, since browsers allow cross-origin WebSocket connections, you not only have to trust code running on your site (i.e. assume it is free of XSS holes), but also trust code running on any other site the user has open.

Beyond that, I'd be far more concerned about non-browser clients. If it's possible to sneak invalid UTF-8 past the edge of your stack, that can be used to exploit any system that data is forwarded to. The server must protect internal systems against malformed input.

jcoglan commented 9 years ago

I should also have mentioned that if you're using ws:, or even if you're using wss: and your SSL isn't well-configured, then you can't assume the input actually came from a browser.

eljefedelrodeodeljefe commented 8 years ago

Is this issue closed indefinitely? I was playing around with the ws repo as an experiment and refactored large parts to be compatible w/ core.

Also I see a strong use case for websockets in core, since this is a primary reason why companies not using node at all are making long-term commitments to it anyhow (worked for several). In order to unburden userland contributors and for the greate sake of the ecosystem, I'd give it a +1.

Cons are a multitude: All though the API being really minimal, there currently is a lot of code to digest. Large part of it could use refactoring especially because of Buffer and style issues, imo.

Sorry in advance to add to the noise @piscisaureus :)

d3x0r commented 8 years ago

I also would vote that it be native, and linked to inimately with the HTTP socket that receives it. I mean, sure this looks like HTTP protocol upgrade, but there's no event triggered for an onupgrade... so I guess the weboscket is doomed to be its own port instead of sharing the request on the actual server port?

I mean ya, there is a lot on the browser side to open a websocket to a server... but not so much on what you actually need for a server. I didn't see a recent issue about this and a quick search I didn't find this... but I started another thread https://github.com/nodejs/node/issues/5467#issuecomment-189761292 ... but that is just my C reference.... somehow I had to coordinate registry of "GET /some/page.html" and "GET /some/websock/name" ... so I'm not even sure what the standard IS? I guess it's what ws socket has for opening... When I fist started searching I did find that there are 5 websocket libraries (link in other thread) and there's actually a dozen or so more.

About the only thing this does is add packet-length encoding and packet fragmentation logics. Ya it has opcodes and has keep-alive functions... but primarliy it serves to add a length-data protocol instead of once again rolling my own. (which in javascript maybe I'll just use 4 decimal digits since it's apparently less work than using a javascript plugin, which is a shame.)

I'd imagine that the HTTP server object would have a emit( "upgrade", http_head_upgrade), which could be used to make a WS object from that... and a way to take a standard HTTP reqeuest connection and enable WS on that... (which triggers the upgrade request).

Something different than opening a new socket for the communication.

-- and the masking is just to get it through proxies, and is only a client-side requirement, all the existing implementations puke when the server sends a masked stream... and it's for 'smart' proxies that may do filtering like 'already sent' or 'already requested' on HTTP embedded in the stream or something? It's not meant for secure transport... but it should be reasonably efficient.

I suppose the ping emit and on-ping events could also be exposed?

eugeneo commented 8 years ago

Node.js now includes a partial WebSocket implementation (inspector_socket.{h,cpp}) that is used for inspector protocol.

The implementation is entirely native but only supports features used by inspector (no compressed frames, no binary frames).

3rd-Eden commented 7 years ago

As time goes by our opinions and points of view change with us based on past and new experiences.

I personally feel it's time to re-evaluate this topic now that a lot of time has passed since the final WebSocket specification got released. It is not a technology that has died off since the introduction of P2P technologies like WebRTC. It actually got adopted by that as well for handshaking purposes and what not. WebSocket and real-time technologies are still being adopted by new projects and with browsers moving to automatic upgrades a new era has broken were just using WebSockets directly in your projects is now made possible without significantly hurting your user base.

I still would love to see a WebSocket server and client being shipped in Node.js and there is no need to having this be ws or any other existing project either. I don't care if it's JavaScript based or fully written in C++. As long as it works for every single platform that Node.js supports it's fine by me.

This means there no more need for 10's of competing projects that all attempt achieve the same goal. A developer friendly, fast WebSocket server and client implementation for Node.js. It's just mind blowing to think about the amount time we as authors have wasted by writing (or inheriting), maintaining and hand tuning our WebSockets implementations. I would have rather seen us invest all our energy (and that of our amazing contributors) in to a single implementation that would have been the defacto standard for Node.js.

lpinca commented 7 years ago

Refs: https://github.com/nodejs/node-eps/pull/6