socketio / socket.io-protocol

Socket.IO Protocol specification
https://socket.io

Communication Protocol Specification #14

Open uiteoi opened 8 years ago

uiteoi commented 8 years ago

This specification needs a lot of work to become a true communication protocol specification.

A protocol specification should allow at least:

A protocol is not an API. Although procedures should be specified, a protocol is independent of any API: the same protocol may be implemented by several incompatible APIs that offer different levels of abstraction.

A reference implementation may be provided but without a proper protocol specification it is impossible to tell if the reference implementation:

Furthermore, a reference implementation includes both an implementation of the protocol and an API, making it harder to understand the protocol itself.

A protocol may have bugs and security issues; without a specification, it is impossible to evaluate whether such issues exist.

At the lower level, a protocol must specify, for each transport, the format for every packet (or message) allowed.

At the higher level, procedures explain how and when these packets are exchanged. This includes the description of responses to requests, acknowledgements, retransmissions, timers, connection and reconnection procedures, procedural errors, flow control procedures, etc. State machines may be provided to document these procedures.
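As a rough illustration of the kind of state machine such procedures could be documented with, here is a minimal sketch in Python. The states, event names, and transitions are hypothetical and chosen only to show the technique; they are not taken from the Socket.IO implementation.

```python
from enum import Enum, auto


class State(Enum):
    CLOSED = auto()
    CONNECTING = auto()
    OPEN = auto()
    RECONNECTING = auto()


# Hypothetical transition table: (current state, event) -> next state.
# A real specification would list every legal pair for the protocol.
TRANSITIONS = {
    (State.CLOSED, "connect"): State.CONNECTING,
    (State.CONNECTING, "handshake_ok"): State.OPEN,
    (State.CONNECTING, "timeout"): State.CLOSED,
    (State.OPEN, "transport_lost"): State.RECONNECTING,
    (State.OPEN, "disconnect"): State.CLOSED,
    (State.RECONNECTING, "handshake_ok"): State.OPEN,
    (State.RECONNECTING, "give_up"): State.CLOSED,
}


def step(state: State, event: str) -> State:
    """Return the next state, or raise on an event the table does not allow."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"procedural error: {event!r} not allowed in {state.name}"
        ) from None


def run(events, state: State = State.CLOSED) -> State:
    """Feed a sequence of events through the machine and return the final state."""
    for event in events:
        state = step(state, event)
    return state
```

Writing the table down this way makes procedural errors explicit: any (state, event) pair missing from the table is, by definition, a protocol violation.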

rauchg commented 8 years ago

@uiteoi this came up also in #13, so we'll be prioritizing it. If you have examples of what your ideal specifications would be, please post them here.

uiteoi commented 8 years ago

@rauchg I opened a new issue after reading the other open issues because I thought a different approach was needed, one that emphasizes the scope of the specification as a formal communication protocol. I also have experience writing protocol specifications, so I thought I could help, although I don't have much time: I am involved full time with the toubkal project, which uses socket.io (we are currently migrating from version 0.9.16 to version 1.3.7, which is taking more time than expected due to protocol and API documentation issues).

There are plenty of examples of communication protocol specifications on the net. Many IETF RFCs are (e.g. OAuth http://tools.ietf.org/html/rfc6749#section-2.3.1), as are the WHATWG specifications (e.g. XHR https://xhr.spec.whatwg.org/). One of the closest to the socket.io protocol could be the RabbitMQ AMQP (Advanced Message Queuing Protocol), whose specification you can read at https://www.rabbitmq.com/resources/specs/amqp0-9-1.pdf.

Reading this, you will quickly understand that writing a communication protocol specification is a significant undertaking so you need a plan to get there that would allow others to contribute as much as possible.

Having a normative approach is probably the most important. Think of socket.io as the reference implementation of the protocol and a possible API, with the protocol as a way to grow your ecosystem to every system that communicates on the internet, regardless of programming languages and APIs. To this extent, it is very important to separate the protocol from the API.

The API implements the procedures of the protocol in a certain way, typically using functions, objects, and exceptions (or error messages in callbacks, which the protocol does not define). API functions have naming conventions and parameter orders that do not need to have any relationship with the protocol itself. Yet the API specification (the socket.io documentation itself) should probably specify how it binds to the Socket.io protocol procedures, at least when this is non-obvious.

The protocol defines how systems communicate through procedures and message syntax. This usually calls for at least two separate sections, maybe more given the number of transports, and optionally separate specifications for additional transports. Procedures are essential to understand how the protocol works, while message syntax allows messages to be encoded and decoded unambiguously between heterogeneous systems.
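To make the message syntax half concrete, here is a minimal encode/decode sketch in Python. The layout it assumes (a type digit, an optional namespace followed by a comma, an optional acknowledgement id, then a JSON payload) is loosely modelled on how Socket.IO packets appear to look on the wire; treat it as an assumption rather than a normative definition. Binary attachments are left out, and the `Packet` class and function names are made up for this example.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Packet:
    type: int                     # single digit packet type
    namespace: str = "/"          # "/" is the default and is not written out
    ack_id: Optional[int] = None  # acknowledgement id, if any
    data: Any = None              # assumed to be a JSON array or object


def encode(p: Packet) -> str:
    out = str(p.type)
    if p.namespace != "/":
        out += p.namespace + ","
    if p.ack_id is not None:
        out += str(p.ack_id)
    if p.data is not None:
        # An array/object payload starts with "[" or "{", so it cannot be
        # mistaken for the digits of the acknowledgement id before it.
        out += json.dumps(p.data, separators=(",", ":"))
    return out


def decode(s: str) -> Packet:
    if not s or not s[0].isdigit():
        raise ValueError("missing packet type")
    ptype, i = int(s[0]), 1
    namespace = "/"
    if i < len(s) and s[i] == "/":
        end = s.find(",", i)
        if end == -1:
            end = len(s)
        namespace = s[i:end]
        i = end + 1
    j = i
    while j < len(s) and s[j].isdigit():
        j += 1
    ack_id = int(s[i:j]) if j > i else None
    data = json.loads(s[j:]) if j < len(s) else None
    return Packet(ptype, namespace, ack_id, data)
```

A round trip such as `decode(encode(Packet(2, "/admin", 12, ["x"])))` should give back an equal `Packet`. Note that the sketch is only unambiguous because the payload is assumed to be an array or object, which is exactly the kind of rule a specification has to state.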

An additional section should introduce the protocol, stating its scope and what it aims to accomplish, and maybe give the protocol a name (e.g. SIOP, the Socket IO Protocol). It should also include definitions of terms (entities, client, server, intermediate systems such as reverse proxies, namespace, transports, transport bindings, requests, responses, acknowledgements, messages, etc.), references to other specifications (for each transport, cookies, JSON), and possibly an architecture section showing how entities connect.

Injac commented 8 years ago

@uiteoi, @rauchg Jean, Guillermo,

I have no experience in writing protocol specifications like Jean does and I think that the AMQP protocol is very well defined and understandable (including content-frames, pseudo-code, etc.). And I agree with Jean, that this is a huge task to accomplish.

Citing Jean:

"Having a normative approach is probably the most important. Think of socket.io as the reference implementation of the protocol and a possible API, with the protocol as a way to grow your ecosystem to every system that communicates on the internet, regardless of programming languages and APIs. To this extent, it is very important to separate the protocol from the API."

I want to add another interesting point to this: IoT. IoT will explode in the coming years and it will add another interesting usage scenario to socket.io's ecosystem. Existing and upcoming systems will definitely profit from a detailed protocol spec.

In my specific case I am looking forward to implementing socket.io completely for the .NET Micro Framework. There are libraries out there, outdated and incomplete, that do not support all nuances of socket.io and are sometimes not usable at all. I want to implement socket.io for NETMF the right way. This is why a detailed protocol specification would be of tremendous help.

uiteoi commented 8 years ago

There are lots of proposals to use WebSocket for IoT devices. The main concerns are performance-related (memory, bandwidth, CPU) and competition with a number of established protocols, including AMQP and MQTT, which both offer Publish/Subscribe functionality. For these reasons the main market for WebSocket remains Web clients, for which it was designed and is well-suited, and for which HTTP and WebSocket payload overhead is not an issue.

From a protocol standpoint, the most important aspect of Socket.io is its ability to handle legacy browsers and architectures that fail WebSocket, such as some corporate proxies. The protocol needs to explain fallback procedures.
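A fallback procedure of this kind could be sketched as follows in Python. Everything here is hypothetical (the `TransportError` class, the function name, the idea of passing opener callables); the actual library may order or upgrade transports differently, which is precisely what a specification would need to pin down.

```python
from typing import Callable, Iterable, Tuple


class TransportError(Exception):
    """Raised by an opener when its transport cannot be established."""


def connect_with_fallback(
    transports: Iterable[Tuple[str, Callable[[], object]]]
) -> Tuple[str, object]:
    """Try each (name, opener) pair in order of preference.

    Returns the name and connection of the first transport that opens.
    Raises TransportError listing every failure if none of them does.
    """
    failures = []
    for name, open_transport in transports:
        try:
            return name, open_transport()
        except TransportError as exc:
            # Remember why this transport failed, then try the next one.
            failures.append(f"{name}: {exc}")
    raise TransportError("all transports failed: " + "; ".join(failures))
```

A caller would pass the transports in its preferred order, for example a WebSocket opener first and a long-polling opener second, so that a proxy blocking WebSocket results in a polling connection rather than a hard failure.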

Injac commented 8 years ago

@uiteoi, @rauchg I think this is true for now. In a few years this will be very different: IoT devices will become smaller and more powerful, and they will run things like Node.js without any problems. But in general I agree with what you said. The main question to you as a protocol specification expert (I assume you are, no sarcasm) is: Where and how to start? What would such an approach look like? And what can I do to help?

rauchg commented 8 years ago

One thing to consider is: the primary goal of a robust and well-defined spec is to ease the implementation of servers and clients (or to assess their robustness and security).

The main thing we're missing to aid that goal right now is not the spec, but tools to automatically test your implementation.

It seems to me that code is a much better specifier than a PDF. Maybe we start by building a test case that goes from primitive (ensuring basic control packets are in place) to more advanced (exchanging events, connecting to namespaces, getting errors, etc)?
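A staged test case along these lines might be sketched as below, in Python. The `FakeServer` class is an in-memory stand-in with a made-up interface, and the three checks are placeholders; a real suite would talk to a running server over the wire.

```python
from typing import Callable, List, Tuple

Check = Tuple[str, Callable[[object], bool]]


class FakeServer:
    """In-memory stand-in for a server under test (hypothetical interface)."""

    def __init__(self, supports_namespaces: bool = True):
        self.supports_namespaces = supports_namespaces

    def handshake(self) -> bool:
        return True

    def echo(self, payload):
        return payload

    def join(self, namespace: str) -> bool:
        return self.supports_namespaces


# Ordered from primitive to advanced, as suggested above.
CHECKS: List[Check] = [
    ("control: handshake completes", lambda s: s.handshake()),
    ("events: payload is echoed back", lambda s: s.echo(["hi"]) == ["hi"]),
    ("namespaces: connect to /chat", lambda s: s.join("/chat")),
]


def run_suite(server, checks: List[Check]) -> List[Tuple[str, bool]]:
    """Run checks in order and stop at the first failure,
    since later levels build on earlier ones."""
    results = []
    for name, check in checks:
        try:
            ok = bool(check(server))
        except Exception:
            ok = False
        results.append((name, ok))
        if not ok:
            break
    return results
```

Stopping at the first failing level keeps the report readable for an implementer: it points at the most basic thing that is broken rather than at every consequence of it.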

rauchg commented 8 years ago

Such a test case should make it very easy to test your server implementation. Example:

$ java -jar socket.io.jar 3000
$ ./socket.io-test-server localhost 3000

nuclearace commented 8 years ago

@rauchg that would be very helpful

uiteoi commented 8 years ago

If you implement a protocol validator, this is great, but you still need a protocol specification to allow people to implement the protocol and to verify that the protocol validator itself is valid.

Injac commented 8 years ago

@rauchg Very useful, indeed. And I agree with @uiteoi as well.

rauchg commented 8 years ago

We're all on the same :page_facing_up:

uiteoi commented 8 years ago

@injac, IoT devices have electrical power constraints very far from what we are used to in the web server world.

Some of these devices run on milliwatts and have a few kilobytes of RAM. Think about an energy-efficient LED light that consumes a few watts when it's on and that you can control remotely when it's off. If it consumed more than a few milliwatts when idle, this could defeat the energy-efficiency goal. An Arduino compared to this is a supercomputer.

When they need more computing power, it is assumed that they use the services of a more powerful server. That server could run node.js and much more, but communicating with the server needs to consume as few resources as possible.

This may change of course as computing power will continue to increase but you should expect at least 10 years before running node on the lowest power IoT devices.

As far as protocol design is concerned I would not call myself an expert, but as mentioned before I have some experience in the field and did design quite a few protocols in the past.

Injac commented 8 years ago

@uiteoi I agree with what you've said about IoT devices and, like you said, this is valid for some of them, such as low-power and low-memory devices (BLE for example, various MCUs and so on).

I am talking about boards that have far more power, like a Fez Spider, which don't run any OS but bare-bones .NET (like NetMF). Those devices have enough power and RAM to handle this as well as MQTT, AMQP and various other protocols. These devices can act, for example, as message dispatchers (a kind of pseudo messaging server) within a LAN or another kind of closed network. There are plenty of scenarios. A Raspberry could be used as well, but that is really often a question of size (form factor).

If you and all others don't mind, I would love to be integrated into the overall-process to be able to contribute to the socket.io community.

uiteoi commented 8 years ago

@Injac agreed, socket.io is perfectly suited to act as a gateway to low power IoT devices.

I believe you are welcome to contribute, but I am not an owner in this project :)

nealmcb commented 8 years ago

I'm glad to see agreement that there needs to be a clean spec for the protocol itself. I think of it as the bits on the wire, in broad terms. IETF RFCs are a great model, as noted. A small step in that direction would be to provide an annotated packet dump. That would suit my present purposes, and at least give a hint as to how it relates to other protocols, how efficient it is, etc. Does that exist anywhere?
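As a small illustration of what an annotated dump could look like, here is a sketch in Python that labels each frame of a short session. The frames in `SESSION` are hand-written for illustration, not captured from a real connection, and the `sid` value is made up. The type tables reflect my understanding of the Engine.IO and Socket.IO packet type digits and should be treated as an assumption to check against the reference implementation.

```python
# Assumed packet type digits for the two layers (not normative).
ENGINE_TYPES = {
    "0": "open", "1": "close", "2": "ping", "3": "pong",
    "4": "message", "5": "upgrade", "6": "noop",
}
SOCKET_TYPES = {
    "0": "CONNECT", "1": "DISCONNECT", "2": "EVENT", "3": "ACK",
    "4": "ERROR", "5": "BINARY_EVENT", "6": "BINARY_ACK",
}


def annotate(frame: str) -> str:
    """Return the frame followed by a comment naming its packet types."""
    engine = ENGINE_TYPES.get(frame[:1], "unknown")
    note = f"engine.io {engine}"
    # Only Engine.IO "message" packets carry a Socket.IO packet inside.
    if engine == "message" and len(frame) > 1:
        note += f" / socket.io {SOCKET_TYPES.get(frame[1], 'unknown')}"
    return f"{frame:<32} # {note}"


# Hand-written example frames, one per line of the would-be dump.
SESSION = [
    '0{"sid":"abc123"}',
    "40",
    '42["hello","world"]',
    "2",
    "3",
    "41",
]

for frame in SESSION:
    print(annotate(frame))
```

Even a dump this small shows the two-layer framing (a transport-level type digit wrapping an application-level one), which is the sort of detail that is hard to recover without either a specification or a packet capture.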

rauchg commented 8 years ago

@nealmcb not that I know of but happy to contribute to it!

jroper commented 7 years ago

+1 on this. I'm trying to write a server implementation of socket.io right now, and this document is really not a specification at all: it says nothing about what the protocol should look like on the wire. The only way I can work that out is by firing up an example application and reverse engineering it using Wireshark. A specification is something that should allow someone to implement the protocol without needing a reference implementation, and certainly without having to reverse engineer it themselves.

AL1L commented 5 years ago

Bump?

hbprotoss commented 4 years ago

Any progress?

lex-talionis-emmainternational commented 4 years ago

Three years later and I'm in the same boat as jroper. I've got a poorly documented device (a temperature datalogger, of all things) using a poorly documented protocol (sorry!) and it's a reverse engineer's worst nightmare. Even just a few session logs showing what encoded data is supposed to look like would be helpful.

darrachequesne commented 4 years ago

I've started adding examples to the underlying Engine.IO protocol for v3 (included in Socket.IO v1/v2) and v4 (which will be included in v3).

It's far from being a complete specification though. I'll add some examples for the Socket.IO protocol itself.

Edit: I've edited the README, if you find anything not clear enough, please tell me.

darrachequesne commented 4 years ago

Let's keep this open for now, we'll work on this once v3 is out.

For reference: https://github.com/sockjs/sockjs-protocol