whatwg / websockets

WebSockets Standard
https://websockets.spec.whatwg.org/

Allow programmatic configuration of maximum frame size? #55

Open mccolljr opened 10 months ago

mccolljr commented 10 months ago

What problem are you trying to solve?

I work on a web portal that needs to speak a raw binary protocol. To do this, our application sends binary messages over a web socket, and the server application forwards the frame contents to the TCP socket communicating with the binary protocol server. This happens on a per-frame basis. Our ability to modify the binary protocol server is limited. Under some network conditions that are common for us, a single frame cannot be transferred within the timeout period, so the binary protocol server times us out while waiting on the next chunk of data. To ensure smooth communication, we'd like to be able to limit the maximum frame size sent over the web socket connection. As far as I can tell, there is no way to control the maximum frame size sent by WebSocket clients in the browser.

What solutions exist today?

  1. Implement a custom chunking scheme in user code, on top of the web socket protocol. In general, this is unfortunate because it introduces a need for both the server and client to be aware of this additional layer, and is redundant since the web socket protocol already supports continuation frames for data chunking. In our case, we are unable to modify the server environment to support a custom protocol.
  2. Bypass the web socket protocol and send large payloads via some other mechanism (POST request). This is not a good solution because it requires stepping outside of the normal flow of communication and requires complex user code to correctly interleave the data transfers over the two different pathways. In our case, we are unable to modify the server environment to support this kind of scheme.
  3. (As far as I know, there is no other solution available in the browser)

How would you solve it?

I would like a way to pass maxFrameSize (or maxTextFrameSize and maxBinaryFrameSize) to the WebSocket constructor so that I can guarantee large messages will be split into appropriately sized frames.

Anything else?

In my testing, I discovered that:

  1. Firefox appears to send large binary messages in a single frame if it can, while
  2. Chrome appears to break large binary messages into frames of ~130 KB.

It would be nice if there were a standardized way to control this, so that behavior is consistent across browsers.

mccolljr commented 10 months ago

This would probably depend on acceptance and implementation of something like #42

ricea commented 10 months ago

I view message framing as an implementation detail and not something we want to standardise. When Chrome first started breaking messages into multiple frames we had some interoperability issues with servers that couldn't reassemble the messages correctly, but we haven't heard of any problems like that in years.

It simplifies Chrome's implementation that we can put frame boundaries wherever we want, and I assume it simplifies Firefox's implementation that they can send all messages as a single frame. I don't want to impose additional requirements on implementers.

In this case, the solution is for the forwarding server to start forwarding frame data before the whole frame has been received. This is good to avoid excessive memory consumption anyway. If the forwarding server doesn't preserve the original message boundaries then it would also be possible to work around it by splitting up messages in JavaScript before sending them.
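
The JavaScript-side workaround mentioned above could look roughly like the following TypeScript sketch. The names `BinarySender`, `splitIntoChunks`, and `sendInChunks` are hypothetical helpers made up for illustration; the only real API assumed is `WebSocket.send()`. Each `send()` call produces a separate WebSocket message, so this only helps when the forwarding server does not need the original message boundaries. It also does not guarantee anything about framing: the browser is still free to split or not split each message as it sees fit.

```typescript
// Hypothetical helper types and functions; not part of any standard API.
// Anything with a compatible send() method works, including a browser WebSocket.
interface BinarySender {
  send(data: Uint8Array): void;
}

// Split a payload into views of at most maxChunkSize bytes each.
function splitIntoChunks(data: Uint8Array, maxChunkSize: number): Uint8Array[] {
  if (!Number.isInteger(maxChunkSize) || maxChunkSize <= 0) {
    throw new RangeError("maxChunkSize must be a positive integer");
  }
  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < data.byteLength; offset += maxChunkSize) {
    // subarray() returns a view over the same buffer, so no bytes are copied.
    // The end index is clamped to the array length for the final chunk.
    chunks.push(data.subarray(offset, offset + maxChunkSize));
  }
  return chunks;
}

// Send each chunk as its own WebSocket message, in order.
// Returns the number of messages sent (0 for an empty payload).
function sendInChunks(
  socket: BinarySender,
  data: Uint8Array,
  maxChunkSize: number,
): number {
  const chunks = splitIntoChunks(data, maxChunkSize);
  for (const chunk of chunks) {
    socket.send(chunk);
  }
  return chunks.length;
}
```

With a real socket this might be called as `sendInChunks(ws, payload, 16 * 1024)`, where the 16 KiB limit is an arbitrary example value that would need to be chosen to fit the downstream timeout.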

mccolljr commented 9 months ago

> I view message framing as an implementation detail and not something we want to standardise. When Chrome first started breaking messages into multiple frames we had some interoperability issues with servers that couldn't reassemble the messages correctly, but we haven't heard of any problems like that in years.

I can understand this - this is the first time I've ever needed to worry about this. In the 99% case, I do not have any need to worry about this.

> It simplifies Chrome's implementation that we can put frame boundaries wherever we want, and I assume it simplifies Firefox's implementation that they can send all messages as a single frame. I don't want to impose additional requirements on implementers.

I'm admittedly not familiar with Chrome's web socket implementation, but this surprises me. Would you be willing to elaborate on what complications this would introduce?

> In this case, the solution is for the forwarding server to start forwarding frame data before the whole frame has been received. This is good to avoid excessive memory consumption anyway. If the forwarding server doesn't preserve the original message boundaries then it would also be possible to work around it by splitting up messages in JavaScript before sending them.

I mean, yes, you're absolutely right. Unfortunately, we do not control the server in question. It runs in an embedded environment and was written by another team. That team would be responsible for changing the behavior, and the change could not be deployed to all devices in a timely fashion. We are nonetheless pursuing this avenue internally. In any case, I was surprised that such a fundamental part of the protocol is not tunable at all, even in a non-portable and browser-specific way. We would have liked to avoid implementing a user-land version of continuation frames when the protocol already supports exactly this functionality, but it seems like we don't have a choice at the moment.