sockjs / sockjs-client

WebSocket emulation - Javascript client
MIT License

"Received a broken close frame containing a reserved status code" on secure web socket connection #122

Closed icelava closed 10 years ago

icelava commented 11 years ago

We have a SockJs client app that connects to a SockJs Node server (hosted in Windows Azure). Recently we completed the update on the server side to support SSL, so our client now connects to a URL like https://socketservice.cloudapp.net.

For the most part that works totally fine. But every now and then we see Google Chrome lose connectivity with the following error, something we never saw when using http:

WebSocket connection to 'wss://socketservice.cloudapp.net/socket/650/pfz10b/websocket' failed: Received a broken close frame containing a reserved status code.

The problem will persist for a period of time, and then disappear all by itself; the client can then connect and work again.

That particular error message seems to be a very specific WebKit error message, and the pages discussing it involve WebKit source code that is beyond my understanding. It seems that Chrome received a close frame from the server whose status code is not one it expects. However, I still do not understand enough of the WebSocket protocol to figure out who initiated the close, why there is a close at all (when both sides are supposed to maintain the connection), and which software layer needs troubleshooting.

The server side Node implementation certainly does not actively close any connections. It just listens on client connection, data, and close events.

Would like to know where we should focus our further troubleshooting effort to understand the root cause of this. thanks.

mrjoes commented 11 years ago

Did you try terminating SSL before the sockjs-node server? For example, at a load balancer (haproxy/nginx with websocket support)?

Also, can you see what the closing frame code was by capturing some traffic?

P.S. If it is Windows Azure, is it a Windows host, not a Linux host? I think there is a load balancer from Microsoft that supports SSL termination and websockets.

Serge.


icelava commented 11 years ago

Hi, for Windows Azure Node deployments, SSL is handled directly at the virtual server level and not at the load balancer (fabric controller). We configure a TCP endpoint for port 443, and the load balancer lets the traffic flow through to the servers, where the regular Node https configuration comes into play.

http://www.windowsazure.com/en-us/develop/nodejs/common-tasks/enable-ssl-worker-role/

icelava commented 11 years ago

Since we are not yet clear about the source of the problem, we have for now updated the server-side SockJs to v0.3.7, which appears to include a number of connection-related fixes. My guess is that it may not fix our exact problem, but we will see if it happens again going forward.

icelava commented 11 years ago

Happened for a short while again last night. Not something "fixed" by SockJs Node v0.3.7.

WebSocket connection to 'wss://socket.cloudapp.net/socket/401/p4ol3qcc/websocket' failed: Received a broken close frame containing a reserved status code. http://localhost:83/
event callbacks is function () { [native code] } WebSocket.js:115
[globalstate] websocket => disconnected

majek commented 11 years ago

Isn't it something related to Azure? Please try to find out which "reserved status code" is being sent. Without this I can't do anything.

icelava commented 11 years ago

That may be possible, but I wouldn't conclude it's a Windows Azure thing without substantial findings. That is the problem here: I do not know how to obtain the data that would point to the true cause and a solution.

As a user of the browser, what can one do when this phenomenon happens again to gather the needed clues?

From the Node server end, what can one do as well? Are there additional objects or properties that track and give insight on abnormal client disconnections?

It would be nice to get Wireshark on standby to make peeks when it happens, but I think https makes it difficult to troubleshoot.
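
On the browser side, one low-effort option is to record whatever the close event reports each time the connection drops. The sketch below is only an illustration: `summarizeClose` is a hypothetical helper, not part of SockJS, and it assumes the close event exposes `code`, `reason` and `wasClean` as the standard WebSocket `CloseEvent` does. Note that when the browser itself rejects a close frame, it will most likely report 1006 locally, so this narrows down when the drops happen rather than which code was on the wire.

```javascript
// Hypothetical helper (not part of SockJS): turn a CloseEvent-like object
// into a plain record that can be logged or posted to a server for review.
// Assumes the event carries `code`, `reason` and `wasClean`.
function summarizeClose(event, now = new Date()) {
  return {
    at: now.toISOString(),                                   // when the close was observed
    code: typeof event.code === 'number' ? event.code : null, // close status code, if any
    reason: event.reason || '',                               // optional close reason text
    wasClean: event.wasClean === true,                        // true only after a proper closing handshake
  };
}

// Assumed wiring, left as a comment because it needs a live connection:
//   sock.onclose = (e) => console.log(JSON.stringify(summarizeClose(e)));
```

Collecting these records with timestamps makes it easier to line up a disconnect with a packet capture taken at the same time.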

majek commented 11 years ago

From the Node server end, what can one do as well? Are there additional objects or properties that track and give insight on abnormal client disconnections?

Exactly. Maybe you can do a packet capture and decode it from wireshark?

icelava commented 11 years ago

Apparently Wireshark has SSL decryption capability. Wasn't aware of that and will have to learn to set it up to monitor https traffic, and see what we can gather next time it happens.

icelava commented 10 years ago

It has happened again, and this time we managed to decrypt some of the traffic. It appears SockJs/NodeJs is sending status code 1006 in a close frame, which is invalid on the wire and is what the client browser complains about.

As such, we can mark this as not a client-side issue at all, and concentrate on the server side of things. Thanks

No.   Time                Source  Destination  Protocol   Length  Info
4196  09:44:32.315020000  server  client       WebSocket  128     WebSocket Connection Close [FIN]

Frame 4196: 128 bytes on wire (1024 bits), 128 bytes captured (1024 bits) on interface 0
Ethernet II, Src: Olicom_ce:bd:29 (00:00:24:ce:bd:29), Dst: Dell_26:ba:2f (18:03:73:26:ba:2f)
Internet Protocol Version 4, Src: server, Dst: client
    Version: 4
    Header length: 20 bytes
    Differentiated Services Field: 0x10 (DSCP 0x04: Unknown DSCP; ECN: 0x00: Not-ECT (Not ECN-Capable Transport))
    Total Length: 114
    Identification: 0x8094 (32916)
    Flags: 0x02 (Don't Fragment)
    Fragment offset: 0
    Time to live: 113
    Protocol: TCP (6)
    Header checksum: 0x87d2 [correct]
        [Good: True]
        [Bad: False]
    Source: server
    Destination: client
    [Source GeoIP: Unknown]
    [Destination GeoIP: Unknown]
Transmission Control Protocol, Src Port: https (443), Dst Port: secure-cfg-svr (3978), Seq: 8996, Ack: 2654, Len: 74
    Source port: https (443)
    Destination port: secure-cfg-svr (3978)
    [Stream index: 58]
    Sequence number: 8996 (relative sequence number)
    [Next sequence number: 9070 (relative sequence number)]
    Acknowledgment number: 2654 (relative ack number)
    Header length: 20 bytes
    Flags: 0x018 (PSH, ACK)
        000. .... .... = Reserved: Not set
        ...0 .... .... = Nonce: Not set
        .... 0... .... = Congestion Window Reduced (CWR): Not set
        .... .0.. .... = ECN-Echo: Not set
        .... ..0. .... = Urgent: Not set
        .... ...1 .... = Acknowledgment: Set
        .... .... 1... = Push: Set
        .... .... .0.. = Reset: Not set
        .... .... ..0. = Syn: Not set
        .... .... ...0 = Fin: Not set
    Window size value: 509
    [Calculated window size: 130304]
    [Window size scaling factor: 256]
    Checksum: 0xbd53 [validation disabled]
    [SEQ/ACK analysis]
Secure Sockets Layer
    TLSv1 Record Layer: Application Data Protocol: http
        Content Type: Application Data (23)
        Version: TLS 1.0 (0x0301)
        Length: 32
        Encrypted Application Data: 707635d116d723fdd0fd9f7c3c48c17595001fa4ca1a842e...
    TLSv1 Record Layer: Application Data Protocol: http
        Content Type: Application Data (23)
        Version: TLS 1.0 (0x0301)
        Length: 32
        Encrypted Application Data: e9787b195d4ab444c6051139741ca6d1e2bb0d245ddbcd9d...
WebSocket
    1... .... = Fin: True
    .000 .... = Reserved: 0x00
    .... 1000 = Opcode: Connection Close (8)
    0... .... = Mask: False
    .000 0010 = Payload length: 2
    Payload
        Close: 03ee
        Close: Abnormal Closure (1006)
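
For anyone reading the capture above, the WebSocket part boils down to four bytes: 0x88 (FIN set, opcode 8), 0x02 (unmasked, payload length 2) and the payload 0x03 0xee. The sketch below decodes such a frame by hand and flags the status codes that RFC 6455 section 7.4.1 says must not be set in a close frame by an endpoint (1005, 1006, 1015). The names `parseCloseFrame` and `RESERVED_CLOSE_CODES` are made up for this illustration and do not come from SockJS or any library; this is not how Chrome or sockjs-node implement the check.

```javascript
// Status codes that RFC 6455 section 7.4.1 reserves for local reporting only;
// an endpoint must not put them in a close frame on the wire.
const RESERVED_CLOSE_CODES = new Set([1005, 1006, 1015]);

// Hypothetical helper: decode a single WebSocket close frame from raw bytes.
function parseCloseFrame(bytes) {
  if (bytes.length < 2) throw new Error('frame too short');
  const fin = (bytes[0] & 0x80) !== 0;    // top bit of byte 0
  const opcode = bytes[0] & 0x0f;         // low nibble of byte 0
  const masked = (bytes[1] & 0x80) !== 0; // top bit of byte 1
  const length = bytes[1] & 0x7f;         // 7-bit payload length
  if (opcode !== 0x8) throw new Error('not a close frame');
  // Control frames carry at most 125 payload bytes, so no extended length.
  if (length > 125) throw new Error('invalid control frame length');

  let offset = 2;
  let key = null;
  if (masked) {
    if (bytes.length < offset + 4) throw new Error('missing masking key');
    key = bytes.subarray(offset, offset + 4);
    offset += 4;
  }
  if (bytes.length < offset + length) throw new Error('truncated payload');

  // Copy the payload so unmasking does not modify the caller's buffer.
  const payload = Uint8Array.from(bytes.subarray(offset, offset + length));
  if (key) {
    for (let i = 0; i < payload.length; i++) payload[i] ^= key[i % 4];
  }

  // The status code, when present, is a 16-bit big-endian integer.
  const code = payload.length >= 2 ? (payload[0] << 8) | payload[1] : null;
  return {
    fin,
    opcode,
    masked,
    length,
    code,
    reservedOnWire: code !== null && RESERVED_CLOSE_CODES.has(code),
  };
}

// The four bytes from the capture: FIN + close opcode, length 2, payload 03 ee.
const captured = parseCloseFrame(Uint8Array.from([0x88, 0x02, 0x03, 0xee]));
console.log(captured.code, captured.reservedOnWire);
```

Run against the bytes from the capture, this reports code 1006 with `reservedOnWire` set to true, which matches the browser's complaint about a reserved status code in the close frame.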