mrniko / netty-socketio

Socket.IO server implemented on Java. Realtime java framework
Apache License 2.0
6.77k stars 1.65k forks source link

Failing websocket upgrades polluting logs #327

Open brockj opened 8 years ago

brockj commented 8 years ago

We have some client connections that are constantly failing to upgrade from polling to websocket transport. This is probably because of an intermediary proxy that we have no control over

With debug enabled we see the upgrade fail 5 seconds after the upgrade request client did not complete upgrade - closing transport

We then see an error from the polling transport as it has been closed is not registered. Closing connection

these messages are polluting our logs and leaving a lot of TIME_WAIT connections around

I am not sure if it matters, but we are running an nginx server in front of our socketio server.

mrniko commented 8 years ago

which version do you use?

brockj commented 8 years ago

I have tried a few versions and am currently on 1.7.9. I have cloned the netty-socketio repo and will add some more debugging to get a better idea if what is going wrong

brockj commented 8 years ago

I haven't been able to debug this very far as it only happens on production systems Here is what i have from enabling trace logging

This connection upgraded ok

20:26:55.507 [] [nioEventLoopGroup-3-1] DEBUG c.c.s.handler.AuthorizeHandler - Handshake authorized for sessionId: f9d44344-2079-4814-8268-269dcfd2e5f4, query params: {EIO=[3], transport=[polling], t=[LFe6qk4]} headers: {Cookie=[Connection=[upgrade], User-Agent=[Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.110 Safari/537.36]}
20:26:55.527 [] [nioEventLoopGroup-3-1] DEBUG c.c.socketio.handler.ClientHead - binding channel: [id: 0x396167cb, L:/127.0.0.1:8082 - R:/127.0.0.1:45556] to transport: POLLING
20:26:55.639 [] [nioEventLoopGroup-3-1] TRACE c.c.socketio.handler.EncoderHandler - Out message: ^@^A^A^C�0{"sid":"f9d44344-2079-4814-8268-269dcfd2e5f4","upgrades":["websocket"],"pingInterval":20000,"pingTimeout":60000} - sessionId: f9d44344-2079-4814-8268-269dcfd2e5f4
20:26:55.832 [] [nioEventLoopGroup-3-2] DEBUG c.c.socketio.handler.ClientHead - binding channel: [id: 0x11657722, L:/127.0.0.1:8082 - R:/127.0.0.1:45584] to transport: POLLING
20:26:55.838 [] [nioEventLoopGroup-3-2] TRACE c.c.socketio.handler.EncoderHandler - Out message: ^@^B�40 - sessionId: f9d44344-2079-4814-8268-269dcfd2e5f4
20:26:55.867 [] [nioEventLoopGroup-3-1] DEBUG i.n.h.c.h.w.WebSocketServerHandshaker - [id: 0xbdc2fd92, L:/127.0.0.1:8082 - R:/127.0.0.1:45586] WebSocket version V13 server handshake
20:26:55.878 [] [nioEventLoopGroup-3-1] DEBUG io.netty.util.internal.Cleaner0 - java.nio.ByteBuffer.cleaner(): available
20:26:55.881 [] [nioEventLoopGroup-3-1] DEBUG i.n.h.c.h.w.WebSocketServerHandshaker - WebSocket version 13 server handshake key: L6DDuqfRbqL0rVgbigxdqA==, response: Rl+2w2t5D41yj9rABtC5hGRX5vo=
20:26:55.908 [] [nioEventLoopGroup-3-1] DEBUG c.c.socketio.handler.ClientHead - binding channel: [id: 0xbdc2fd92, L:/127.0.0.1:8082 - R:/127.0.0.1:45586] to transport: WEBSOCKET
20:26:55.909 [] [nioEventLoopGroup-3-1] DEBUG c.c.s.transport.WebSocketTransport - сlient f9d44344-2079-4814-8268-269dcfd2e5f4 handshake completed
20:26:56.001 [] [nioEventLoopGroup-3-2] DEBUG c.c.socketio.handler.ClientHead - binding channel: [id: 0xd4f04305, L:/127.0.0.1:8082 - R:/127.0.0.1:45590] to transport: POLLING
20:26:56.001 [] [nioEventLoopGroup-3-1] DEBUG i.n.h.c.h.w.WebSocket08FrameDecoder - Decoding WebSocket Frame opCode=1
20:26:56.001 [] [nioEventLoopGroup-3-1] DEBUG i.n.h.c.h.w.WebSocket08FrameDecoder - Decoding WebSocket Frame length=8
20:26:56.004 [] [nioEventLoopGroup-3-1] TRACE c.c.socketio.handler.InPacketHandler - In message: 2probe sessionId: f9d44344-2079-4814-8268-269dcfd2e5f4
20:26:56.005 [] [nioEventLoopGroup-3-1] TRACE c.c.socketio.handler.EncoderHandler - Out message: 3probe sessionId: f9d44344-2079-4814-8268-269dcfd2e5f4
20:26:56.007 [] [nioEventLoopGroup-3-1] DEBUG i.n.h.c.h.w.WebSocket08FrameEncoder - Encoding WebSocket Frame opCode=1 length=8
20:26:56.012 [] [nioEventLoopGroup-3-2] TRACE c.c.socketio.handler.EncoderHandler - Out message: ^@^A�6 - sessionId: f9d44344-2079-4814-8268-269dcfd2e5f4
20:26:56.268 [] [nioEventLoopGroup-3-1] DEBUG i.n.h.c.h.w.WebSocket08FrameDecoder - Decoding WebSocket Frame opCode=1
20:26:56.268 [] [nioEventLoopGroup-3-1] DEBUG i.n.h.c.h.w.WebSocket08FrameDecoder - Decoding WebSocket Frame length=3
20:26:56.269 [] [nioEventLoopGroup-3-1] TRACE c.c.socketio.handler.InPacketHandler - In message: 5 sessionId: f9d44344-2079-4814-8268-269dcfd2e5f4
20:26:56.269 [] [nioEventLoopGroup-3-1] DEBUG c.c.socketio.handler.ClientHead - Transport upgraded to: WEBSOCKET for: f9d44344-2079-4814-8268-269dcfd2e5f4

This connection failed

23:48:29.546 [] [nioEventLoopGroup-3-1] DEBUG c.c.s.handler.AuthorizeHandler - Handshake authorized for sessionId: 6cf0604a-4fbd-44d2-9a91-4a4c3bf12f30, query params: {EIO=[3], transport=[polling], t=[LFeqz7x]} headers: {Cookie=[Connection=[upgrade], User-Agent=[Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0]}
23:48:29.546 [] [nioEventLoopGroup-3-1] DEBUG c.c.socketio.handler.ClientHead - binding channel: [id: 0x67a23cda, L:/127.0.0.1:8082 - R:/127.0.0.1:46228] to transport: POLLING
23:48:29.547 [] [nioEventLoopGroup-3-1] TRACE c.c.socketio.handler.EncoderHandler - Out message: ^@^A^A^C�0{"sid":"6cf0604a-4fbd-44d2-9a91-4a4c3bf12f30","upgrades":["websocket"],"pingInterval":20000,"pingTimeout":60000} - sessionId: 6cf0604a-4fbd-44d2-9a91-4a4c3bf12f30
23:48:29.931 [] [nioEventLoopGroup-3-2] DEBUG c.c.socketio.handler.ClientHead - binding channel: [id: 0x9d47f8eb, L:/127.0.0.1:8082 - R:/127.0.0.1:46238] to transport: POLLING
23:48:29.933 [] [nioEventLoopGroup-3-2] TRACE c.c.socketio.handler.EncoderHandler - Out message: ^@^B�40 - sessionId: 6cf0604a-4fbd-44d2-9a91-4a4c3bf12f30
23:48:30.076 [] [nioEventLoopGroup-3-1] DEBUG c.c.socketio.handler.ClientHead - binding channel: [id: 0x489e1bde, L:/127.0.0.1:8082 - R:/127.0.0.1:46252] to transport: POLLING
23:48:30.088 [] [nioEventLoopGroup-3-2] DEBUG i.n.h.c.h.w.WebSocketServerHandshaker - [id: 0xf5e0bd95, L:/127.0.0.1:8082 - R:/127.0.0.1:46254] WebSocket version V13 server handshake
23:48:30.090 [] [nioEventLoopGroup-3-2] DEBUG i.n.h.c.h.w.WebSocketServerHandshaker - WebSocket version 13 server handshake key: 4ADLFqvphwKWP20WSvwi4g==, response: Wjln+MI6sA8fUr0q+DXxnmspc24=
23:48:30.092 [] [nioEventLoopGroup-3-2] DEBUG c.c.socketio.handler.ClientHead - binding channel: [id: 0xf5e0bd95, L:/127.0.0.1:8082 - R:/127.0.0.1:46254] to transport: WEBSOCKET
23:48:30.092 [] [nioEventLoopGroup-3-2] DEBUG c.c.s.transport.WebSocketTransport - сlient 6cf0604a-4fbd-44d2-9a91-4a4c3bf12f30 handshake completed
23:48:40.107 [] [pool-4-thread-1] DEBUG c.c.s.transport.WebSocketTransport - client did not complete upgrade - closing transport
23:48:40.108 [] [pool-4-thread-1] DEBUG c.c.s.SocketIOChannelInitializer - Client with sessionId: 6cf0604a-4fbd-44d2-9a91-4a4c3bf12f30 disconnected
23:48:40.108 [] [pool-4-thread-1] DEBUG c.c.s.transport.NamespaceClient - Client 6cf0604a-4fbd-44d2-9a91-4a4c3bf12f30 for namespace  has been disconnected

This server is behind an nginx server and should be configured to pass though upgrade requests, as per the nginx configuration guide. Socke.io client is version 1.4.5, server is 1.7.10

The only difference i could see in the requests was that the failing one came from FireFox, so this may be a client side issue

brockj commented 8 years ago

I think i have been able to reproduce the issue locally using two nginx proxy servers

The first proxy is setup to proxy all connections to the second proxy, but will set the connection header to close (to simulate a corporate proxy server). The second server is setup based on the suggested websocket configuration (send Upgrade if set, else close)

chxiaowu commented 8 years ago

@brockj how did you fixed this problem , i am facing same issue on sit. we also using a nginx to pass all the tcp data to socket-io server.

chxiaowu commented 8 years ago

@brockj by the way , i using 1.7.10 and 1.7.11 both have t his issue, from chrome , client js version is 1.4.5.

brockj commented 8 years ago

@chxiaowu I actually didn't fix it, i just disabled websocket transport on the client to force polling. I still get the odd error, but nowhere near as many

chxiaowu commented 8 years ago

@brockj Maybe need some configuration on nginx should be ok ? because all works fine without nginx tcp porxy...

chxiaowu commented 8 years ago

@brockj better websocket,

chxiaowu commented 8 years ago

@mrniko hi brother , hope you can do something for us ...

chxiaowu commented 8 years ago

@brockj ok on my side now , actually it's configruation mistake.

priviously proxy_timeout is just 3 seconds, that's why always disconnect after 3 seconds, now i change to 10 minutes , and in 10 minutes i think should be have heartbeat data at least , can ensure the connect alive. nginx version is large than 1.9.0 which have tcp proxy function.

image

brockj commented 8 years ago

@chxiaowu Good to hear it is working for you What version of nginx are you using exactly? the documentation of the http proxy module doesn't mention a proxy_timeout parameter: http://nginx.org/en/docs/http/ngx_http_proxy_module.html I was setting proxy_read_timeout

are you using this module: http://nginx.org/en/docs/stream/ngx_stream_proxy_module.html

chxiaowu commented 8 years ago

@brockj yes using stream proxy, proxy all tcp data.
image

AntonLiashenka commented 8 years ago

what about apache httpd?

jackmiking commented 7 years ago

you should update your socket io script version to 1.5.0 or higher.download page is here https://github.com/socketio/socket.io-client/releases. you also can use cdn from http://www.bootcdn.cn/socket.io/ .that will fix your problem.

ROBINCHEN8899 commented 6 years ago

@chxiaowu @brockj HI,how the issue fixed at last?

godtrack commented 5 years ago

socket.io-2.2.0/netty-socketio-1.7.17 This problem still exists, even when debugging locally, and there is no nginx agent !!!