tg123 / websockify-nginx-module

Embed websockify into Nginx (convert any tcp connection into websocket)
MIT License

websockify module actively closes the connections to both the frontend and the backend #17

Closed vincenthcui closed 5 years ago

vincenthcui commented 5 years ago

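After connecting to a backend machine through the websockify module, the connection drops after some time. We captured TCP packets on the nginx host and found that nginx itself sends the FIN packets and closes the connections to both the frontend and the backend, but the reason for the disconnect is still unclear.

Below is the flow graph from the packet capture. At 09:17:03, FIN packets are sent to end the connections on both sides. 10.0.x.x is the NGINX host, 106.52.x.x is the backend VNC server, and 172.16.x.x is the traffic coming from the frontend.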

[image: packet capture flow graph]

We tried enabling nginx keep-alive in the upstream block, but it made no difference.

tg123 commented 5 years ago

Please post the nginx log at debug level.

vincenthcui commented 5 years ago

@tg123 I re-ran the test and there is an "upstream timed out" error. I can email you the full log if you need it.

2019/07/16 10:39:05 [debug] 6#6: *1 event timer del: 12: 1563248345053
2019/07/16 10:39:05 [debug] 6#6: *1 http upstream request: "/websockify?ip=139.199.x.x&port=31266"
2019/07/16 10:39:05 [debug] 6#6: *1 http upstream process upgraded, fu:1
2019/07/16 10:39:05 [info] 6#6: *1 upstream timed out (110: Operation timed out) while proxying upgraded connection, client: 172.16.16.15, server: x.x.com, request: "GET /websockify?ip=139.199.x.x&port=31266 HTTP/1.1", upstream: "websockify://139.199.x.x", host: "vnc-ttlab.cloud.tencent.com"
2019/07/16 10:39:05 [debug] 6#6: *1 finalize http upstream request: 504
2019/07/16 10:39:05 [debug] 6#6: *1 free rr peer 1 0
2019/07/16 10:39:05 [debug] 6#6: *1 close http upstream connection: 12
2019/07/16 10:39:05 [debug] 6#6: *1 free: 00007FC88B757740, unused: 48
2019/07/16 10:39:05 [debug] 6#6: *1 reusable connection: 0
2019/07/16 10:39:05 [debug] 6#6: *1 http output filter "/websockify?ip=139.199.x.x&port=31266"
2019/07/16 10:39:05 [debug] 6#6: *1 http copy filter: "/websockify?ip=139.199.x.x&port=31266"
2019/07/16 10:39:05 [debug] 6#6: *1 lua capture body filter, uri "/websockify"
2019/07/16 10:39:05 [debug] 6#6: *1 http postpone filter "/websockify?ip=139.199.x.x&port=31266" 00007FFFB23125E0
2019/07/16 10:39:05 [debug] 6#6: *1 write new buf t:0 f:0 0000000000000000, pos 0000000000000000, size: 0 file: 0, size: 0
2019/07/16 10:39:05 [debug] 6#6: *1 http write filter: l:0 f:1 s:0
2019/07/16 10:39:05 [debug] 6#6: *1 http copy filter: 0 "/websockify?ip=139.199.x.x&port=31266"
2019/07/16 10:39:05 [debug] 6#6: *1 http finalize request: 0, "/websockify?ip=139.199.x.x&port=31266" a:1, c:1
2019/07/16 10:39:05 [debug] 6#6: *1 http request count:1 blk:0
2019/07/16 10:39:05 [debug] 6#6: *1 http close request
2019/07/16 10:39:05 [debug] 6#6: *1 lua request cleanup: forcible=0
2019/07/16 10:39:05 [debug] 6#6: *1 http log handler
2019/07/16 10:39:05 [debug] 6#6: *1 free: 00007FC88B7D07E0
2019/07/16 10:39:05 [debug] 6#6: *1 free: 00007FC88B7C07C0
2019/07/16 10:39:05 [debug] 6#6: *1 free: 00007FC88B7B07A0
2019/07/16 10:39:05 [debug] 6#6: *1 free: 00007FC88B7A0780
2019/07/16 10:39:05 [debug] 6#6: *1 free: 00007FC88B739DC0, unused: 2
2019/07/16 10:39:05 [debug] 6#6: *1 free: 00007FC88B72EFE0, unused: 32
2019/07/16 10:39:05 [debug] 6#6: *1 free: 00007FC88B732020, unused: 3111
2019/07/16 10:39:05 [debug] 6#6: *1 close http connection: 3
2019/07/16 10:39:05 [debug] 6#6: *1 reusable connection: 0
2019/07/16 10:39:05 [debug] 6#6: *1 free: 00007FC88B730000
2019/07/16 10:39:05 [debug] 6#6: *1 free: 00007FC88B729FC0
2019/07/16 10:39:05 [debug] 6#6: *1 free: 00007FC88B756380, unused: 24

vincenthcui commented 5 years ago

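I confirmed that it is a timeout on the connection to the backend. The heartbeat interval here is longer than one minute, so nginx actively closes both the upstream and downstream connections. This can be controlled by changing three values: websockify_read_timeout, websockify_connect_timeout, and websockify_send_timeout.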
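
For reference, a minimal sketch of how those three directives might be set in a location block. The backend address, the `websockify_pass` line, and the 180s values are assumptions for illustration, not the configuration used in this issue; check the module README for the exact directive contexts and defaults.

```nginx
# Sketch only: backend address and timeout values are illustrative assumptions.
location /websockify {
    # Hypothetical VNC backend; replace with the real target.
    websockify_pass 127.0.0.1:5901;

    # Raise the idle timeouts above the backend's heartbeat interval.
    # The thread reports a default of 60s for these.
    websockify_read_timeout    180s;
    websockify_send_timeout    180s;

    # Connection setup rarely needs more time; left at the reported default.
    websockify_connect_timeout 60s;
}
```

The idea is to keep the read and send timeouts longer than the longest quiet period expected on the VNC session, at the cost of holding dead connections open longer if the backend really goes down.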

tg123 commented 5 years ago

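If it is an upstream timeout, you can try adjusting the timeout parameters. Judging from your log, is this running over the public internet?

If so, the disconnect should not be caused by the module.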

vincenthcui commented 5 years ago

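noVNC does not document any heartbeat either, but the default parameters still make disconnects likely. Would it be better to raise the default timeout to 3 min? Most VNC desktops display the system time, and the clock updates would keep the connection from being dropped.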

vincenthcui commented 5 years ago

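Yes, it is over the public internet, and it is indeed not caused by the module. Still, I think the 60s default timeout is too conservative and tends to cause this kind of problem.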

tg123 commented 5 years ago

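This module is used in aliyun VNC. Personally I think 3 minutes is too long: failed connections can pile up, especially with a large number of users, if the upstream really goes down.

In theory, as long as there is ongoing activity, such as moving the mouse or a blinking cursor, no disconnect will occur.

Feel free to reopen if you would like to continue the discussion.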