fatedier / frp

A fast reverse proxy to help you expose a local server behind a NAT or firewall to the internet.
Apache License 2.0
85.34k stars 13.24k forks source link

frpc断开后一直重连失败 #3042

Closed AinzLimuru closed 2 years ago

AinzLimuru commented 2 years ago

Bug Description

frps位于电信公网环境,frpc位于移动内网环境。用于映射qbittorrent的做种端口,当连接数达到150左右时会发生卡顿。然后会出现无法连接frpc的服务器的问题。frps的dashboard显示连接正常,但是连接数归零。frpc显示重连时。

frpc Version

0.44.0

frps Version

0.44.0

System Architecture

Linux(Ubuntu Desktop 22.04)/amd64

Configurations

[common] server_addr = 域名 server_port = 端口 token = token

[qbittorrent_tcp_port] type = tcp local_ip = 127.0.0.1 local_port = 12824 remote_port = 12824

其余的端口配置与qbittorrent_tcp_port相同仅修改端口

Logs

以下为frpc日志

先是大量出现这种日志 8月 01 13:58:29 server frpc[31157]: 2022/08/01 13:58:29 [E] [control.go:159] [6690e36d41d7fa64] work connection closed before response StartWorkConn message: EOF 8月 01 13:58:29 server frpc[31157]: 2022/08/01 13:58:29 [E] [control.go:159] [6690e36d41d7fa64] work connection closed before response StartWorkConn message: EOF 8月 01 13:58:29 server frpc[31157]: 2022/08/01 13:58:29 [E] [control.go:159] [6690e36d41d7fa64] work connection closed before response StartWorkConn message: EOF 8月 01 13:58:29 server frpc[31157]: 2022/08/01 13:58:29 [E] [control.go:159] [6690e36d41d7fa64] work connection closed before response StartWorkConn message: EOF 8月 01 13:58:29 server frpc[31157]: 2022/08/01 13:58:29 [E] [control.go:159] [6690e36d41d7fa64] work connection closed before response StartWorkConn message: EOF 8月 01 13:58:29 server frpc[31157]: 2022/08/01 13:58:29 [E] [control.go:159] [6690e36d41d7fa64] work connection closed before response StartWorkConn message: EOF 8月 01 13:58:29 server frpc[31157]: 2022/08/01 13:58:29 [E] [control.go:159] [6690e36d41d7fa64] work connection closed before response StartWorkConn message: EOF 8月 01 13:58:29 server frpc[31157]: 2022/08/01 13:58:29 [E] [control.go:159] [6690e36d41d7fa64] work connection closed before response StartWorkConn message: EOF

然后开始大量出现这种日志,上面的日志不再出现 8月 01 13:12:07 server frpc[31157]: 2022/08/01 13:12:07 [W] [service.go:204] [6690e36d41d7fa64] reconnect to server error: i/o deadline reached, wait 20s for another retry 8月 01 13:12:26 server frpc[31157]: 2022/08/01 13:12:26 [I] [service.go:201] [6690e36d41d7fa64] try to reconnect to server... 8月 01 13:12:36 server frpc[31157]: 2022/08/01 13:12:36 [W] [service.go:204] [6690e36d41d7fa64] reconnect to server error: i/o deadline reached, wait 20s for another retry 8月 01 13:12:57 server frpc[31157]: 2022/08/01 13:12:57 [I] [service.go:201] [6690e36d41d7fa64] try to reconnect to server...

以下为frps日志

8月 01 13:40:00 client frps[328105]: 2022/08/01 13:40:00 [I] [control.go:307] [a1783cc9cbcb07da] control writer is closing 8月 01 13:40:00 client frps[328105]: 2022/08/01 13:40:00 [I] [control.go:396] [a1783cc9cbcb07da] client exit success 以上两条日志在出现在每次发生重连问题之前,然后开始出现重连失败的问题。后续重连的日志中不再产生该日志。

8月 01 13:40:01 client frps[328105]: 2022/08/01 13:40:01 [I] [proxy.go:179] [6690e36d41d7fa64] [qbittorrent_tcp_port] get a user connection [...:] 8月 01 13:40:01 client frps[328105]: 2022/08/01 13:40:01 [I] [service.go:450] [dda4ef8704458f6a] client login info: ip [...:] version [0.38.0] hostname [] os [linux] arch > ........ 8月 01 13:43:11 client frps[328105]: 2022/08/01 13:43:11 [I] [service.go:450] [6690e36d41d7fa64] client login info: ip [...:***] version [0.44.0] hostname [] os [linux] arch> 8月 01 13:43:11 client frps[328105]: 2022/08/01 13:43:11 [I] [control.go:281] [6690e36d41d7fa64] Replaced by client [6690e36d41d7fa64]

期间依然有大量的get a user connection的日志输出,端口是qb的做种端口

Steps to reproduce

  1. 按照配置正常完成qbittorrent端口的穿透
  2. iptables和socat配合进行种子的tracker的转发(不然向流量不会经过frps的服务器,上报的地址为frpc服务器的地址,其他的bt客户端就不会尝试连接frps对应的端口而是直接连接frpc的服务器。这里比较麻烦,复现难度较大)

可能的诱发原因是qbittorrent穿透的tcp端口的大量传入连接(但也才200左右)

1134 和这个几乎一模一样

运行frpc的服务器存在两个frpc的service,指向同一个frps。一个是只负责ssh,一个是负责除ssh之外的内容。出现问题的是第二个。当发生这个问题是,ssh对应的frpc完全不受影响。

恢复的过程是重启frps。无需重启frpc且重启frpc无效。

Affected area

AinzLimuru commented 2 years ago

个人感觉像是连接突然断开,然后frpc尝试重连的过程中,frps的数据没有完全更新(猜测),因此导致登陆超时。因为在重启frps之后服务立刻恢复正常。frpc所处的非公网移动网络出口ip可能会不定时发生变化。

fatedier commented 2 years ago

i/o deadline reached 这个错误日志表示是网络问题,和 frps 没有关系,只和当时的网络/连接有关联。

linux 下可以用 ss -antp |grep {frps ip} 来看一下当时的网络链接状态。

以及进行 tcpdump 抓包,看为什么会 io 超时。

AinzLimuru commented 2 years ago

个人感觉不是网络问题,因为frpc所在的服务器用基本相同的systemd的方式运行了两个frpc,正如上面提到的一样。那么这两个frpc所处的网络环境应该是几乎一致的。但是却出现了ssh对应的frpc的连接非常稳定,但是另一个frpc的连接经常断开且无法重连的问题。如果真的是网络问题的话,应该另一个frpc也出现问题,但是这样的问题一次都没有发生过。

AinzLimuru commented 2 years ago

之前有排查到过是openfile设置的问题,但是显示的的是too many open file的报错和这个错误应该没有关系,然后现在已经把open file 的约束调整到65535了,系统,用户,进程三个角度都确认过已经修改成功了。现在确实不再显示too many open file这个报错了。 不知道是不是这个问题的一些遗留问题。

AinzLimuru commented 2 years ago

目前重启后暂时工作正常,崩溃后我会尝试通过您提到的方法进行抓包检查

AinzLimuru commented 2 years ago

tcpdump抓自frpc服务器 脱敏处理,9.9.9.9代表frps服务器, 192.168.0.2代表frpc服务器 16:14:15.741757 IP 9.9.9.9.10086 > 192.168.0.2.49562: Flags [.], ack 1888, win 2323, options [nop,nop,TS val 2701991619 ecr 2778801468], length 0 16:14:15.742843 IP 9.9.9.9.10086 > 192.168.0.2.49562: Flags [.], ack 1900, win 2323, options [nop,nop,TS val 2701991620 ecr 2778801469], length 0 16:14:15.742843 IP 9.9.9.9.10086 > 192.168.0.2.49562: Flags [.], ack 2012, win 2323, options [nop,nop,TS val 2701991621 ecr 2778801469], length 0 16:14:15.742844 IP 9.9.9.9.10086 > 192.168.0.2.49562: Flags [.], ack 2048, win 2323, options [nop,nop,TS val 2701991621 ecr 2778801470], length 0 16:14:15.743823 IP 9.9.9.9.10086 > 192.168.0.2.49562: Flags [.], ack 2172, win 2323, options [nop,nop,TS val 2701991622 ecr 2778801471], length 0 16:14:15.743824 IP 9.9.9.9.10086 > 192.168.0.2.49562: Flags [.], ack 2296, win 2323, options [nop,nop,TS val 2701991623 ecr 2778801472], length 0 16:14:15.744826 IP 9.9.9.9.10086 > 192.168.0.2.49562: Flags [P.], seq 1:13, ack 2296, win 2323, options [nop,nop,TS val 2701991624 ecr 2778801472], length 12 16:14:15.744826 IP 9.9.9.9.10086 > 192.168.0.2.49562: Flags [P.], seq 13:65, ack 2296, win 2323, options [nop,nop,TS val 2701991624 ecr 2778801472], length 52 16:14:15.744886 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [.], ack 65, win 6266, options [nop,nop,TS val 2778801543 ecr 2701991624], length 0 16:14:15.746823 IP 9.9.9.9.10086 > 192.168.0.2.49562: Flags [P.], seq 65:77, ack 2296, win 2323, options [nop,nop,TS val 2701991627 ecr 2778801472], length 12 16:14:15.746823 IP 9.9.9.9.10086 > 192.168.0.2.49562: Flags [P.], seq 77:129, ack 2296, win 2323, options [nop,nop,TS val 2701991627 ecr 2778801472], length 52 16:14:15.746872 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [.], ack 129, win 6266, options [nop,nop,TS val 2778801545 ecr 2701991627], length 0 16:14:15.747086 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 2296:2308, ack 129, win 6266, options [nop,nop,TS val 2778801545 ecr 2701991627], length 12 16:14:15.747098 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 2308:2360, ack 129, win 6266, options [nop,nop,TS val 2778801545 ecr 2701991627], length 52 16:14:15.769083 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 2360:2372, ack 129, win 6266, options [nop,nop,TS val 2778801567 ecr 2701991627], length 12 16:14:15.769097 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 2372:2600, ack 129, win 6266, options [nop,nop,TS val 2778801567 ecr 2701991627], length 228 16:14:15.769183 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 2600:2612, ack 129, win 6266, options [nop,nop,TS val 2778801567 ecr 2701991627], length 12 16:14:15.769198 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 2612:2776, ack 129, win 6266, options [nop,nop,TS val 2778801567 ecr 2701991627], length 164 16:14:15.769275 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 2776:2788, ack 129, win 6266, options [nop,nop,TS val 2778801567 ecr 2701991627], length 12 16:14:15.769288 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 2788:2968, ack 129, win 6266, options [nop,nop,TS val 2778801567 ecr 2701991627], length 180 16:14:15.769368 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 2968:2980, ack 129, win 6266, options [nop,nop,TS val 2778801567 ecr 2701991627], length 12 16:14:15.769382 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 2980:3112, ack 129, win 6266, options [nop,nop,TS val 2778801567 ecr 2701991627], length 132 16:14:15.769447 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 3112:3124, ack 129, win 6266, options [nop,nop,TS val 2778801567 ecr 2701991627], length 12 16:14:15.769460 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 3124:3272, ack 129, win 6266, options [nop,nop,TS val 2778801567 ecr 2701991627], length 148 16:14:15.769503 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 3272:3284, ack 129, win 6266, options [nop,nop,TS val 2778801567 ecr 2701991627], length 12 16:14:15.769514 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 3284:3368, ack 129, win 6266, options [nop,nop,TS val 2778801567 ecr 2701991627], length 84 16:14:15.817024 IP 9.9.9.9.10086 > 192.168.0.2.49562: Flags [.], ack 2308, win 2323, options [nop,nop,TS val 2701991696 ecr 2778801545], length 0 16:14:15.817025 IP 9.9.9.9.10086 > 192.168.0.2.49562: Flags [.], ack 2360, win 2323, options [nop,nop,TS val 2701991696 ecr 2778801545], length 0 16:14:15.818800 IP 9.9.9.9.10086 > 192.168.0.2.49562: Flags [P.], seq 129:141, ack 2360, win 2323, options [nop,nop,TS val 2701991698 ecr 2778801545], length 12 16:14:15.818801 IP 9.9.9.9.10086 > 192.168.0.2.49562: Flags [P.], seq 141:245, ack 2360, win 2323, options [nop,nop,TS val 2701991698 ecr 2778801545], length 104 16:14:15.818861 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [.], ack 245, win 6266, options [nop,nop,TS val 2778801617 ecr 2701991698], length 0 16:14:15.819857 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 3368:3380, ack 245, win 6266, options [nop,nop,TS val 2778801618 ecr 2701991698], length 12 16:14:15.819870 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 3380:3432, ack 245, win 6266, options [nop,nop,TS val 2778801618 ecr 2701991698], length 52 16:14:15.821843 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 3432:3444, ack 245, win 6266, options [nop,nop,TS val 2778801620 ecr 2701991698], length 12 16:14:15.821857 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 3444:3496, ack 245, win 6266, options [nop,nop,TS val 2778801620 ecr 2701991698], length 52 16:14:15.823115 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 3496:3508, ack 245, win 6266, options [nop,nop,TS val 2778801621 ecr 2701991698], length 12 16:14:15.823128 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 3508:3576, ack 245, win 6266, options [nop,nop,TS val 2778801621 ecr 2701991698], length 68 16:14:15.823208 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 3576:3588, ack 245, win 6266, options [nop,nop,TS val 2778801621 ecr 2701991698], length 12 16:14:15.823221 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 3588:3640, ack 245, win 6266, options [nop,nop,TS val 2778801621 ecr 2701991698], length 52 16:14:15.824565 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 3640:3652, ack 245, win 6266, options [nop,nop,TS val 2778801622 ecr 2701991698], length 12 16:14:15.824580 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 3652:3704, ack 245, win 6266, options [nop,nop,TS val 2778801622 ecr 2701991698], length 52 16:14:15.824716 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 3704:3716, ack 245, win 6266, options [nop,nop,TS val 2778801622 ecr 2701991698], length 12 16:14:15.824729 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 3716:3752, ack 245, win 6266, options [nop,nop,TS val 2778801622 ecr 2701991698], length 36 16:14:15.825594 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 3752:3764, ack 245, win 6266, options [nop,nop,TS val 2778801623 ecr 2701991698], length 12 16:14:15.825608 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 3764:3864, ack 245, win 6266, options [nop,nop,TS val 2778801623 ecr 2701991698], length 100 16:14:15.825645 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 3864:3876, ack 245, win 6266, options [nop,nop,TS val 2778801623 ecr 2701991698], length 12 16:14:15.825656 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 3876:3912, ack 245, win 6266, options [nop,nop,TS val 2778801623 ecr 2701991698], length 36 16:14:15.827541 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 3912:3924, ack 245, win 6266, options [nop,nop,TS val 2778801625 ecr 2701991698], length 12 16:14:15.827555 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 3924:3976, ack 245, win 6266, options [nop,nop,TS val 2778801625 ecr 2701991698], length 52 16:14:15.827665 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 3976:3988, ack 245, win 6266, options [nop,nop,TS val 2778801625 ecr 2701991698], length 12 16:14:15.827678 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 3988:4024, ack 245, win 6266, options [nop,nop,TS val 2778801625 ecr 2701991698], length 36 16:14:15.838782 IP 9.9.9.9.10086 > 192.168.0.2.49562: Flags [.], ack 2372, win 2323, options [nop,nop,TS val 2701991718 ecr 2778801567], length 0 16:14:15.838783 IP 9.9.9.9.10086 > 192.168.0.2.49562: Flags [.], ack 2788, win 2323, options [nop,nop,TS val 2701991718 ecr 2778801567], length 0 16:14:15.838784 IP 9.9.9.9.10086 > 192.168.0.2.49562: Flags [.], ack 3272, win 2323, options [nop,nop,TS val 2701991719 ecr 2778801567], length 0 16:14:15.839780 IP 9.9.9.9.10086 > 192.168.0.2.49562: Flags [.], ack 3368, win 2323, options [nop,nop,TS val 2701991719 ecr 2778801567], length 0 16:14:15.873082 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 4024:4036, ack 245, win 6266, options [nop,nop,TS val 2778801671 ecr 2701991719], length 12 16:14:15.873095 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 4036:4200, ack 245, win 6266, options [nop,nop,TS val 2778801671 ecr 2701991719], length 164 16:14:15.873159 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 4200:4212, ack 245, win 6266, options [nop,nop,TS val 2778801671 ecr 2701991719], length 12 16:14:15.873171 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 4212:4360, ack 245, win 6266, options [nop,nop,TS val 2778801671 ecr 2701991719], length 148 16:14:15.873273 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 4360:4372, ack 245, win 6266, options [nop,nop,TS val 2778801671 ecr 2701991719], length 12 16:14:15.873286 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 4372:4504, ack 245, win 6266, options [nop,nop,TS val 2778801671 ecr 2701991719], length 132 16:14:15.873375 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 4504:4516, ack 245, win 6266, options [nop,nop,TS val 2778801671 ecr 2701991719], length 12 16:14:15.873390 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 4516:4680, ack 245, win 6266, options [nop,nop,TS val 2778801671 ecr 2701991719], length 164 16:14:15.873431 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 4680:4692, ack 245, win 6266, options [nop,nop,TS val 2778801671 ecr 2701991719], length 12 16:14:15.873442 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 4692:4824, ack 245, win 6266, options [nop,nop,TS val 2778801671 ecr 2701991719], length 132 16:14:15.873535 IP 192.168.0.2.49562 > 9.9.9.9.10086: Flags [P.], seq 4824:4836, ack 245, win 6266, options [nop,nop,TS val 2778801671 ecr 2701991719], length 12

ss -antp |grep {frps ip} ESTAB 0 0 192.168.0.2:49640 9.9.9.9:10086
TIME-WAIT 0 0 192.168.0.2:49636 9.9.9.9:10086
TIME-WAIT 0 0 192.168.0.2:49638 9.9.9.9:10086
ESTAB 0 64 192.168.0.2:49562 9.9.9.9:10086

AinzLimuru commented 2 years ago

从结果来看并没有发现超时的原因

AinzLimuru commented 2 years ago

dashboard 显示online frps可以看到重连的日志 8月 01 16:23:07 server frps[410136]: 2022/08/01 16:23:07 [I] [service.go:450] [758ef0a41bd5261f] client login info: ip [...:47918] version [0.44.0] hostname [] os [linux] arch> 8月 01 16:23:07 server frps[410136]: 2022/08/01 16:23:07 [I] [control.go:281] [758ef0a41bd5261f] Replaced by client [758ef0a41bd5261f]

replace前后的ID是相同的,这种情况下会不会有一些数据没有清理?

AinzLimuru commented 2 years ago

等待了十几分钟,frpc尝试重连的二十余次都是报错。然后重启frps,在几秒之内回复了正常。

fatedier commented 2 years ago

本地没法复现这个问题,如果你那边可以稳定复现的话,可以考虑在 frps 的代码里加一些日志调试一下看看。

AinzLimuru commented 2 years ago

复现倒是没有问题,平均一个小时出现一次,我要怎么操作?

AinzLimuru commented 2 years ago

复现的方法找到了,大量做种可以立即复现这个问题,无需等待。 请问要怎么调试?

fatedier commented 2 years ago

修改 frps 代码,增加日志,看 login 时 block 在哪里。

github-actions[bot] commented 2 years ago

Issues go stale after 30d of inactivity. Stale issues rot after an additional 7d of inactivity and eventually close.

DaPoHou commented 2 weeks ago

之前稳定运行几年了,我这两天也突然遇到同样问题,版本是0.36.2 重启frps后即恢复正常。过几个小时又会出现同样问题。 Frpc.log: frpc.log

frps.log: frps.log