XTLS / Xray-core

Xray, Penetrates Everything. Also the best v2ray-core, with XTLS support. Fully compatible configuration.
https://t.me/projectXray
Mozilla Public License 2.0

TCP timeout 10 minutes #1671

Closed: GektorUA closed this issue 3 months ago

GektorUA commented 1 year ago

I have set up xtls-vision with fallback to nginx (mostly following this example: https://github.com/XTLS/Xray-examples/tree/main/All-in-One-fallbacks-Nginx , with the unused settings deleted). When I listen to an online HTTPS TCP radio stream directly from a site, I get disconnected every 600 seconds. I have tried different online radios and they all behave the same: after 600 seconds the stream gets stuck. I have tried editing the level settings for the stream:

"levels": {
  "0": {
    "handshake": 5,
    "connIdle": 300,
    "uplinkOnly": 2,
    "downlinkOnly": 5,
    "statsUserUplink": false,
    "statsUserDownlink": false,
    "bufferSize": 512
  }
}

but nothing helps.

I found only one 600-second timeout in the sources. Could it be related to my problem?

[image: screenshot of the 600-second timeout in the source]

GektorUA commented 1 year ago

Some debug info on disconnection event, from Server:

2023/02/18 21:12:11 [Info] [1339508826] app/proxyman/inbound: connection ends > proxy/vless/inbound: fallback ends > context canceled
2023/02/18 21:12:14 [Info] [489002943] app/proxyman/inbound: connection ends > proxy/vless/inbound: connection ends > context canceled
2023/02/18 21:12:16 [Info] [56190278] app/proxyman/inbound: connection ends > proxy/vless/inbound: connection ends > context canceled
2023/02/18 21:12:24 [Info] [3568841681] app/proxyman/inbound: connection ends > proxy/vless/inbound: connection ends > context canceled
2023/02/18 21:12:31 [Info] [1612909734] proxy/vless/inbound: firstLen = 24
2023/02/18 21:12:31 [Info] [1612909734] proxy/vless/inbound: fallback starts > proxy/vless/encoding: invalid request version
2023/02/18 21:12:31 [Info] [1612909734] proxy/vless/inbound: realName = EDITED
2023/02/18 21:12:31 [Info] [1612909734] proxy/vless/inbound: realAlpn = h2
2023/02/18 21:12:41 [Info] [1612909734] app/proxyman/inbound: connection ends > proxy/vless/inbound: fallback ends > context canceled

Client side:

2023/02/18 21:12:14 [Info] [2784422048] app/proxyman/inbound: connection ends > proxy/socks: connection ends > context canceled
2023/02/18 21:12:28 [Info] [2112564818] app/proxyman/inbound: connection ends > proxy/socks: connection ends > context canceled
2023/02/18 21:13:06 [Info] [3547964952] app/proxyman/inbound: connection ends > proxy/socks: connection ends > context canceled

GektorUA commented 1 year ago

I have set this up with V2Ray, and the 600-second timeout does not occur there. Can somebody explain why Xray has this issue?

RPRX commented 1 year ago

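This is caused by XTLS Vision having Splice enabled. I actually mentioned it to @yuhan6665 before, but it hasn't been fixed yet. It's time to look back at the old emails.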

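Xray-core closes connections that have had no uplink or downlink data for five minutes (the default, which can be changed). Once the downlink is spliced, Xray no longer knows its state and relies on uplink traffic to tell that the connection is still active. In the vast majority of cases this is fine, but when listening to music the uplink can stay inactive for a long time, and Xray will then close the connection.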
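The mechanism described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not Xray-core's actual implementation: `first_idle_close` is a hypothetical helper, time is counted in whole seconds, and the 600-second value is the one observed in this thread rather than a value read from the source.

```python
def first_idle_close(timeout, uplink_times, downlink_times, downlink_visible, horizon):
    """Return the first second at which an idle timer would close the connection.

    downlink_visible=False models a spliced downlink: the data still flows,
    but the (hypothetical) timer never sees it. Returns None if the
    connection survives until `horizon`.
    """
    up, down = set(uplink_times), set(downlink_times)
    last_seen = 0  # the connection is considered active when it opens
    for now in range(horizon + 1):
        active = now in up or (downlink_visible and now in down)
        if active:
            last_seen = now
        elif now - last_seen >= timeout:
            return now
    return None


# A radio stream: one request at t=0, then downlink data every second.
stream = range(0, 1201)
print(first_idle_close(600, [0], stream, True, 1200))   # None
print(first_idle_close(600, [0], stream, False, 1200))  # 600
```

With the downlink visible to the timer the stream survives; with the downlink hidden, the single uplink request at t=0 is the last activity the timer sees, so the connection is closed at the timeout even though data is still flowing.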

GektorUA commented 1 year ago

So it's related only to Splice? But with XTLS Vision we cannot turn Splice off. How can this issue be resolved for Vision? TCP traffic keeps downloading even without any upload; can that be tracked somehow?

cornerot commented 10 months ago

Hello! Any update? Because of this, ssh sessions are terminated after 10 minutes if there is no activity in them

cornerot commented 10 months ago

Can be avoided with ssh keepalive:

Host *
    ServerAliveInterval 60
    ServerAliveCountMax 10

uuonda commented 2 months ago

I'm experiencing this issue on Xray v1.8.13.

I use a complex chaining setup, and a TCP connection over dokodemo gets closed exactly after 600 seconds. None of the timeout options help. XTLS Vision is not in the chain.

2024/05/22 03:31:39 [Info] [5088895089] app/proxyman/inbound: connection ends > proxy/dokodemo: connection ends > context canceled
Connection to 127.0.0.1 closed by remote host.
Transferred: sent 3152, received 3416 bytes, in 599.2 seconds

Not every protocol supports "pings" or keepalive features. Is there a proper fix?

GektorUA commented 1 month ago

It's a long-standing issue and there is no fix for now.

yuhan6665 commented 1 month ago

@uuonda note that anything in the path can cut your TCP connection for inactivity; keepalive is a must

uuonda commented 1 month ago

@yuhan6665 Connection is dropped by Xray itself. Exactly after 600 seconds of inactivity. It's very easy to reproduce with the most basic setup of VLESS-WS and dokodemo.

Also, I tried using standalone tools instead of dokodemo and they hold the connection over Xray inbound without issues.

> keepalive is a must

Unfortunately, many protocols don't have such features.

uuonda commented 1 month ago

Would someone please take a look at this? I tried every timeout option I could find and Xray still kills the connection exactly after 600 seconds. Where is this limit coming from?

Here is a sample config that reproduces the issue. No caddy, no nginx, no CDN.

server.json

```json
{
  "log": { "loglevel": "debug" },
  "routing": { "rules": [], "domainStrategy": "AsIs" },
  "inbounds": [
    {
      "tag": "vless_ws",
      "listen": "34.171.19.201",
      "port": 443,
      "protocol": "vless",
      "settings": {
        "clients": [
          { "id": "72e23c41-01e0-4b8b-a2e6-2c737aff03e4", "email": "vless@xray" }
        ],
        "decryption": "none"
      },
      "streamSettings": {
        "network": "ws",
        "security": "tls",
        "wsSettings": { "path": "/ws" },
        "tlsSettings": {
          "serverName": "hatshop.club",
          "certificates": [
            { "certificateFile": "cert.pem", "keyFile": "cert.key" }
          ]
        }
      }
    }
  ],
  "outbounds": [
    { "protocol": "freedom", "tag": "direct", "level": 0 },
    { "protocol": "blackhole", "tag": "block" }
  ],
  "policy": {
    "levels": {
      "0": { "connIdle": 999999 }
    }
  }
}
```

client.json

```json
{
  "log": { "loglevel": "debug" },
  "inbounds": [
    {
      "tag": "port_ssh",
      "listen": "127.0.0.1",
      "port": 22,
      "protocol": "dokodemo-door",
      "settings": {
        "address": "127.0.0.1",
        "port": 22,
        "timeout": 99999,
        "network": "tcp"
      }
    }
  ],
  "outbounds": [
    {
      "tag": "vless_ws",
      "protocol": "vless",
      "settings": {
        "vnext": [
          {
            "address": "34.171.19.201",
            "port": 443,
            "users": [
              { "id": "72e23c41-01e0-4b8b-a2e6-2c737aff03e4", "encryption": "none" }
            ]
          }
        ]
      },
      "streamSettings": {
        "network": "ws",
        "security": "tls",
        "wsSettings": { "path": "/ws?ed=2048" },
        "tlsSettings": {
          "allowInsecure": false,
          "serverName": "hatshop.club",
          "fingerprint": "chrome"
        }
      }
    },
    { "tag": "blocked", "protocol": "blackhole", "settings": {} }
  ],
  "routing": { "rules": [], "domainStrategy": "AsIs" }
}
```

Server log

```
Xray 1.8.13 (Xray, Penetrates Everything.) 3120ca4 (go1.22.3 linux/amd64)
A unified platform for anti-censorship.
2024/05/25 05:50:03 [Info] infra/conf/serial: Reading config: server.json
2024/05/25 05:50:03 [Debug] app/log: Logger started
2024/05/25 05:50:03 [Debug] app/proxyman/inbound: creating stream worker on 34.171.19.201:443
2024/05/25 05:50:03 [Info] transport/internet/websocket: listening TCP(for WS) on 34.171.19.201:443
2024/05/25 05:50:03 [Warning] core: Xray 1.8.13 started
2024/05/25 05:50:22 [Info] [3351068346] proxy/vless/inbound: firstLen = 66
2024/05/25 05:50:22 [Info] [3351068346] proxy/vless/inbound: received request for tcp:127.0.0.1:22
2024/05/25 05:50:22 [Info] [3351068346] app/dispatcher: default route for tcp:127.0.0.1:22
2024/05/25 05:50:22 [Info] [3351068346] transport/internet/tcp: dialing TCP to tcp:127.0.0.1:22
2024/05/25 05:50:22 [Debug] transport/internet: dialing to tcp:127.0.0.1:22
2024/05/25 05:50:22 159.201.104.17:37844 accepted tcp:127.0.0.1:22 [vless_ws >> direct] email: vless@xray
2024/05/25 05:50:22 [Info] [3351068346] proxy/freedom: connection opened to tcp:127.0.0.1:22, local endpoint 127.0.0.1:52238, remote endpoint 127.0.0.1:22
2024/05/25 05:50:22 [Info] [3351068346] proxy: CopyRawConn readv
2024/05/26 06:00:22 [Info] [3351068346] app/proxyman/inbound: connection ends > proxy/vless/inbound: connection ends > proxy/vless/inbound: failed to transfer request payload > websocket: close 1000 (normal)
```

Client log

```
Xray 1.8.13 (Xray, Penetrates Everything.) 3120ca4 (go1.22.3 linux/amd64)
A unified platform for anti-censorship.
2024/05/25 05:50:15 [Info] infra/conf/serial: Reading config: client.json
2024/05/25 05:50:15 [Debug] app/log: Logger started
2024/05/25 05:50:15 [Debug] app/proxyman/inbound: creating stream worker on 127.0.0.1:22
2024/05/25 05:50:15 [Info] transport/internet/tcp: listening TCP on 127.0.0.1:22
2024/05/25 05:50:15 [Warning] core: Xray 1.8.13 started
2024/05/25 05:50:22 [Debug] [2768448714] proxy/dokodemo: processing connection from: 127.0.0.1:43620
2024/05/25 05:50:22 [Info] [2768448714] proxy/dokodemo: received request for 127.0.0.1:43620
2024/05/25 05:50:22 [Info] [2768448714] app/dispatcher: default route for tcp:127.0.0.1:22
2024/05/25 05:50:22 [Info] [2768448714] transport/internet/websocket: creating connection to tcp:34.171.19.201:443
2024/05/25 05:50:22 [Info] [2768448714] proxy/vless/outbound: tunneling request to tcp:127.0.0.1:22 via 34.171.19.201:443
2024/05/25 05:50:22 127.0.0.1:43620 accepted tcp:127.0.0.1:22 [port_ssh >> vless_ws]
2024/05/25 05:50:22 [Debug] transport/internet: dialing to tcp:34.171.19.201:443
2024/05/25 06:00:22 [Info] [2768448714] app/proxyman/inbound: connection ends > proxy/dokodemo: connection ends > context canceled
```

SSH

```
$ ssh onda@127.0.0.1 -p 22
Last login: Sat May 25 02:50:23 2024 from 127.0.0.1
$ Connection to 127.0.0.1 closed by remote host.
Connection to 127.0.0.1 closed.
```

uuonda commented 1 month ago

@RPRX @Fangliding I would appreciate it if someone would take a look at this or at least reopen the issue.

2024/05/25 05:50:22 [Info] [3351068346] proxy/vless/inbound: received request for tcp:127.0.0.1:22
...
2024/05/26 06:00:22 [Info] [3351068346] app/proxyman/inbound: connection ends > proxy/vless/inbound: connection ends > proxy/vless/inbound: failed to transfer request payload > websocket: close 1000 (normal)

websocket: close 1000 (normal)

You can see Xray gracefully closes the connection exactly after 10 minutes for reasons unknown.
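The 10-minute figure can be checked directly from log timestamps. The sketch below is an illustration only: `log_time` and `lifetime_seconds` are hypothetical helpers, and it uses the two client log lines posted in this thread. The final server log entry is dated 2024/05/26 as posted, which would give 24 hours and 10 minutes instead, so the client pair is used here.

```python
from datetime import datetime

LOG_TIME_FORMAT = "%Y/%m/%d %H:%M:%S"


def log_time(line: str) -> datetime:
    """Parse the leading 'YYYY/MM/DD HH:MM:SS' timestamp (19 characters) of a log line."""
    return datetime.strptime(line[:19], LOG_TIME_FORMAT)


def lifetime_seconds(first_line: str, last_line: str) -> float:
    """Seconds elapsed between two log lines."""
    return (log_time(last_line) - log_time(first_line)).total_seconds()


accepted = "2024/05/25 05:50:22 127.0.0.1:43620 accepted tcp:127.0.0.1:22 [port_ssh >> vless_ws]"
ended = ("2024/05/25 06:00:22 [Info] [2768448714] app/proxyman/inbound: "
         "connection ends > proxy/dokodemo: connection ends > context canceled")

print(lifetime_seconds(accepted, ended))  # 600.0
```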

https://github.com/XTLS/Xray-core/blob/0a3c449cdf16f8dbd1f3823621f2a22806c43677/features/policy/default.go#L19

This one seems unrelated; it had no effect when I changed it.

mmmray commented 1 month ago

@uuonda yuhan's argument is that any protocol that does not implement keepalive will not work over a real network anyway, and xray is not trying to make those protocols work.

> Unfortunately, many protocols don't have such features.

which protocols are those? do they work on any network at all? it seems to me those protocols would be inherently broken on anything other than loopback. note that a solution for SSH keepalive has already been posted.

and which network do you use to test your xray config? the networks that I know of kill idle TCP-without-keepalive much earlier than 10 minutes anyway.

uuonda commented 1 month ago

@mmmray I'm fairly certain this is an Xray issue. I can run a keepalive enabled wireguard tunnel over VLESS if you insist. Any dokodemo connection over that tunnel would still be killed by Xray exactly after 10 minutes of inactivity. A tunnel itself would not log any disconnects.

This is a "no data for 10 minutes" timeout. TCP level keepalive has nothing to do with this. SSH has that enabled by default. My connections over the same network to the same server run indefinitely without any extra options. Just not over Xray though.
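For reference, this is roughly what TCP-level keepalive looks like from an application's side. It is a minimal sketch, assuming a Linux-like platform for the tuning options; `enable_tcp_keepalive` and its default values are hypothetical. The probes are generated and answered by the kernel and carry no application payload, so a relay that measures idleness by payload bytes would not see them.

```python
import socket


def enable_tcp_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on kernel-level TCP keepalive for one socket.

    idle/interval/count are illustrative values, not recommendations.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The tuning knobs are platform-specific, so only set the ones that exist.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)


s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_tcp_keepalive(s)
print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
s.close()
```

Note also that TCP keepalive applies to a single TCP connection, so with a proxy in the path it covers only the hop between the application and the proxy.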

uuonda commented 1 month ago

Could anyone reopen this, please? Or should I open a new issue? I've posted the most basic setup configs that reproduce this.

Fangliding commented 1 month ago

I think it is already clear from the previous discussion that we cannot keep an inactive connection indefinitely and must have a timeout mechanism

uuonda commented 1 month ago

What about configuring this timeout? There is connIdle and the dokodemo timeout, but they have no effect on this. Isn't that a bug?

mmmray commented 2 weeks ago

> This is a "no data for 10 minutes" timeout. TCP level keepalive has nothing to do with this. SSH has that enabled by default. My connections over the same network to the same server run indefinitely without any extra options. Just not over Xray though.

I think you are missing the point of what I'm saying entirely. I am not talking about TCP keepalive either, I am talking about app-level keepalive. I expect most protocols to have this, and so far nobody mentioned a protocol that doesn't (including SSH). The fix for your "most basic setup" was posted here
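App-level keepalive, as opposed to TCP keepalive, means the application itself periodically sends a small message through the connection, so every hop in the path sees real payload. The following is a minimal sketch, not taken from any specific protocol: `start_heartbeat` and the one-byte payload are hypothetical, and it only works when both ends agree to treat that message as a no-op.

```python
import socket
import threading


def start_heartbeat(sock, interval, payload=b"\x00"):
    """Send `payload` every `interval` seconds until the returned event is set.

    The payload is a placeholder; a real protocol needs its own no-op message
    that the peer knows how to ignore.
    """
    stop = threading.Event()

    def run():
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(interval):
            try:
                sock.sendall(payload)
            except OSError:
                break  # the connection is gone, so stop sending

    threading.Thread(target=run, daemon=True).start()
    return stop


# Local demonstration over a connected socket pair.
a, b = socket.socketpair()
stop = start_heartbeat(a, 0.05)
b.settimeout(2)
print(b.recv(16)[:1] == b"\x00")
stop.set()
a.close()
b.close()
```

This sketch sends unconditionally and does not coordinate with other writers on the same socket; a real implementation would typically send only when the connection has been idle.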

uuonda commented 2 weeks ago

@mmmray

> I am talking about app-level keepalive

So am I. Where is a keepalive option in telnet?