SagerNet / sing-box

The universal proxy platform
https://sing-box.sagernet.org/
Other
19.32k stars 2.3k forks source link

Sing-box TPROXY 模式断流 #1620

Closed Rayd62 closed 4 months ago

Rayd62 commented 6 months ago

Operating system

Linux

System version

Client: Linux version 5.15.0-1054-kvm (buildd@bos03-amd64-049) (gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #59-Ubuntu SMP Thu Mar 14 16:03:41 UTC 2024 ---- PVE QEMU VM Server: Linux version 6.1.0-18-amd64 (debian-kernel@lists.debian.org) (gcc-12 (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC Debian 6.1.76-1 (2024-02-01)

Installation type

Original sing-box Command Line

If you are using a graphical client, please provide the version of the client.

No response

Version

sing-box version 1.8.10

Description

客户端使用sing-box Tproxy 模式,使用nft 将流量重定向到tproxy 端口,会发生不定期的断流现象。 服务器端使用sing-box 的hysteria2 协议。 发生故障时,客户端负载并不高,查看PVE 状态和htop 均为发现资源耗尽的情况。

iShot_2024-03-25_23 00 30 iShot_2024-03-25_23 02 51

客户端debug 可以看到如下的一些异常信息

客户端配置如下:

{
  "log": {
    "level": "debug"
  },
  "dns": {
    "servers": [
      {
        "tag": "dns-rfc",
        "address": "tls://1.1.1.1",
        "strategy": "ipv4_only",
        "detour": "proxy-rfc"
      },
      {
        "tag": "dns-cn",
        "address": "223.5.5.5",
        "strategy": "ipv4_only",
        "detour": "direct"
      }
    ],
    "rules": [
      {
        "inbound": "all-in",
        "rule_set": [
          "geosite-steamcn",
          "geosite-apple-cn",
          "category-game-cn",
          "geosite-cn"
        ],
        "server": "dns-cn"
      }
    ],
    "final": "dns-rfc"
  },
  "inbounds": [
    {
      "type": "tproxy",
      "tag": "all-in",
      "listen_port": 12345,
      "udp_fragment": true,
      "sniff": true,
      "sniff_override_destination": true
    }
  ],
  "outbounds": [
    {
      "type": "direct",
      "tag": "direct"
    },
    {
      "type": "dns",
      "tag": "dns-out"
    },
    {
      "type": "block",
      "tag": "block"
    },
    {
      "type": "hysteria2",
      "tag": "proxy-rfc",
      "server": "ip addr",
      "server_port": 443,
      "password": "password-here",
      "tls": {
        "enabled": true,
        "server_name": "server domain",
        "alpn": "h3"
      }
    }
  ],
  "route": {
    "rules": [
      {
        "protocol": "dns",
        "outbound": "dns-out"
      },
      {
        "domain_suffix": [
          "tool.lu",
          "busuu.com",
          "nn.ci"
        ],
        "outbound": "direct"
      },
      {
        "rule_set": [
          "geosite-steamcn",
          "geoip-cn",
          "geosite-cn",
          "geosite-apple-cn",
          "category-game-cn"
        ],
        "outbound": "direct"
      }
    ],
    "rule_set": [
      ....
    ],
    "final": "proxy-rfc",
    "auto_detect_interface": true
  }
}

服务器端配置:

{
  "log": {
    "level": "debug"
  },
  "dns": {
    "servers": [
      {
        "tag": "sg-dns",
        "address": "tls://1.1.1.1",
        "detour": "openai-sg"
      },
      {
        "tag": "hk-dns",
        "address": "tls://8.8.8.8",
        "detour": "direct"
      }
    ],
    "rules": [
      {
        "geosite": "category-ads-all",
        "server": "block",
        "disable_cache": true
      },
      {
        "rule_set": [
          "bm7-reddit",
          "geosite-openai",
          "geosite-openai-ads",
          "geolite-openai"
        ],
        "server": "sg-dns"
      }
    ],
    "final": "hk-dns",
    "strategy": "ipv4_only"
  },
  "inbounds": [
    {
      "type": "hysteria2",
      "tag": "hy2-in",
      "listen": "0.0.0.0",
      "listen_port": 443,
      "sniff": true,
      "up_mbps": 300,
      "down_mbps": 300,
      "users": [
        {
          "name": "tproxy",
          "password": "xxxxxxxxxx"
        }
      ],
      "tls": {
        "enabled": true,
        "server_name": "rfc.xxxxxx.com",
        "alpn": "h3",
        "acme": {
          "domain": "rfc.xxxxxx.com",
          "email": "xxxxxxxxxxx",
          "provider": "letsencrypt",
          "dns01_challenge": {
            "provider": "cloudflare",
            "api_token": "xxxxxxxxxxxxxxxxx"
          }
        }
      },
      "masquerade": "https://www.xxxxxx.com"
    }
  ],
  "outbounds": [
    {
      "type": "direct",
      "tag": "direct"
    },
    {
      "type": "dns",
      "tag": "dns-out"
    },
    {
      "type": "block",
      "tag": "block"
    },
    {
      "type": "shadowsocks",
      "tag": "openai-sg",
      "server": "xxxxxxxxx",
      "server_port": xxxxxx,
      "method": "2022-blake3-aes-128-gcm",
      "password": "xxxxxxxxxxxxx"
    }
  ],
  "route": {
    "rules": [
      {
        "inbound": "hy2-in",
        "protocol": "dns",
        "outbound": "dns-out"
      },
      {
        "rule_set": [
          "bm7-reddit",
          "geosite-openai",
          "geosite-openai-ads",
          "geolite-openai"
        ],
        "outbound": "openai-sg"
      },
      {
        "inbound": "hy2-in",
        "outbound": "direct"
      }
    ],
    "rule_set": [
      ......
    ],
    "final": "direct",
    "auto_detect_interface": true
  }
}

Reproduction

sing-box 使用systemctl start sing-box 启动,该问题会不定期发生,目前观察到1.8.6 - 1.8.10 的版本都发生了该问题。 长则2天,短则20mins,会出现类似的情况。

Logs

// tproxy debug info
DEBUG[0330] [3411073540 3m46s] inbound/tproxy[all-in]: connection closed: download: io: read/write on closed pipe | upload: io: read/write on closed pipe | upstream: context canceled
DEBUG[0330] [1453643736 1m5s] inbound/tproxy[all-in]: connection closed: upload: io: read/write on closed pipe | download: io: read/write on closed pipe | upstream: context canceled
DEBUG[0330] [306938144 2m47s] inbound/tproxy[all-in]: connection closed: upload: io: read/write on closed pipe | download: io: read/write on closed pipe | upstream: context canceled
DEBUG[0330] [1409683211 2m17s] inbound/tproxy[all-in]: connection closed: download: io: read/write on closed pipe | upload: io: read/write on closed pipe | upstream: context canceled
DEBUG[0330] [1543538598 17.43s] inbound/tproxy[all-in]: connection closed: upload: io: read/write on closed pipe | download: io: read/write on closed pipe | upstream: context canceled
DEBUG[0330] [666031937 18.8s] inbound/tproxy[all-in]: connection closed: download: io: read/write on closed pipe | upload: io: read/write on closed pipe | upstream: context canceled

DEBUG[0198] [2959041696 10.0s] inbound/tproxy[all-in]: connection closed: io: read/write on closed pipe | upstream: context canceled
DEBUG[0199] [435782673 3m15s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.115:56540: timeout: no recent network activity
DEBUG[0201] [1403900692 1m16s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.115:56665: timeout: no recent network activity

DEBUG[0287] [1714852836 0ms] router: sniffed protocol: tls, domain: courier.push.apple.com
INFO[0287] [1714852836 0ms] outbound/hysteria2[proxy-rfc]: outbound connection to courier.push.apple.com:5223
DEBUG[0288] [3851985167 21.97s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.115:56755: download: use of closed network connection
DEBUG[0288] [3093823784 27.98s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.115:56748: download: use of closed network connection
DEBUG[0288] [2036208468 23.97s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.115:56752: download: use of closed network connection
DEBUG[0288] [1802159481 25.98s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.115:56749: download: use of closed network connection
DEBUG[0288] [666449522 29.99s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.115:56746: download: use of closed network connection
DEBUG[0292] [3516833811 29.98s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.40:50553: download: use of closed network connection
DEBUG[0292] [3261013672 23.96s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.40:50556: download: use of closed network connection
DEBUG[0292] [2724611467 25.97s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.40:50555: download: use of closed network connection
DEBUG[0292] [856632396 27.97s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.40:50554: download: use of closed network connection
DEBUG[0292] [2376823744 21.95s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.40:50557: download: use of closed network connection

DEBUG[0306] [1168457185 29.0s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.115:56760: upload: raw-read tcp 3.33.235.18:443->10.1.1.115:56760: use of closed network connection | download: timeout: no recent network activity
DEBUG[0306] [2900390936 1.76s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.7:53378: upload: raw-read tcp 74.86.151.162:443->10.1.1.7:53378: use of closed network connection | download: timeout: no recent network activity
DEBUG[0306] [3931113787 34.65s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.132:46320: upload: raw-read tcp 199.59.149.202:80->10.1.1.132:46320: use of closed network connection | download: timeout: no recent network activity
DEBUG[0306] [553591609 31.0s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.115:56759: upload: raw-read tcp 15.197.213.252:443->10.1.1.115:56759: use of closed network connection | download: timeout: no recent network activity
DEBUG[0306] [870661040 8.75s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.130:49630: upload: raw-read tcp 17.253.87.218:443->10.1.1.130:49630: use of closed network connection | download: timeout: no recent network activity
DEBUG[0306] [2756306621 2.69s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.40:50563: upload: raw-read tcp 17.253.87.214:443->10.1.1.40:50563: use of closed network connection | download: timeout: no recent network activity
DEBUG[0306] [2160255988 684ms] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.40:50564: upload: raw-read tcp 17.253.87.217:443->10.1.1.40:50564: use of closed network connection | download: timeout: no recent network activity
DEBUG[0306] [3920180223 11.13s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.40:50559: upload: raw-read tcp 17.57.145.150:443->10.1.1.40:50559: use of closed network connection | download: timeout: no recent network activity
DEBUG[0306] [2746983162 3m23s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.115:56661: upload: raw-read tcp 23.106.130.208:28445->10.1.1.115:56661: use of closed network connection | download: timeout: no recent network activity
DEBUG[0306] [3187942986 7.12s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.40:50561: upload: raw-read tcp 17.57.145.148:443->10.1.1.40:50561: use of closed network connection | download: timeout: no recent network activity
DEBUG[0306] [1714852836 18.16s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.130:49628: upload: raw-read tcp 17.57.145.137:5223->10.1.1.130:49628: use of closed network connection | download: timeout: no recent network activity
DEBUG[0306] [3526003134 9.13s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.40:50560: upload: raw-read tcp 17.57.145.149:443->10.1.1.40:50560: use of closed network connection | download: timeout: no recent network activity
DEBUG[0306] [3961930734 5.12s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.40:50562: upload: raw-read tcp 17.57.145.151:443->10.1.1.40:50562: use of closed network connection | download: timeout: no recent network activity
DEBUG[0306] [3264410470 1m23s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.132:44012: upload: raw-read tcp 203.208.43.98:80->10.1.1.132:44012: use of closed network connection | download: timeout: no recent network activity
DEBUG[0306] [2068689692 2m53s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.132:37542: upload: raw-read tcp 64.233.187.188:5228->10.1.1.132:37542: use of closed network connection | download: timeout: no recent network activity
DEBUG[0306] [2480386817 1m23s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.132:49384: upload: raw-read tcp 199.59.149.202:443->10.1.1.132:49384: use of closed network connection | download: timeout: no recent network activity
DEBUG[0306] [3484695056 1m49s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.115:56695: upload: raw-read tcp 142.251.43.10:443->10.1.1.115:56695: use of closed network connection | download: timeout: no recent network activity
DEBUG[0306] [21447011 13.14s] inbound/tproxy[all-in]: connection closed: process connection from 10.1.1.40:50558: upload: raw-read tcp 17.57.145.152:443->10.1.1.40:50558: use of closed network connection | download: timeout: no recent network activity

客户端log 太多有删减。

服务器端debug 信息如下:

DEBUG[0000] router: updating rule-set geosite-openai-ads from URL: https://github.com/MetaCubeX/meta-rules-dat/raw/sing/geo/geosite/openai@ads.srs
INFO[0000] router: updated rule-set geosite-openai
DEBUG[0000] router: updating rule-set geolite-openai from URL: https://github.com/MetaCubeX/meta-rules-dat/raw/sing/geo-lite/geosite/openai.srs
INFO[0000] router: updated rule-set geosite-cn
INFO[0000] router: updated rule-set geoip-cn
INFO[0000] router: updated rule-set geosite-category-ads-all
INFO[0001] router: updated rule-set geosite-openai-ads
INFO[0001] router: updated rule-set geolite-openai
INFO[0001] inbound/hysteria2[hy2-in]: udp server started at 0.0.0.0:443
INFO[0001] sing-box started (1.455s)
DEBUG[0080] inbound/hysteria2[hy2-in]: connection failed: receive message: timeout: no recent network activity
INFO[0206] [3951435774 0ms] inbound/hysteria2[hy2-in]: inbound connection from 14.155.201.124:45811
INFO[0206] [3951435774 3ms] inbound/hysteria2[hy2-in]: [tproxy] inbound connection to gsp-ssl.ls.apple.com:443
DEBUG[0206] [3951435774 4ms] router: sniffed protocol: tls, domain: gsp-ssl.ls.apple.com
DEBUG[0206] [3951435774 5ms] router: match[2] inbound=hy2-in => direct
INFO[0206] [3951435774 8ms] outbound/direct[direct]: outbound connection to gsp-ssl.ls.apple.com:443
DEBUG[0206] [3951435774 8ms] dns: lookup domain gsp-ssl.ls.apple.com
INFO[0206] outbound/direct[direct]: outbound connection to 8.8.8.8:853
INFO[0206] [1486264960 0ms] inbound/hysteria2[hy2-in]: inbound connection from 14.155.201.124:45811
INFO[0206] [1486264960 3ms] inbound/hysteria2[hy2-in]: [tproxy] inbound connection to alt4-mtalk.google.com:443
DEBUG[0206] [1486264960 4ms] router: sniffed protocol: tls, domain: alt4-mtalk.google.com
DEBUG[0206] [1486264960 4ms] router: match[2] inbound=hy2-in => direct
INFO[0206] [1486264960 4ms] outbound/direct[direct]: outbound connection to alt4-mtalk.google.com:443
DEBUG[0206] [1486264960 4ms] dns: lookup domain alt4-mtalk.google.com
INFO[0206] [1351995808 0ms] inbound/hysteria2[hy2-in]: inbound connection from 14.155.201.124:45811
INFO[0206] [1351995808 4ms] inbound/hysteria2[hy2-in]: [tproxy] inbound connection to 142.251.43.10:443
DEBUG[0206] [1351995808 5ms] router: sniffed protocol: tls, domain: firestore.googleapis.com
DEBUG[0206] [1351995808 6ms] router: match[2] inbound=hy2-in => direct
INFO[0206] [1351995808 6ms] outbound/direct[direct]: outbound connection to 142.251.43.10:443
INFO[0206] [1450048979 0ms] inbound/hysteria2[hy2-in]: inbound connection from 14.155.201.124:45811
INFO[0206] [1450048979 6ms] inbound/hysteria2[hy2-in]: [tproxy] inbound connection to api.telegram.org:443
DEBUG[0206] [1450048979 7ms] router: sniffed protocol: tls, domain: api.telegram.org
DEBUG[0206] [1450048979 7ms] router: match[2] inbound=hy2-in => direct
INFO[0206] [1450048979 7ms] outbound/direct[direct]: outbound connection to api.telegram.org:443
DEBUG[0206] [1450048979 8ms] dns: lookup domain api.telegram.org

服务器端会看到debug 卡在DEBUG[0080] 非常久,即使客户端显示的正常发包,服务器端也没有看到收包。当服务器debug 信息从DEBUG[0080] 跳到DEBUG[0206] sing-box 恢复正常。(服务器是完整抓包,未删减)

Supporter

Integrity requirements

luckyyyyy commented 6 months ago

The same issue is repeated here in hy2 (tproxy -> hy2)

Rayd62 commented 6 months ago

The same issue is repeated here in hy2 (tproxy -> hy2)

Hi luckyyyyy,

I did some experiments. My Platform is N100 NUC, when I setup CPU mode of VM with x86_64_v2, x86_64_v2_AES and x86_64_v3, the situation has improved. Now I setup CPU mode with x86_64_v3, the frequency of occurrence becomes less frequent and service is restored much faster than before (3 - 5 min without manual intervention). Hope this info can help u get a better experience, even it is not completely resolved.

iShot_2024-04-04_12 46 45
github-actions[bot] commented 4 months ago

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 5 days