MetaCubeX / mihomo

A simple Python Pydantic model for Honkai: Star Rail parsed data from the Mihomo API.
https://wiki.metacubex.one
MIT License
13.78k stars 2.39k forks source link

[Bug] reality + H2 内存泄露,进程频繁重启 #540

Open lvii opened 1 year ago

lvii commented 1 year ago

Verify steps

Clash version

Clash Meta v1.14.4 linux arm64 with go1.20.3 Fri Apr 28 07:14:13 UTC 2023 Use tags: with_gvisor

What OS are you seeing the problem on?

Linux

Clash config

- name: "real-h2"
  type: vless
  server: IP
  port: 443
  uuid: UUID
  tls: true
  udp: true
  network: h2
  flow: ''
  servername: office.com
  reality-opts:
    public-key: ...
    short-id: ...
  client-fingerprint: chrome

Clash log

No response

Description

openwrt shellclash 使用的 Clash.Meta 核心

内存占用超过到 16M 进程就重启,第一列显示的被重启后新的 PID :

# echo "$(pgrep -af clash) $(awk -F ':' '/VmRSS/{print $NF}' /proc/$(pgrep -f clash)/status)"
28744 /etc/clash/clash -d /etc/clash       65876 kB
28744 /etc/clash/clash -d /etc/clash       66092 kB
28744 /etc/clash/clash -d /etc/clash       73160 kB
28744 /etc/clash/clash -d /etc/clash      107504 kB

1256 /etc/clash/clash -d /etc/clash       105444 kB
1256 /etc/clash/clash -d /etc/clash       105708 kB

6787 /etc/clash/clash -d /etc/clash       148324 kB
6787 /etc/clash/clash -d /etc/clash       149320 kB

9520 /etc/clash/clash -d /etc/clash        58068 kB
9520 /etc/clash/clash -d /etc/clash        64844 kB
9520 /etc/clash/clash -d /etc/clash        75072 kB
9520 /etc/clash/clash -d /etc/clash        80712 kB
9520 /etc/clash/clash -d /etc/clash        88723 kB
9520 /etc/clash/clash -d /etc/clash        99888 kB
9520 /etc/clash/clash -d /etc/clash       104820 kB
9520 /etc/clash/clash -d /etc/clash       112552 kB
9520 /etc/clash/clash -d /etc/clash       139896 kB
9520 /etc/clash/clash -d /etc/clash       151112 kB
9520 /etc/clash/clash -d /etc/clash       157544 kB

15320 /etc/clash/clash -d /etc/clash      106660 kB

更换为 flow: xtls-rprx-vision 内存占用就比较稳定,基本很少会超过 10M 进程也不会频繁重启:

# echo "$(pgrep -af 'd /etc/clash') $(awk -F ':' '/VmRSS/{print $NF}' /proc/$(pgrep -f 'd /etc/clash')/status)"
25741 /etc/clash/clash -d /etc/clash       82792 kB
xishang0128 commented 1 year ago

@lvii 内存问题请查看指定issues提供heap

xishang0128 commented 1 year ago

@xishang0128 置顶

lvii commented 1 year ago

Hi @xishang0128 这是 clash debug 启用到进程内存占用太高重启之前,收集的 2 个文件:

heap-pid-13976-6M.txt heap-pid-15126-13M.txt

这是对应进程内存占用到进程重启:

# echo "$(pgrep -af 'd /etc/clash') $(awk -F ':' '/VmRSS/{print $NF}' /proc/$(pgrep -f 'd /etc/clash')/status) $(date +'%F %T')"
13976 /etc/clash/clash -d /etc/clash       67740 kB 2023-05-01 13:00:27  <-- 此处收集 1
13976 /etc/clash/clash -d /etc/clash       68424 kB 2023-05-01 13:01:04
13976 /etc/clash/clash -d /etc/clash       70596 kB 2023-05-01 13:11:38

15126 /etc/clash/clash -d /etc/clash      137048 kB 2023-05-01 13:15:04  <-- 此处收集 2
15126 /etc/clash/clash -d /etc/clash      139116 kB 2023-05-01 13:15:09
15126 /etc/clash/clash -d /etc/clash      146092 kB 2023-05-01 13:15:24
15126 /etc/clash/clash -d /etc/clash      151012 kB 2023-05-01 13:15:26

进程自动重启
# echo "$(pgrep -af 'd /etc/clash') $(awk -F ':' '/VmRSS/{print $NF}' /proc/$(pgrep -f 'd /etc/clash')/status) $(date +'%F %T')"
15214 /etc/clash/clash -d /etc/clash       60492 kB 2023-05-01 13:15:39
lvii commented 1 year ago
# echo "$(pgrep -af 'd /etc/clash') $(awk -F ':' '/VmRSS/{print $NF}' /proc/$(pgrep -f 'd /etc/clash')/status) $(date +'%F %T')"
15736 /etc/clash/clash -d /etc/clash       64032 kB 2023-05-01 13:23:03
15736 /etc/clash/clash -d /etc/clash       91792 kB 2023-05-01 13:25:30
15736 /etc/clash/clash -d /etc/clash       92044 kB 2023-05-01 13:25:34
15736 /etc/clash/clash -d /etc/clash       93460 kB 2023-05-01 13:26:11
15736 /etc/clash/clash -d /etc/clash       93948 kB 2023-05-01 13:26:46
15736 /etc/clash/clash -d /etc/clash       95208 kB 2023-05-01 13:27:45
15736 /etc/clash/clash -d /etc/clash      115060 kB 2023-05-01 13:28:16
15736 /etc/clash/clash -d /etc/clash      116876 kB 2023-05-01 13:28:18
15736 /etc/clash/clash -d /etc/clash      123368 kB 2023-05-01 13:28:22
15736 /etc/clash/clash -d /etc/clash      128812 kB 2023-05-01 13:28:32
15736 /etc/clash/clash -d /etc/clash      139884 kB 2023-05-01 13:29:29
15736 /etc/clash/clash -d /etc/clash      140124 kB 2023-05-01 13:29:35
15736 /etc/clash/clash -d /etc/clash      140636 kB 2023-05-01 13:29:43
15736 /etc/clash/clash -d /etc/clash      140820 kB 2023-05-01 13:29:46
15736 /etc/clash/clash -d /etc/clash      149900 kB 2023-05-01 13:30:22
15736 /etc/clash/clash -d /etc/clash      151452 kB 2023-05-01 13:30:25
15736 /etc/clash/clash -d /etc/clash      152604 kB 2023-05-01 13:30:37
15736 /etc/clash/clash -d /etc/clash      153080 kB 2023-05-01 13:30:45
15736 /etc/clash/clash -d /etc/clash      155608 kB 2023-05-01 13:30:56

这是 15736 进程在 9M 和 15M 内存占用的日志收集:

heap-pid-15736-9M.txt

heap-pid-15736-15M.txt

abcfy2 commented 1 year ago

我最近也遇到了一个频繁重启的问题,但不确定是否是内存泄漏引起的,我开启DEBUG日志级别在终端观察到的现象如下

首先是不知何故突然间DNS查询开始出现大量的upstream: <nil>的错误:

WARN[2023-05-05T11:13:05.41022272+08:00] [TCP] dial Proxy (match RuleSet/telegramcidr) 192.168.50.196:39576 --> 91.108.56.133:443 error: timeout: no recent network activity 
DEBU[2023-05-05T11:13:05.41117538+08:00] [DNS] resolve tr.fengyu.io from tls://dns.rubyfish.cn:853 
DEBU[2023-05-05T11:13:05.41177234+08:00] [DNS] resolve tr.fengyu.io from https://doh.pub:443/dns-query 
DEBU[2023-05-05T11:13:05.41277586+08:00] [DNS] resolve tr.fengyu.io from tls://dns.rubyfish.cn:853 
DEBU[2023-05-05T11:13:05.41341926+08:00] [DNS] resolve tr.fengyu.io from https://223.5.5.5:443/dns-query 
DEBU[2023-05-05T11:13:05.41586398+08:00] [DNS] resolve tr.fengyu.io from https://223.5.5.5:443/dns-query 
DEBU[2023-05-05T11:13:05.4166044+08:00] [DNS] resolve tr.fengyu.io from https://doh.pub:443/dns-query 
WARN[2023-05-05T11:13:05.70935766+08:00] [TCP] dial DIRECT [2409:8a70:968:7670:bede:5d92:9482:9e1b]:39890 --> edge.microsoft.com:443 error: connect failed: dial tcp [2620:1ec:12::239]:443: i/o timeout | dial tcp [2620:1ec:c11::239]:443: i/o timeout | connect failed: dial tcp 13.107.21.239:443: i/o timeout | dial tcp 204.79.197.239:443: i/o timeout 
DEBU[2023-05-05T11:13:05.75894104+08:00] [Sniffer] Sniff TCP [[2409:8a70:968:7670:bede:5d92:9482:9e1b]:49852]-->[edge.microsoft.com:443] success, replace domain [edge.microsoft.com]-->[edge.microsoft.com] 
DEBU[2023-05-05T11:13:05.75919214+08:00] [Rule] use default rules                     
DEBU[2023-05-05T11:13:07.61357268+08:00] [Sniffer] All sniffing sniff failed with from [192.168.50.196:46420] to [91.108.56.133:443] 
DEBU[2023-05-05T11:13:07.6138493+08:00] [Rule] use default rules                     
DEBU[2023-05-05T11:13:07.61458168+08:00] [Rule] use default rules                     
INFO[2023-05-05T11:13:07.6170668+08:00] [TCP] 192.168.50.196:39730 --> 192.168.50.1:18017 doesn't match any rule using DIRECT 
DEBU[2023-05-05T11:13:07.61824392+08:00] [Sniffer] All sniffing sniff failed with from [2409:8a70:968:7670:bede:5d92:9482:9e1b:57472] to [2001:b28:f23f:f005::a:443] 
DEBU[2023-05-05T11:13:07.61846222+08:00] [Rule] use default rules                     
DEBU[2023-05-05T11:13:07.62168632+08:00] [Sniffer] All sniffing sniff failed with from [2409:8a70:968:7670:bede:5d92:9482:9e1b:40348] to [2001:b28:f23f:f005::a:80] 
DEBU[2023-05-05T11:13:07.62228692+08:00] [Rule] use default rules                     
DEBU[2023-05-05T11:13:07.6288835+08:00] [Rule] use default rules                     
INFO[2023-05-05T11:13:07.63392398+08:00] [TCP] 192.168.50.196:39736 --> 192.168.50.1:18017 doesn't match any rule using DIRECT 
DEBU[2023-05-05T11:13:10.3960446+08:00] re-creating the http client due to requesting https://223.5.5.5:443/dns-query?dns=AAABAAABAAAAAAAAA2RvaANwdWIAAAEAAQ: Get "https://223.5.5.5:443/dns-query?dns=AAABAAABAAAAAAAAA2RvaANwdWIAABwAAQ": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers) 
DEBU[2023-05-05T11:13:10.39625128+08:00] [https://223.5.5.5:443/dns-query?dns=AAABAAABAAAAAAAAA2RvaANwdWIAAAEAAQ] using HTTP/2 for this upstream: <nil> 
DEBU[2023-05-05T11:13:10.39645716+08:00] re-creating the http client due to requesting https://1.0.0.1:443/dns-query?dns=AAABAAABAAAAAAAAA2RvaANwdWIAAAEAAQ: Get "https://1.0.0.1:443/dns-query?dns=AAABAAABAAAAAAAAA2RvaANwdWIAABwAAQ": context deadline exceeded 
DEBU[2023-05-05T11:13:10.3965687+08:00] [https://1.0.0.1:443/dns-query?dns=AAABAAABAAAAAAAAA2RvaANwdWIAAAEAAQ] using HTTP/2 for this upstream: <nil> 
DEBU[2023-05-05T11:13:10.39660668+08:00] re-creating the http client due to requesting https://223.5.5.5:443/dns-query?dns=AAABAAABAAAAAAAAA2RvaANwdWIAABwAAQ: Get "https://223.5.5.5:443/dns-query?dns=AAABAAABAAAAAAAAA2RvaANwdWIAABwAAQ": context deadline exceeded 
DEBU[2023-05-05T11:13:10.39672666+08:00] [https://223.5.5.5:443/dns-query?dns=AAABAAABAAAAAAAAA2RvaANwdWIAABwAAQ] using HTTP/2 for this upstream: <nil> 
DEBU[2023-05-05T11:13:10.39712778+08:00] re-creating the http client due to requesting https://1.0.0.1:443/dns-query?dns=AAABAAABAAAAAAAAA2RvaANwdWIAABwAAQ: Get "https://1.0.0.1:443/dns-query?dns=AAABAAABAAAAAAAAA2RvaANwdWIAABwAAQ": context deadline exceeded 
DEBU[2023-05-05T11:13:10.39725966+08:00] [https://1.0.0.1:443/dns-query?dns=AAABAAABAAAAAAAAA2RvaANwdWIAABwAAQ] using HTTP/2 for this upstream: <nil> 
DEBU[2023-05-05T11:13:10.3975183+08:00] re-creating the http client due to requesting https://223.5.5.5:443/dns-query?dns=AAABAAABAAAAAAAAA2RvaANwdWIAABwAAQ: Get "https://223.5.5.5:443/dns-query?dns=AAABAAABAAAAAAAAA2RvaANwdWIAAAEAAQ": context deadline exceeded 
DEBU[2023-05-05T11:13:10.39762944+08:00] [https://223.5.5.5:443/dns-query?dns=AAABAAABAAAAAAAAA2RvaANwdWIAABwAAQ] using HTTP/2 for this upstream: <nil> 
DEBU[2023-05-05T11:13:10.39964352+08:00] [Rule] use default rules                     
DEBU[2023-05-05T11:13:10.39856598+08:00] re-creating the http client due to requesting https://1.0.0.1:443/dns-query?dns=AAABAAABAAAAAAAAA2RvaANwdWIAABwAAQ: Get "https://1.0.0.1:443/dns-query?dns=AAABAAABAAAAAAAAA2RvaANwdWIAAAEAAQ": context deadline exceeded 
DEBU[2023-05-05T11:13:10.40091954+08:00] [https://1.0.0.1:443/dns-query?dns=AAABAAABAAAAAAAAA2RvaANwdWIAABwAAQ] using HTTP/2 for this upstream: <nil> 
INFO[2023-05-05T11:13:10.40025826+08:00] [UDP] [2409:8a70:968:7670:bede:5d92:9482:9e1b]:47287 --> [2607:f298:5:101d:f816:3eff:fefd:8817]:123 doesn't match any rule using DIRECT 
DEBU[2023-05-05T11:13:10.40027938+08:00] re-creating the http client due to requesting https://223.5.5.5:443/dns-query?dns=AAABAAABAAAAAAAAA2RvaANwdWIAAAEAAQ: Get "https://223.5.5.5:443/dns-query?dns=AAABAAABAAAAAAAAA2RvaANwdWIAAAEAAQ": context deadline exceeded 
DEBU[2023-05-05T11:13:10.40176792+08:00] re-creating the http client due to requesting https://1.0.0.1:443/dns-query?dns=AAABAAABAAAAAAAAA2RvaANwdWIAAAEAAQ: Get "https://1.0.0.1:443/dns-query?dns=AAABAAABAAAAAAAAA2RvaANwdWIAAAEAAQ": context deadline exceeded 
DEBU[2023-05-05T11:13:10.40318562+08:00] [https://1.0.0.1:443/dns-query?dns=AAABAAABAAAAAAAAA2RvaANwdWIAAAEAAQ] using HTTP/2 for this upstream: <nil> 
DEBU[2023-05-05T11:13:10.40363486+08:00] [https://223.5.5.5:443/dns-query?dns=AAABAAABAAAAAAAAA2RvaANwdWIAAAEAAQ] using HTTP/2 for this upstream: <nil> 
DEBU[2023-05-05T11:13:10.41200568+08:00] re-creating the http client due to requesting https://doh.pub:443/dns-query?dns=AAABAAABAAAAAAAAAnRyBmZlbmd5dQJpbwAAAQAB: Get "https://doh.pub:443/dns-query?dns=AAABAAABAAAAAAAAAnRyBmZlbmd5dQJpbwAAHAAB": context deadline exceeded 
DEBU[2023-05-05T11:13:10.41222148+08:00] [https://doh.pub:443/dns-query?dns=AAABAAABAAAAAAAAAnRyBmZlbmd5dQJpbwAAAQAB] using HTTP/2 for this upstream: <nil> 
...

当这种错误滚了好几屏之后,进程最终就被内核干掉了:

DEBU[2023-05-05T11:13:42.75035632+08:00] [https://223.5.5.5:443/dns-query?dns=AAABAAABAAAAAAAAA2RvaANwdWIAABwAAQ] using HTTP/2 for this upstream: <nil> 
DEBU[2023-05-05T11:13:42.75198238+08:00] re-creating the http client due to requesting https://1.0.0.1:443/dns-query?dns=AAABAAABAAAAAAAAAnRyBmZlbmd5dQJpbwAAHAAB: Get "https://1.0.0.1:443/dns-query?dns=AAABAAABAAAAAAAAAnRyBmZlbmd5dQJpbwAAAQAB": connect failed: dial tcp 1.0.0.1:443: connect: network is unreachable | connect failed: no ip address 
DEBU[2023-05-05T11:13:42.75248982+08:00] [https://1.0.0.1:443/dns-query?dns=AAABAAABAAAAAAAAAnRyBmZlbmd5dQJpbwAAHAAB] using HTTP/2 for this upstream: <nil> 
DEBU[2023-05-05T11:13:42.75316748+08:00] re-creating the http client due to requesting https://1.0.0.1:443/dns-query?dns=AAABAAABAAAAAAAAAnRyBmZlbmd5dQJpbwAAHAAB: Get "https://1.0.0.1:443/dns-query?dns=AAABAAABAAAAAAAAAnRyBmZlbmd5dQJpbwAAHAAB": connect failed: dial tcp 1.0.0.1:443: connect: network is unreachable | connect failed: no ip address 
DEBU[2023-05-05T11:13:42.7535688+08:00] [https://1.0.0.1:443/dns-query?dns=AAABAAABAAAAAAAAAnRyBmZlbmd5dQJpbwAAHAAB] using HTTP/2 for this upstream: <nil> 

Killed
qist commented 1 year ago

Clash.Meta 不支持real-h2健康检测。建议换成real-grpc