bifromqio / bifromq

A Multi-Tenancy MQTT broker adopting Serverless architecture
https://bifromq.io
Apache License 2.0

Cluster CPU usage is very high #87

Closed · 844028312 closed this 4 months ago

844028312 commented 4 months ago

⚠️ Please ensure that the provided information is as detailed and clear as possible. Lack of information may delay the resolution of the issue.

Describe the bug: CPU usage is very high and clients cannot publish messages. The threads consuming the CPU are all topic-matcher threads. Stack trace: (screenshot attached)

BifroMQ

844028312 commented 4 months ago

Thread stack trace: (screenshot attached)

844028312 commented 4 months ago

Error messages in the log: (screenshot attached)

844028312 commented 4 months ago

Is there any way to join the WeChat group? The QR code on the official website has expired.

popduke commented 4 months ago

The information you provided is too limited to reproduce or diagnose the problem. Please provide detailed information as required by the ISSUE_TEMPLATE.

> Is there any way to join the WeChat group? The QR code on the official website has expired.

The contact email for joining the group can be found in the README.

844028312 commented 4 months ago

> The information you provided is too limited to reproduce or diagnose the problem. Please provide detailed information as required by the ISSUE_TEMPLATE.
>
> > Is there any way to join the WeChat group? The QR code on the official website has expired.
>
> The contact email for joining the group can be found in the README.

Initially I published 60,000 messages with qos=1 to topic=test1/r1, and then the CPU usage spiked. Now, after restarting, publishing to other topics works, but publishing to topic=test1/r1 still fails and the client hangs.
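For reference, the load described here (a single client publishing 60,000 QoS 1 messages to test1/r1 as fast as it can) could be approximated with a sketch like the one below. It assumes the Eclipse Paho MQTT v3 Java client and a broker address of tcp://localhost:1883; neither the library nor the address comes from the report, and the payload string is a placeholder.

```java
import java.nio.charset.StandardCharsets;
import org.eclipse.paho.client.mqttv3.MqttClient;
import org.eclipse.paho.client.mqttv3.MqttConnectOptions;
import org.eclipse.paho.client.mqttv3.MqttException;
import org.eclipse.paho.client.mqttv3.MqttMessage;
import org.eclipse.paho.client.mqttv3.persist.MemoryPersistence;

public class BurstPublisher {
    public static void main(String[] args) throws MqttException {
        // Broker address and client id are assumptions; replace with the actual BifroMQ node.
        MqttClient client = new MqttClient("tcp://localhost:1883", "repro-pub-1", new MemoryPersistence());
        MqttConnectOptions opts = new MqttConnectOptions();
        opts.setCleanSession(true);
        client.connect(opts);

        byte[] payload = "hello from burst test".getBytes(StandardCharsets.UTF_8); // placeholder payload
        for (int i = 0; i < 60_000; i++) {
            MqttMessage msg = new MqttMessage(payload);
            msg.setQos(1);          // QoS 1, as in the report
            msg.setRetained(false);
            client.publish("test1/r1", msg); // blocks until the QoS 1 publish is acknowledged
        }
        client.disconnect();
        client.close();
    }
}
```

Because the synchronous Paho client blocks on each QoS 1 publish until the PUBACK arrives, the effective publish rate of this sketch is bounded by the broker's acknowledgement latency.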

844028312 commented 4 months ago

Describe the bug
A clear and concise description of what the bug is.

BifroMQ

- Version: 3.0.2
- Deployment: Standalone

To Reproduce

Steps to reproduce the behavior. Please include necessary information such as (but not limited to):

- PUB Client:
  - MQTT Connection: 1
  - ClientIdentifier: mqttx_753092d2 etc...
  - MQTT Pub: Topic: test1/r1, QoS: 1, Retain: false
- SUB Client:
  - MQTT Connection: 1
  - Clean Session: true
  - ClientIdentifier: mqttx_753092d3 etc...
  - MQTT Sub: TopicFilter: test1/r1, QoS: 1

Expected behavior
A clear and concise description of what you expected to happen.

Logs
If applicable, add related logs to help troubleshoot.

Configurations
You can copy from the beginning of info.log and paste here. See also: https://bifromq.io/docs/admin_guide/configuration/configs_print

OS (please complete the following information):

- OS: CentOS 7
- Kernel Version: [e.g. 5.6]
- Kernel Specific Settings: [e.g. TCP, FD, etc]
- JVM: -Xms5g -Xmx5g -XX:MetaspaceSize=500m -XX:MaxMetaspaceSize=500m -XX:MaxDirectMemorySize=500m -server -XX:MaxInlineLevel=15
  - Version: 17
  - Arguments: [e.g. if override any JVM arguments]

Performance Related

If your problem is performance-related, please provide as much detailed information as possible according to the list.

- HOST:
  - Cluster node count: 2
  - CPU: 13
  - Memory: 23G
- Network:
  - Bandwidth: [e.g. 1Gbps]
  - Latency: [e.g. 1ms]
- Load:
  - PUB count: 60000
  - SUB count:
  - PUB QPS per connection: [e.g. 10msg/s]
  - SUB QPS per connection: [e.g. 10msg/s]
  - Payload size: [e.g. 1KB]
- FanIn & FanOut:
  - FanIn: [e.g. 5 means one sub client receives messages from on average 5 pub clients]
  - FanOut: [e.g. 5 means one message is subscribed by on average 5 sub clients]
- Please describe here how you design your topic pattern and pub/sub messages: [e.g. 10 pub clients send messages to the topic tp/{deviceName}/event, where each client uses its unique deviceName. One sub client subscribes to the topic filter tp/+/event to receive all the messages.]

Additional context
Add any other context about the problem here.
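For completeness, a subscriber matching the SUB client described above (Clean Session: true, TopicFilter: test1/r1, QoS: 1) might look like the following sketch. As before, the Eclipse Paho MQTT v3 client, the broker address tcp://localhost:1883, and the client id are assumptions rather than details from the report.

```java
import java.util.concurrent.atomic.AtomicLong;
import org.eclipse.paho.client.mqttv3.IMqttDeliveryToken;
import org.eclipse.paho.client.mqttv3.MqttCallback;
import org.eclipse.paho.client.mqttv3.MqttClient;
import org.eclipse.paho.client.mqttv3.MqttConnectOptions;
import org.eclipse.paho.client.mqttv3.MqttException;
import org.eclipse.paho.client.mqttv3.MqttMessage;
import org.eclipse.paho.client.mqttv3.persist.MemoryPersistence;

public class Subscriber {
    public static void main(String[] args) throws MqttException, InterruptedException {
        AtomicLong received = new AtomicLong();
        MqttClient client = new MqttClient("tcp://localhost:1883", "repro-sub-1", new MemoryPersistence());
        client.setCallback(new MqttCallback() {
            @Override public void connectionLost(Throwable cause) {
                System.err.println("connection lost: " + cause);
            }
            @Override public void messageArrived(String topic, MqttMessage message) {
                long n = received.incrementAndGet();
                if (n % 10_000 == 0) {
                    System.out.println("received " + n + " messages on " + topic);
                }
            }
            @Override public void deliveryComplete(IMqttDeliveryToken token) { }
        });
        MqttConnectOptions opts = new MqttConnectOptions();
        opts.setCleanSession(true);      // Clean Session: true, as in the report
        client.connect(opts);
        client.subscribe("test1/r1", 1); // TopicFilter test1/r1, QoS 1
        Thread.sleep(60_000);            // keep the process alive long enough to receive the burst
        client.disconnect();
        client.close();
    }
}
```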

popduke commented 4 months ago

Do you mean 60,000 pub clients all publishing to the same topic "test1/r1", and the subscriber subscribing directly to "test1/r1"?

844028312 commented 4 months ago

> Do you mean 60,000 pub clients all publishing to the same topic "test1/r1", and the subscriber subscribing directly to "test1/r1"?

A single pub client sent 60,000 messages to topic "test1/r1", and likewise a single subscriber subscribes to it.

popduke commented 4 months ago

A single connection is limited to 200 messages per second by default, adjustable up to at most 1000. What did you set it to? Also, how large is the message payload?

844028312 commented 4 months ago

> A single connection is limited to 200 messages per second by default, adjustable up to at most 1000. What did you set it to? Also, how large is the message payload?

Which parameter is that? I did not adjust it. The message payload text is: 你好,尊敬的客户的撒大苏打实打实大苏打实打实的打法沙发沙发沙发沙发发发生飞洒发顺丰
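Since the payload is the short Chinese string quoted above, its size can be checked directly. A minimal sketch (plain Java, no MQTT involved):

```java
import java.nio.charset.StandardCharsets;

public class PayloadSize {
    public static void main(String[] args) {
        String payload = "你好,尊敬的客户的撒大苏打实打实大苏打实打实的打法沙发沙发沙发沙发发发生飞洒发顺丰";
        // Roughly 40 CJK characters at 3 bytes each in UTF-8, i.e. on the order of 120 bytes, well under 1 KB.
        System.out.println(payload.getBytes(StandardCharsets.UTF_8).length + " bytes");
    }
}
```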

844028312 commented 4 months ago

Now, after restarting both nodes, it is still reproducible: the client hangs whenever it publishes to topic test1/r1. Each additional publish adds one more thread like this consuming CPU. (screenshot attached)

popduke commented 4 months ago

I'm not sure how you got into this state; by design, matching is only done once. Please provide the configuration of both of your nodes; it is printed at the top of info.log at startup. Also, which testing tool are you using? I could not reproduce the problem you describe with a load-generation tool:

emqtt_bench pub -h <NODE1> -c 1 -I 5 -i 1 -t test1/r1 -s 32 -q 1
emqtt_bench sub -h <NODE1> -c 1 -i 1 -t test1/r1 -q 1

Alternatively, you could also share all the topic patterns used in your tests.
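The emqtt_bench invocation above publishes a 32-byte QoS 1 payload to test1/r1 from a single connection, with -I 5 understood as a 5 ms publish interval (about 200 msg/s, within the default per-connection limit mentioned earlier). A rough Java equivalent, assuming the Eclipse Paho MQTT v3 client and with <NODE1> left as a placeholder for the broker host, might be:

```java
import java.util.Arrays;
import org.eclipse.paho.client.mqttv3.MqttClient;
import org.eclipse.paho.client.mqttv3.MqttConnectOptions;
import org.eclipse.paho.client.mqttv3.MqttException;
import org.eclipse.paho.client.mqttv3.MqttMessage;
import org.eclipse.paho.client.mqttv3.persist.MemoryPersistence;

public class PacedPublisher {
    public static void main(String[] args) throws MqttException, InterruptedException {
        // <NODE1> is a placeholder, as in the emqtt_bench commands above.
        MqttClient client = new MqttClient("tcp://<NODE1>:1883", "paced-pub-1", new MemoryPersistence());
        MqttConnectOptions opts = new MqttConnectOptions();
        opts.setCleanSession(true);
        client.connect(opts);

        byte[] payload = new byte[32];          // 32-byte payload, like "-s 32"
        Arrays.fill(payload, (byte) 'a');
        for (int i = 0; i < 60_000; i++) {
            MqttMessage msg = new MqttMessage(payload);
            msg.setQos(1);                      // QoS 1, like "-q 1"
            client.publish("test1/r1", msg);
            Thread.sleep(5);                    // pace publishes at roughly 5 ms apart, like "-I 5"
        }
        client.disconnect();
        client.close();
    }
}
```

Comparing this paced publisher with the burst publisher sketched earlier may help isolate whether the problem only appears when the per-connection rate limit is exceeded.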