[app] optimize query - Githubissues

deepflowio / deepflow-app

GNU Affero General Public License v3.0


Closed taloric closed 2 months ago

taloric commented 3 months ago

optimize query in trace_l7_flows

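1. Make `worker_numbers` configurable, to avoid wasting scheduling capacity on machines with fewer than 10 cores.
2. Remove the `_id` to `_id_str` conversion logic.
3. Reduce calls to `construct_from_dataframe`: build the query conditions for `tcp_seq`, `syscall`, and `x_request_id` from dataframe column lists (in theory faster than iterating row by row, though the gain may not be noticeable on small data).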
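As a rough illustration of the first item, the sketch below reads a worker count from configuration and caps it at the number of available cores. The issue does not show how `worker_numbers` is consumed, so the thread pool, the default of 10, and the helper names `resolve_worker_numbers` and `run_queries` are all assumptions made for this example.

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Hypothetical default; the issue only implies that the old value was fixed.
DEFAULT_WORKER_NUMBERS = 10


def resolve_worker_numbers(configured=None):
    """Return the worker count to use, never exceeding the available cores."""
    cores = os.cpu_count() or 1
    wanted = configured if configured and configured > 0 else DEFAULT_WORKER_NUMBERS
    return max(1, min(wanted, cores))


def run_queries(queries, run_one, configured_workers=None):
    """Run each query on a pool sized from configuration (thread pool is an assumption)."""
    workers = resolve_worker_numbers(configured_workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of the input queries.
        return list(pool.map(run_one, queries))
```

On a 4-core machine, `resolve_worker_numbers(10)` would return 4 under this sketch, so no more workers are started than the machine can schedule at once.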

| Version | Total Request Sent | Request/second | Avg Resp Time |
| --- | --- | --- | --- |
| before (v6.5.8) | 395 | 1.29 | 2,625 ms |
| after - 2 | 477 | 1.56 | 2,028 ms |

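Results after also applying change 3 (throughput is nearly unchanged, but average latency improves):

| Version | Total Request Sent | Request/second | Avg Resp Time |
| --- | --- | --- | --- |
| after - 3 | 478 | 1.56 | 1,985 ms |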

Another test set: spans=79, 23 iterations.

| Version | Total Request Sent | Request/second | Avg Resp Time |
| --- | --- | --- | --- |
| before (v6.5.8) | 75 | 0.24 | 17,792 ms |
| after - 2&3 | 133 | 0.43 | 10,006 ms |
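To put the two before/after comparisons side by side, the relative change can be computed directly from the figures in the tables above. This is only arithmetic on the reported numbers; the helper name `relative_change` and the label "first test set" are introduced here for illustration, and the first set uses the "after - 2" row.

```python
def relative_change(before, after):
    """Return the change from before to after as a percentage of before."""
    return (after - before) / before * 100.0


# (requests/second, average response time in ms), copied from the tables above.
results = {
    "first test set": {"before": (1.29, 2625), "after": (1.56, 2028)},
    "spans=79": {"before": (0.24, 17792), "after": (0.43, 10006)},
}

for name, r in results.items():
    rps = relative_change(r["before"][0], r["after"][0])
    latency = relative_change(r["before"][1], r["after"][1])
    print(f"{name}: throughput {rps:+.1f}%, avg response time {latency:+.1f}%")
```

This works out to roughly 21% more throughput and 23% lower average response time on the first test set, and roughly 79% more throughput and 44% lower average response time on the spans=79 set.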

taloric commented 3 months ago

When reading data from a dataframe, the difference may not be very noticeable at small data volumes, so I wrote a simple test: https://gist.github.com/taloric/916309861e18e97945fe554f15639523

The test reads dataframe column data using three different methods.

With a sample size of 100,000, the three methods took:

1. 6.277893781661987s
2. 0.38400983810424805s
3. 0.035607337951660156s
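The gist itself is not reproduced in this thread, so the following is only a sketch of what such a comparison might look like. It assumes the three methods are row iteration with `iterrows`, tuple iteration with `itertuples`, and reading the whole column with `tolist`; the actual gist may use different approaches, and the function names and the `tcp_seq` test column are chosen here for illustration.

```python
import time

import pandas as pd


def read_with_iterrows(df, column):
    """Build a Series per row and pick one field from it (slowest)."""
    return [row[column] for _, row in df.iterrows()]


def read_with_itertuples(df, column):
    """Iterate plain tuples and pick the field by position."""
    position = df.columns.get_loc(column)
    return [row[position] for row in df.itertuples(index=False, name=None)]


def read_with_column_list(df, column):
    """Convert the whole column to a Python list in one call."""
    return df[column].tolist()


def benchmark(sample_size):
    """Return the elapsed seconds for each reader on a synthetic dataframe."""
    df = pd.DataFrame({"tcp_seq": range(sample_size)})
    timings = {}
    for reader in (read_with_iterrows, read_with_itertuples, read_with_column_list):
        start = time.perf_counter()
        reader(df, "tcp_seq")
        timings[reader.__name__] = time.perf_counter() - start
    return timings
```

Calling `benchmark(100_000)` mirrors the sample size used above. Absolute timings depend on the machine and pandas version, so they are not expected to match the reported figures exactly.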

taloric commented 3 months ago

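Changed the `_id = xx OR _id = xxx ...` conditions to `_id IN (xx)`, grouped by the second-level timestamp obtained from `_id >> 32`.

For the spans=18 dataset, the difference is not significant: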

| Version | Total Request Sent | Request/second | Avg Resp Time |
| --- | --- | --- | --- |
| after - 4 | 474 | 1.55 | 2,040 ms |

For the spans=79 dataset, the difference is not significant either:

| Version | Total Request Sent | Request/second | Avg Resp Time |
| --- | --- | --- | --- |
| after - 4 | 97 | 0.32 | 13,908 ms |
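The grouping described in this comment can be sketched as follows. The only details taken from the issue are that `_id >> 32` yields a second-level timestamp and that each group becomes an `_id IN (...)` condition. The helper names are hypothetical, and how the per-second conditions are combined into the final query is not shown here because the issue does not state it.

```python
from collections import defaultdict


def group_ids_by_second(ids):
    """Group ids by `_id >> 32`, the second-level timestamp per the issue."""
    groups = defaultdict(list)
    for _id in ids:
        groups[_id >> 32].append(_id)
    return dict(groups)


def build_in_conditions(ids):
    """Return one `_id IN (...)` condition per second, in ascending time order."""
    conditions = []
    for _second, bucket in sorted(group_ids_by_second(ids).items()):
        # Deduplicate and force integers so only numeric literals reach the query text.
        id_list = ", ".join(str(int(i)) for i in sorted(set(bucket)))
        conditions.append(f"_id IN ({id_list})")
    return conditions


# Example ids: two in the same second, one in the next second.
a = (1700000000 << 32) | 1
b = (1700000000 << 32) | 2
c = (1700000001 << 32) | 7
for condition in build_in_conditions([c, a, b, a]):
    print(condition)
```

With these inputs the sketch prints two conditions: one containing `a` and `b`, and one containing only `c`.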

taloric commented 3 months ago

Also, I tested `_id IN (xxx, yyy)` here: it is not converted into a query on the second in which the `_id` falls (whereas `_id = xxx` is).

Dataset: `spans=79`

| Version | Total Request Sent | Request/second | Avg Resp Time |
| --- | --- | --- | --- |
| after - 5 | 99 | 0.32 | 13,661 ms |

taloric commented 2 months ago

Removed `auto_instance_0_node_type`, `auto_instance_0_icon_id`, `auto_instance_1_node_type`, and `auto_instance_1_icon_id`: they are used neither in the code logic nor in the actual response.