actiontech / sqle

一个支持多种不同类型数据库,覆盖事前控制、事后监督、标准发布场景,帮助您建立质量规范的SQL全生命周期质量管理平台
Mozilla Public License 2.0
1.43k stars 185 forks source link

优化scannerd采集慢日志文件后SQL上传流程 #1597

Closed ColdWaterLW closed 1 year ago

ColdWaterLW commented 1 year ago

问题现象

scannerd抓取慢日志文件SQL后,会把所有原始SQL推送给sql,没有进行聚合。导致sqle在将这批sql insert到元数据表时可能会出现很长的一条sql,不合理

复现

  1. 创建一个慢日志扫描任务
  2. 启动scannerd扫描一个很大的慢日志文件(其中有一些重复的或者类似的SQL)
  3. 查看sqle插入慢日志的sql

解决方案

慢日志扫描时,scannerd在把sql上传之前先对sql进行指纹聚合

关联issue:#1588

taolx0 commented 1 year ago

自测

time="2023-06-21T17:50:37+08:00" level=info msg="sql/Users/taolongxiang/GolandProjects/sqle-ee/sqle/model/audit_plan.go:1646.098458msINSERT INTO `audit_plan_sqls_v2` (`audit_plan_id`,`fingerprint_md5`, `fingerprint`, `sql_content`, `info`, `schema`) VALUES (?, ?, ?, ?, ?, ?), (?, ?, ?, ?, ?, ?), (?, ?, ?, ?, ?, ?), (?, ?, ?, ?, ?, ?), (?, ?, ?, ?, ?, ?), (?, ?, ?, ?, ?, ?), (?, ?, ?, ?, ?, ?), (?, ?, ?, ?, ?, ?), (?, ?, ?, ?, ?, ?), (?, ?, ?, ?, ?, ?), (?, ?, ?, ?, ?, ?), (?, ?, ?, ?, ?, ?), (?, ?, ?, ?, ?, ?), (?, ?, ?, ?, ?, ?), (?, ?, ?, ?, ?, ?), (?, ?, ?, ?, ?, ?), (?, ?, ?, ?, ?, ?), (?, ?, ?, ?, ?, ?)\nON DUPLICATE KEY UPDATE sql_content = VALUES(sql_content),\n                        info        = JSON_SET(COALESCE(info, '{}'),\n                                              '$.counter', CAST(COALESCE(JSON_EXTRACT(values(info), '$.counter'), 0) +\n                                                                COALESCE(JSON_EXTRACT(info, '$.counter'), 0) AS SIGNED),\n                                              '$.last_receive_timestamp',\n                                              JSON_EXTRACT(values(info), '$.last_receive_timestamp'));[4 59a74d08d407b5edf9a57dd5a41825ca select sleep(?) /* ApplicationName=DataGrip 2023.1.2 */ SELECT SLEEP(1) {\"AuditSchema\":\"\",\"counter\":5,\"last_receive_timestamp\":\"2023-06-21T17:49:43+08:00\"}  4 ed3fa92e9ae487a10046b4ae7ddbf338 update sqle.test inner join (select ? as id, sleep(?)) a on test.id = a.id set test.name=?, test.age = ?, test.id = ? where name = ?; mysqld, version: ? (mysql community server - gpl). started with: /* ApplicationName=DataGrip 2023.1.2 */ UPDATE sqle.test INNER JOIN (SELECT '2' AS id, SLEEP(11)) a ON test.id = a.id\nSET test.name='xiaowang',\n    test.age = 18,\n    test.id  = 2\nWHERE name = 'test';\n/usr/local/mysql/bin/mysqld, Version: 8.0.31 (MySQL Community Server - GPL). started with: {\"AuditSchema\":\"\",\"counter\":1,\"last_receive_timestamp\":\"2023-06-21T17:49:43+08:00\"}  4 1c8805b1775b47c6ef3da21d83544b57 set session long_query_time = ? /* ApplicationName=DataGrip 2023.1.2 */ SET SESSION long_query_time = 0 {\"AuditSchema\":\"\",\"counter\":1,\"last_receive_timestamp\":\"2023-06-21T17:49:43+08:00\"}  4 a86446692de3ca32943798a09019b333 show warnings SHOW WARNINGS {\"AuditSchema\":\"\",\"counter\":133,\"last_receive_timestamp\":\"2023-06-21T17:49:43+08:00\"}  4 a11944c87a6a5c16fb38455bf7035609 select database() /* ApplicationName=DataGrip 2023.1.2 */ select database() {\"AuditSchema\":\"\",\"counter\":45,\"last_receive_timestamp\":\"2023-06-21T17:49:43+08:00\"}  4 7a2b64842dcede58c32efd50d9433a78 select @@session.transaction_isolation SELECT @@session.transaction_isolation {\"AuditSchema\":\"\",\"counter\":44,\"last_receive_timestamp\":\"2023-06-21T17:49:43+08:00\"}  4 df911f0151f9ef021d410b4be5060972 ping Ping {\"AuditSchema\":\"\",\"counter\":46,\"last_receive_timestamp\":\"2023-06-21T17:49:43+08:00\"}  4 cfa134076e32c4608d274f9d0d156d97 set session transaction read write /* ApplicationName=DataGrip 2023.1.2 */ set session transaction read write {\"AuditSchema\":\"\",\"counter\":22,\"last_receive_timestamp\":\"2023-06-21T17:49:43+08:00\"}  4 83c1e5898a01e697921459bf4f1b47b0 select @@session.transaction_read_only SELECT @@session.transaction_read_only {\"AuditSchema\":\"\",\"counter\":23,\"last_receive_timestamp\":\"2023-06-21T17:49:43+08:00\"}  4 e16545bb40117fc1f9f96162cee7817f set net_write_timeout=? /* ApplicationName=DataGrip 2023.1.2 */ SET net_write_timeout=600 {\"AuditSchema\":\"\",\"counter\":22,\"last_receive_timestamp\":\"2023-06-21T17:49:43+08:00\"}  4 e77769c62ef669aa7dd5f6760f2d2ebb show variables like ? /* ApplicationName=DataGrip 2023.1.2 */ SHOW VARIABLES LIKE 'long_query_time' {\"AuditSchema\":\"\",\"counter\":4,\"last_receive_timestamp\":\"2023-06-21T17:49:43+08:00\"}  4 7f34381df64db82966c00be150974b83 set sql_select_limit=? /* ApplicationName=DataGrip 2023.1.2 */ SET SQL_SELECT_LIMIT=501 {\"AuditSchema\":\"\",\"counter\":11,\"last_receive_timestamp\":\"2023-06-21T17:49:43+08:00\"}  4 21fb19005f6951c443b4355f3fd7ada6 set sql_select_limit=default /* ApplicationName=DataGrip 2023.1.2 */ SET SQL_SELECT_LIMIT=DEFAULT {\"AuditSchema\":\"\",\"counter\":11,\"last_receive_timestamp\":\"2023-06-21T17:49:43+08:00\"}  4 9f120e3b77c5c10838d00041e68cedb0 select sql_text, db, time_to_sec(query_time) as query_time from mysql.slow_log where sql_text != ? and db not in(?+) /* ApplicationName=DataGrip 2023.1.2 */ SELECT sql_text, db, TIME_TO_SEC(query_time) AS query_time\nFROM mysql.slow_log\nWHERE sql_text != ''\n  AND db NOT IN ('information_schema', 'performance_schema', 'mysql', 'sys') {\"AuditSchema\":\"\",\"counter\":9,\"last_receive_timestamp\":\"2023-06-21T17:49:43+08:00\"}  4 cf75afdbb4a38bb343cd556f6a29a982 /* ApplicationName=DataGrip 2023.1.2 */ /* ApplicationName=DataGrip 2023.1.2 */ {\"AuditSchema\":\"\",\"counter\":1,\"last_receive_timestamp\":\"2023-06-21T17:49:43+08:00\"}  4 2f592ede0677e8ac6ebfa9541f199ca5 select sql_text, db, time_to_sec(query_time) as query_time from mysql.slow_log where db not in(?+) /* ApplicationName=DataGrip 2023.1.2 */ SELECT sql_text, db, TIME_TO_SEC(query_time) AS query_time\nFROM mysql.slow_log\nWHERE\n#     sql_text != ''\n   db NOT IN ('information_schema', 'performance_schema', 'mysql', 'sys') {\"AuditSchema\":\"\",\"counter\":1,\"last_receive_timestamp\":\"2023-06-21T17:49:43+08:00\"}  4 491a04b3f75d443615a4e6d2a1290516 set global slow_query_log = ? /* ApplicationName=DataGrip 2023.1.2 */ SET GLOBAL slow_query_log = 1 {\"AuditSchema\":\"\",\"counter\":1,\"last_receive_timestamp\":\"2023-06-21T17:49:43+08:00\"}  4 86e9aefdbd80548ff25a91d191f11b73 set global log_output = ? /* ApplicationName=DataGrip 2023.1.2 */ SET GLOBAL log_output = 'FILE,TABLE' {\"AuditSchema\":\"\",\"counter\":3,\"last_receive_timestamp\":\"2023-06-21T17:49:43+08:00\"} ] 34" type=sql
Marcus9530 commented 1 year ago

复现版本: Server Version: release-2.2305.x-ee 8abf46130b 复现步骤: 1.添加一个慢日志扫描任务 2.开启mysql的慢日志功能 3.然后等待采集慢sql 4.查看sqle的日志,发现有条sql统计为1,即把所有的sql未聚合 image

Marcus9530 commented 1 year ago

验证版本: Server Version: main-ee 35e1c6cd86 验证步骤: 1.添加一个慢日志扫描任务 2.开启mysql的慢日志功能 3.然后等待采集慢sql 4.查看sqle的日志,当前统计不是在1,复合预期。 image