MySQL performance datastream make use of the following query.
SELECT digest_text, count_star, avg_timer_wait, max_timer_wait, last_seen, quantile_95
FROM performance_schema.events_statements_summary_by_digest
ORDER BY avg_timer_wait DESC
LIMIT 10
As part of the testing, it is noticed that when the same SQL statements are run across multiple database schema, especially when the database traffic is less, will lead to duplicate values for digest_text .
events_statements_summary_by_digest has SCHEMA_NAME and DIGEST columns. Each row summarizes events per schema and digest value. (The DIGEST_TEXT column contains the corresponding normalized statement digest text, but is neither a grouping nor a summary column.
After TSDB enablement, this would lead to data loss as there exists no schema identifier.
To fix the issue, following changes are proposed
Modify the metricbeat code to support schema
SELECT schema, digest_text, count_star, avg_timer_wait, max_timer_wait, last_seen, quantile_95
FROM performance_schema.events_statements_summary_by_digest
ORDER BY avg_timer_wait DESC
LIMIT 10
Modify the integration ingest pipeline to use schema additionally for fingerprint computation
MySQL performance datastream make use of the following query.
As part of the testing, it is noticed that when the same SQL statements are run across multiple database schema, especially when the database traffic is less, will lead to duplicate values for digest_text .
Reference : https://dev.mysql.com/doc/refman/8.0/en/performance-schema-statement-summary-tables.html
events_statements_summary_by_digest has SCHEMA_NAME and DIGEST columns. Each row summarizes events per schema and digest value. (The DIGEST_TEXT column contains the corresponding normalized statement digest text, but is neither a grouping nor a summary column.
After TSDB enablement, this would lead to data loss as there exists no schema identifier.
To fix the issue, following changes are proposed
Modify the metricbeat code to support
schema
SELECT schema, digest_text, count_star, avg_timer_wait, max_timer_wait, last_seen, quantile_95 FROM performance_schema.events_statements_summary_by_digest ORDER BY avg_timer_wait DESC LIMIT 10
Modify the integration ingest pipeline to use
schema
additionally for fingerprint computation