shatteredsilicon / ssm-submodules

GNU Affero General Public License v3.0
0 stars 2 forks source link

Add "Transaction Analytics" #291

Open gordan-bobic opened 1 month ago

gordan-bobic commented 1 month ago

This should be presented in a similar-ish way as Query Analytics, but instead of simply tracking query performance, it should track open transactions.

This cannot be extracted from logs, but it can be extracted from performance_schema, e.g.

-- UPDATE performance_schema.setup_consumers SET ENABLED = 'YES' WHERE NAME='events_statements_history'; 

SELECT ps.id as processlist_id,
             trx_started, trx_isolation_level,
             esh.TIMER_WAIT,
             esh.event_name as EVENT_NAME,
             esh.sql_text,
             esh.RETURNED_SQLSTATE,
             esh.MYSQL_ERRNO,
             esh.MESSAGE_TEXT,
             esh.ERRORS,
             esh.WARNINGS
FROM information_schema.innodb_trx trx
JOIN information_schema.processlist ps ON trx.trx_mysql_thread_id = ps.id
LEFT JOIN performance_schema.threads th ON th.processlist_id = trx.trx_mysql_thread_id
LEFT JOIN performance_schema.events_statements_history esh ON esh.thread_id = th.thread_id;

This could be polled once per second and pushed upstream. I don't think it would be aggregatable, but it should be significantly smaller than query data. It will show queries, that those queries should be stubbed into templates and added to query classes (if they are not already there) and referenced from there (so as not to be storing them twice, once for TAN and once for QAN).

It would require some careful updating logic and openness tracking, e.g. on consecutive calls you might get the same transaction open and unchanged a few times, then an extra query might get added, and it then transaction will be committed or rolled back and will disappear from the list - at that point it is finished and gone. Storing that for presentation in an efficient way could be tricky, though there are unique tx IDs that could be used (each transaction could list multiple queries).

I am also not sure if we would want this integrated in the the mysql:queries exporter or have a separate mysql:transactions exporter. Not sure what dependencies there could be between them, but I guess query classes are unique so it wouldn't matter which of the two would add any particular query class. For example, if we have long_query_time set above 0, slow log won't catch some queries, but the transactions exporter still might if they are a part of a transaction it sees.

Transactions should be grouped/ordered in the same way as QAN, (sum of total execution time, or average execution time * number of executions) i.e. overall "impact" in the server, with same secondary sorting that was recently added to QAN (e.g. longest first, to find the infrequent but long transactions easily).

It is also important to show transactions that may still be in progress, e.g. TAN might see a transaction that has been open for hours or days and is still open at latest harvest. That should be shown in the TAN list.

For the two graphs at the top, on TAN they should be "Database Server Activity" (bottom one from QAN), and "InnoDB Transactions" from the "MySQL InnoDB Metrics" dashboard. "History Length" is the particularly important bit of information for spotting times where something that needs troubleshooting was running.

TAN and QAN should have cross-clickable links between them so that if we see a query in TAN we can click over to QAN for the same query/time frame and look at additional query information there, and similarly in QAN we may want to see some minimal transaction info and a clickable like to TAN to see what transaction(s) that query may be a part of.