charles-001 / dolphie

Your single pane of glass for real-time analytics into MySQL/MariaDB & ProxySQL
https://pypi.org/project/dolphie/
GNU General Public License v3.0
589 stars 36 forks source link

Feature request: Metadata locks panel #49

Closed charles-001 closed 6 months ago

charles-001 commented 7 months ago

I'm looking to add a metadata lock panel and think this query works well - anyone have any thoughts on if this is the right approach?

SELECT
    OBJECT_INSTANCE_BEGIN AS id,
    OBJECT_TYPE,
    OBJECT_SCHEMA,
    OBJECT_NAME,
    LOCK_TYPE,
    LOCK_STATUS,
    SOURCE,
    PROCESSLIST_ID,
    PROCESSLIST_USER,
    PROCESSLIST_TIME,
    PROCESSLIST_INFO
FROM
    `performance_schema`.`metadata_locks` mlb JOIN
    `performance_schema`.`threads` t ON mlb.OWNER_THREAD_ID = t.THREAD_ID
WHERE
    NOT (OBJECT_TYPE = 'TABLE' AND LOCK_STATUS = 'GRANTED') AND
    OBJECT_TYPE != 'COLUMN STATISTICS'
ORDER BY
    PROCESSLIST_TIME DESC

This is what the result looks like:

Screenshot 2024-02-22 at 8 14 09 PM
aadant commented 6 months ago

I think you should add the thread name. It is interesting to know it is a replication thread versus a regular connection.

The first filter is not obvious, if the lock was granted and is held by the thread, it is interesting to know it is holding that lock for long time (example a python notebook opening a transaction by mistake !). Maybe sort by age desc and allow to filter / sort on the table name ?

aadant commented 6 months ago

@lefred please chime in :)

charles-001 commented 6 months ago

@aadant - good idea on showing the thread name. I've added that under Thread Source column. I'm stripping out the thread/sql, thread/innodb to conserve space.

As for showing all GRANTED locks for TABLE, I don't think that's feasible due to how many rows will be created. On my servers for instance, I can have 2,000 of them that are constantly changing each refresh interval. That isn't digestible for anyone. I don't really want to create filters for this panel as I don't think it's needed, but if you'd like to take a jab at working w/ Dolphie's code, you can. For now, I think a good compromise is to change the WHERE clause to something like this to make the panel digestible:

NOT (OBJECT_TYPE = 'TABLE' AND LOCK_STATUS = 'GRANTED' AND PROCESSLIST_TIME <= 2)

This will do what I originally wanted while showing the long-held locks like you want. Thoughts?

aadant commented 6 months ago

What about grouping them by thread id and lock type, status and group concat for tables ?

charles-001 commented 6 months ago

Not a bad idea, but Dolphie's table won't be able to show a lot of the tables concatted. I just tried doing it and GROUP_CONCAT(DISTINCT OBJECT_NAME) had 10 tables in it. Unfortunately, that many characters doesn't play well in a terminal table.

charles-001 commented 6 months ago

I changed the WHERE clause to this which I think limits it a bit better for filtering out spam:

NOT (
    OBJECT_TYPE = 'TABLE' AND
    LOCK_STATUS = 'GRANTED' AND
    LOCK_TYPE LIKE 'SHARED%' AND
    PROCESSLIST_TIME <= 2
) AND

I can't see a reason wanting to see any metalocks that relate to the above information below 2 seconds. Can you?

charles-001 commented 6 months ago

@aadant - I also just added the filters that the processlist panel uses. I didn't realize how easy it'd be ;)

charles-001 commented 6 months ago
SELECT
    ANY_VALUE(OBJECT_INSTANCE_BEGIN) AS id,
    OBJECT_TYPE,
    ANY_VALUE(OBJECT_SCHEMA) AS OBJECT_SCHEMA,
    CASE
        WHEN COUNT(DISTINCT OBJECT_NAME) = 1 THEN ANY_VALUE(OBJECT_NAME)
        WHEN COUNT(DISTINCT OBJECT_NAME) = 0 THEN NULL
        ELSE CONCAT(COUNT(DISTINCT OBJECT_NAME), ' objects')
    END AS OBJECT_NAME,
    LOCK_TYPE,
    LOCK_STATUS,
    ANY_VALUE(SOURCE) AS CODE_SOURCE,
    ANY_VALUE(NAME) AS THREAD_SOURCE,
    ANY_VALUE(PROCESSLIST_ID) AS PROCESSLIST_ID,
    ANY_VALUE(PROCESSLIST_USER) AS PROCESSLIST_USER,
    ANY_VALUE(PROCESSLIST_TIME) AS PROCESSLIST_TIME,
    ANY_VALUE(PROCESSLIST_INFO) AS PROCESSLIST_INFO
FROM
    `performance_schema`.`metadata_locks` mlb JOIN
    `performance_schema`.`threads` t ON mlb.OWNER_THREAD_ID = t.THREAD_ID
WHERE
    NOT (
        OBJECT_TYPE = 'TABLE' AND
        LOCK_STATUS = 'GRANTED' AND
        LOCK_TYPE LIKE 'SHARED%' AND
        PROCESSLIST_TIME <= 2
    ) AND
    OBJECT_TYPE != 'COLUMN STATISTICS'
GROUP BY
    THREAD_ID,
    OBJECT_TYPE,
    LOCK_TYPE,
    LOCK_STATUS

This seems to really cut down on garbage and from my testing, looks good.

charles-001 commented 6 months ago

This is now in v4.3.0

aadant commented 6 months ago

I see Object Name : 2 objects. No added value really :-)

If I run the query

select * from performance_schema.metadata_locks where owner_thread_id in (select thread_id from performance_schema.threads where name like '%worker%');

I can see which tables are being written by replication (maybe a different panel ?)

aadant commented 6 months ago

you can use group_concat and trim it, it is better than N objects.

charles-001 commented 6 months ago

@aadant - yes, I do that because that field will only allow one table to be listed due to space constraint. If there's only one, Dolphie will show it, but if there's more than one, it shows the count of objects as you see.

If you absolutely need to see that, then logging into a mysql client is probably best for more in-depth troubleshooting.

aadant commented 6 months ago

There are several issues in this UI design

the column name should match the performance schema. Rather than hiding information or tranform it you should display it. Maybe increase the column or just allow duplicates for the same thread (same color per thread id) with a filter to display replication threads only.

this is particularly useful to track metadata locks

aadant commented 6 months ago

I mean filter by thread name, table name

charles-001 commented 6 months ago

I've updated the datatable to now GROUP_CONCAT the values for object_name and list each of them on a new line for the row with a set height. This lets me be able to populate only 1 row, but with multiple lines for the object_name so the datable isn't filled with the same text for each row except the object_name. I also made the datatable zebra-striped so it's easier to read. I'll look to release this change in the next release which will probably be on Sunday night.

In the meantime, if you want to test this change @aadant, you can by cloning this repo, running poetry install, then poetry run dolphie

charles-001 commented 6 months ago

I've pushed the new version out @aadant so you won't need to run it via poetry to test my changes.