Closed gozdal closed 2 years ago
Thanks for this report. Yes, it was always the intention that alternate bloat queries be supported. While pgmetrics uses this one, there are other queries like this and this for example.
Will investigate this query you proposed, and see how it can be integrated.
As for collecting pg_catalog metrics, perhaps an option can be added, but will take it up as another issue/PR.
Is there a way to omit the bloat query from ./pgmetrics reports? I receive the error pretty quickly after execution, and have not been able to generate a report.
`pgmetrics: bloat query failed: context deadline exceeded`
@agolden3 There is not, currently. But I suppose it does make sense to add an omit option for bloat; it can be time consuming.
You can either try increasing the timeout using the -t option, or rebuild pgmetrics with this line commented out.
@agolden3 An option to omit the bloat query is now available in the just-released v1.13.0. Use `--omit=bloat`.
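For example, either of these invocations should avoid the timeout (database name is hypothetical; `-t` takes the timeout in seconds):

```shell
# Skip the expensive bloat query entirely (pgmetrics v1.13.0+)
pgmetrics --omit=bloat mydb

# Or keep bloat collection but raise the per-query timeout
pgmetrics -t 600 mydb
```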
I am trying to use `pgmetrics` on a big (10TB+), busy (1GB/s RW) database. It takes around 5 minutes for pgmetrics to run. I traced the problem to the "bloat query" spinning in CPU, doing no I/O, and ultimately to a bloated `pg_class` (the irony: `pgmetrics` does not collect bloat on `pg_catalog`):

`vacuum (full, analyze, verbose) pg_class;`

`pg_class` has so many dead rows because the workload is temp-table heavy (creating/destroying 1M+ temporary tables per day) and has long-running analytics queries that run for 24h+. The PG query planner assumes that an index scan on `pg_class` will be very quick and plans a Nested Loop with an Index Scan. However, the index scan has 7M dead tuples to filter out, and the query takes more than 200 seconds.

If I create a temp table from `pg_class` that contains only the live tuples, and run the bloat query on `pg_class_alive` instead of `pg_class`, it runs in 10s, 20x faster.

WDYT about adding such a hack (maybe under an option?) to `pgmetrics`?
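A minimal sketch of the temp-table workaround described above (the `pg_class_alive` name comes from the report; the index column and the way the bloat query probes the table are assumptions):

```sql
-- Copy only the live tuples of pg_class into a temp table.
-- A plain SELECT sees only live tuples under MVCC, so the copy
-- is free of the ~7M dead tuples that slow the index scan.
CREATE TEMP TABLE pg_class_alive AS
    SELECT * FROM pg_class;

-- Index it the way the bloat query is assumed to probe pg_class
-- (hypothetical; match whatever join column the query actually uses).
CREATE INDEX ON pg_class_alive (oid);

ANALYZE pg_class_alive;

-- ...then run the bloat query against pg_class_alive instead of pg_class.
```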