Closed bagafoot closed 2 years ago
@bagafoot this is expected – that's why it's called estimate.
pgstattuple performs scans, it's slow and causes intensive IO on large tables/indexes,
while the "estimated" approach is doing the job very fast. But sometimes quite wrong, yes. Simple example when it's wrong:
check_bloat=# create table t as select i::int4, 2::int8, 3::int4 from generate_series(1, 10000000) i;
SELECT 10000000
check_bloat=# vacuum analyze t;
VACUUM
check_bloat=# :dba
Menu:
0 – Node & current DB information: master/replica, lag, DB size, tmp files, etc.
1 – Databases: size, stats
2 – Tables: table/index/TOAST size, number of rows
3 – Load profile
a1 – Current activity: count of current connections grouped by database, user name, state
b1 – Table bloat (estimated)
b2 – B-tree index bloat (estimated)
b3 – Table bloat (requires pgstattuple; expensive)
b4 – B-tree indexes bloat (requires pgstattuple; expensive)
b5 – Tables and columns without stats (so bloat cannot be estimated)
e1 – Extensions installed in current DB
i1 – Unused and rarely used indexes
i2 – Redundant indexes
i3 – FKs with Missing/Bad Indexes
i4 – Invalid indexes
i5 – Cleanup unused and redundant indexes – DO & UNDO migration DDL
l1 – Lock trees (leightweight)
l2 – Lock trees, detailed (based on pg_blocking_pids())
p1 – [EXP] Alignment padding: how many bytes can be saved if columns are reordered?
s1 – Slowest queries, by total time (requires pg_stat_statements)
s2 – Slowest queries report (requires pg_stat_statements)
t1 – Postgres parameters tuning
v1 – Vacuum: current activity
v2 – Vacuum: VACUUM progress and autovacuum queue
q – Quit
Type your choice and press <Enter>:
b1
Is N/A | Table | Size | Extra | Bloat estimate | Live | Last Vaccuum | Fillfactor
--------+-------+--------+-----------------+-----------------+---------+---------------------+------------
| t | 498 MB | ~77 MB (15.43%) | ~77 MB (15.43%) | ~421 MB | 2022-01-04 11:28:12 | 100
(1 row)
-- here we have "gaps" in storage due to alignment padding (see, for example, here: https://docs.gitlab.com/ee/development/ordering_table_columns.html) and b1 thinks we have 15% bloat, while it's 0 on the fresh table.
If you have ideas on how to improve it – definitely welcome. Closing for now because the behavior is as expected and the report is marked with the word "estimated".
Hello, I'm comparing bloate results between pgstattuple and posgres_dba and its varying a lot. results from same table
pgstattuple: 1.93% postgrtesq_dba: 38.77%
commands: pgstattuple: SELECT pg_size_pretty(pg_relation_size('test_table')) as table_size, (pgstattuple('test_table')).dead_tuple_percent;
:dba: b1