Closed Bensk1 closed 4 years ago
This seems to work better for a scale factor of 1. The multi-column index is estimated to have roughly 430 MB vs 150 MB for the single-column index.
Hello,
Oh, that's interesting. Did you run an analyze on the table?
Could you run this query and show the output?
select attname, avg_width from pg_stats where tablename = 'lineitem';
Yes, I did run analyye
and also tried with analyze lineitem
.
The result of your query:
tpch___10=# select attname, avg_width from pg_stats where tablename = 'lineitem';
attname | avg_width
-----------------+-----------
l_receiptdate | 4
l_tax | 4
l_extendedprice | 8
l_partkey | 4
l_orderkey | 4
l_suppkey | 4
l_shipdate | 4
l_returnflag | 2
l_quantity | 5
l_linenumber | 4
l_shipmode | 11
l_discount | 4
l_shipinstruct | 26
l_comment | 27
l_linestatus | 2
l_commitdate | 4
(16 rows)
Thanks! I think this is simply an overflow, and I unfortunately never tested an estimated size bigger than 4GB.
I'll look into it. BTW it's already clear that the estimated size will be twice as big as your index, but this is intended, as it tries to account for bloat that will happen once you start updating the index.
This should be fixed as of https://github.com/HypoPG/hypopg/commit/2e8cca5fa2fc8ce30e672e7ebbdc6f35a0e72946 or https://github.com/HypoPG/hypopg/commit/b41bd64e6956f77a1d37474f0cac1aab612a72d1, depending on whether you're using v1 or v2.
Thanks a lot for the report!
Thank you so much for your quick response and fix.
I have a TPC-H data set with a scale factor of 10 in Postgres 12 (12.1 (Ubuntu 12.1-1.pgdg19.10+1). If I create a multi-column index, it is in certain cases larger than a single-column index:
The results are the same if I exchange
l_partkey
, withl_orderkey
. If I physically create these indexes viaCREATE INDEX
, the multi-column index has 2.2 and the single-column 1.1 GB.