Closed petergeoghegan closed 8 months ago
One more thing: If you run BenchmarkSQL with more than about 1000 warehouses in Postgres, it's a very good idea to set backend_flush_after=2MB -- especially if you have also increased the duration of checkpoints (a much more obvious optimization).
If you're going to run the longer experiment that I suggested as a way of validating the PR, then I recommend running with that setting. It's pretty much a pure win for BenchmarkSQL, at least at that DB size. This should make the difference that I described clearer.
(The best argument against the current definition of the index is probably just the simple fact that it doesn't appear to ever be chosen by the optimizer with common Postgres + BenchmarkSQL configurations.)
I've noticed that the index
bmsql_oorder_idx1
isn't very useful on Postgres. In fact, it generally isn't used at all with more than about 1000 warehouses, at least with my optimizedpostgresql.conf
, which is used with Postgres git HEAD -- this can be confirmed by queryingpg_stat_user_indexes
. (I have communicated with @wieck about this privately.)It is possible to completely remove the current
bmsql_oorder_idx1
index without apparent downside on Postgres, which ensures that the orders table gets HOT updates -- clearly a big benefit. However, it would be even more beneficial (particularly for query latency) if we were to revise the index's definition instead, while accepting non-HOT updates as a cost worth paying to enable index only scans. The TPC-C workload seems to highly reward this approach. It's not like VACUUM does all that badly with keeping visibility map bits set. (Though it could do a lot better, which is something that I'm working to address in Postgres itself.)This approach is particularly beneficial for the ORDER_STATUS transaction, which can be confirmed fairly easily -- just run BenchmarkSQL with a couple of hundred warehouses for a few hours to see the difference (cannot confirm that it'll work as well on all Postgres versions, but I think it should work the same on Postgres 11+).
It will be even more informative if you can run with this revised index definition with 1000+ warehouse, for 6+ hours. That will allow you to see a systemic improvement. The broader improvement has a lot to do with the fact that VACUUM doesn't need to delete any index pages in this revised
bmsql_oorder_idx1
index -- whereas it has to delete a large number of them today. That's pretty pathological, just on its own.I realize that this change doesn't belong under
sql.common
-- it's really a vendor-specific optimization that applies only to PostgreSQL 11+. We can deal with that after basic validation. My guess is that the currentbmsql_oorder_idx1
index is actually a vendor-specific optimization for Oracle -- the current definition is not justified by the spec, and already seems roughly in the same spirit as this revised definition.Thanks