pgsql-io / benchmarksql

A TPC-C like test tool
29 stars 11 forks source link

Make index on orders table much more useful on Postgres #16

Closed petergeoghegan closed 8 months ago

petergeoghegan commented 3 years ago

I've noticed that the index bmsql_oorder_idx1 isn't very useful on Postgres. In fact, it generally isn't used at all with more than about 1000 warehouses, at least with my optimized postgresql.conf, which is used with Postgres git HEAD -- this can be confirmed by querying pg_stat_user_indexes. (I have communicated with @wieck about this privately.)

It is possible to completely remove the current bmsql_oorder_idx1 index without apparent downside on Postgres, which ensures that the orders table gets HOT updates -- clearly a big benefit. However, it would be even more beneficial (particularly for query latency) if we were to revise the index's definition instead, while accepting non-HOT updates as a cost worth paying to enable index only scans. The TPC-C workload seems to highly reward this approach. It's not like VACUUM does all that badly with keeping visibility map bits set. (Though it could do a lot better, which is something that I'm working to address in Postgres itself.)

This approach is particularly beneficial for the ORDER_STATUS transaction, which can be confirmed fairly easily -- just run BenchmarkSQL with a couple of hundred warehouses for a few hours to see the difference (cannot confirm that it'll work as well on all Postgres versions, but I think it should work the same on Postgres 11+).

It will be even more informative if you can run with this revised index definition with 1000+ warehouse, for 6+ hours. That will allow you to see a systemic improvement. The broader improvement has a lot to do with the fact that VACUUM doesn't need to delete any index pages in this revised bmsql_oorder_idx1 index -- whereas it has to delete a large number of them today. That's pretty pathological, just on its own.

I realize that this change doesn't belong under sql.common -- it's really a vendor-specific optimization that applies only to PostgreSQL 11+. We can deal with that after basic validation. My guess is that the current bmsql_oorder_idx1 index is actually a vendor-specific optimization for Oracle -- the current definition is not justified by the spec, and already seems roughly in the same spirit as this revised definition.

Thanks

petergeoghegan commented 3 years ago

One more thing: If you run BenchmarkSQL with more than about 1000 warehouses in Postgres, it's a very good idea to set backend_flush_after=2MB -- especially if you have also increased the duration of checkpoints (a much more obvious optimization).

If you're going to run the longer experiment that I suggested as a way of validating the PR, then I recommend running with that setting. It's pretty much a pure win for BenchmarkSQL, at least at that DB size. This should make the difference that I described clearer.

(The best argument against the current definition of the index is probably just the simple fact that it doesn't appear to ever be chosen by the optimizer with common Postgres + BenchmarkSQL configurations.)