powa-team / powa-web

PoWA user interface
http://powa.readthedocs.io/
73 stars 31 forks source link

Wrong index suggestion on UPDATE statement #151

Closed arnobnq closed 2 years ago

arnobnq commented 2 years ago

Hello,

While analyzing index suggestions from hypopg on queries executed by a typical pgbench benchmark, we noticed something strange on the following UPDATE statement :

UPDATE pgbench_accounts SET abalance = abalance + ? WHERE aid = ?;

Powa suggests after clicking "Optimize database! " that we create an index on the aid column of the pgbench_accounts table. That doesn't make sense because this column is a primary key, hence it is already indexed :

optimize_cropped

This suggestion does not appear in the "Predicates" tab of the query analysis though :

predicates_cropped

Steps to reproduce :

Using powa version 4.1.2 :

createdb pgbench
psql -d pgbench -c 'CREATE EXTENSION hypopg;'
pgbench -i pgbench
pgbench -c 1 -T 900 pgbench

After a few minutes of the benchmark running, clicking the "Optimize database" makes this suggestion appear.

Is this a bug or misconfiguration?

Thanks!

rjuju commented 2 years ago

Hi,

Yes, that's not surprising and the reason for that is your dataset size.

The index suggestion is only based on pg_qualstats information, not the catalog. So I think that with a scale factor of 1 (since you didn't specify a scale factor in your pgbench -i command), the table is so small that the index for the primary isn't used, and that's detected by powa. In other scenario it could also be due to an index not existing at the time the query was run but could have been created since.

Clearly that's something that could be improved, but in general you shouldn't blindy create any index suggested by any tool. Also, if you want to check some index suggestion tool you should definitely use a realistic dataset, as otherwise the result are not going to make much sense anyway.

arnobnq commented 2 years ago

Thanks a lot for your answers.

I have increased my scale factor to 100 just to be safe, but got similar results, not sure that is the issue here. Your second guess, on the other hand, was spot on. I did have this index missing at one point because I wanted to see if powa suggested to create an index after dropping the primary key. I then added a primary key again and ran further benchmarks, that's when the issue occurred.

Such alterations of the schema could be rather common due to performance tuning. What do you suggest as a way to work around this issue? Should one just keep powa.retention low to prevent false positives from past states of the schema?

rjuju commented 2 years ago

Ah I see. It's unclear to me why you have a different result on both pages then. Are both using the same time interval?

Such alterations of the schema could be rather common due to performance tuning. What do you suggest as a way to work around this issue? Should one just keep powa.retention low to prevent false positives from past states of the schema?

It's usually not a good idea to drop and recreate indexes. If you want to see how your application would behave without an index, you can use this extension: https://github.com/postgrespro/plantuner to "hide it" for a specific session / user / database without actually dropping it. That being said the result would be the same for powa, as it would see sequential scan access vs index scan.

The best way to avoid this should be to use the datetime picker and make sure that you display 2 different and non overlapping time ranges.

arnobnq commented 2 years ago

Thank you very much for these inputs. We will work around the issue by configuring the time interval properly, as you suggested.