HypoPG / hypopg

Hypothetical Indexes for PostgreSQL
https://hypopg.readthedocs.io
Other
1.39k stars 59 forks source link

Hypopg is not working properly #13

Closed rafagalinari closed 5 years ago

rafagalinari commented 6 years ago

Hi Team, first of all thanks a lot to open this channel for outstanding issues.

We have input all data into the POWA repository but unfortunately Hypopg is not suggesting anything on queries. When looking into the postgresql Logs, sounds like it is not parsing the Constant values and we got an sintaxe error message.

We are under 10.1 version and the last POWA + Archivist + Hypopg version. We followed the install guide point to point with no issues.

Would you mind guys helping us on understanding this?

Thanks a lot. Rafael.

rafagalinari commented 6 years ago

ADD. We are not getting suggestions when pushing the button Optimizing Database and the error of sintaxe we saw it was exactly when pushing the button.

rjuju commented 6 years ago

Hi,

Thanks!

That's a strange issue. Can you get suggestion if you try on a per-query basis? Also, can you share some query example you're trying to get suggestions on, and the error logs?

rafagalinari commented 6 years ago

Hi, thanks for your prompt response. I think I figured out what happened:

There is no qualstats at all:

select count(1) from pg_qualstats where dbid=27140; count

 0

(1 row)

But we have the extension installed

axnrefi6=# \dx List of installed extensions Name | Version | Schema | Description --------------------+---------+------------+------------------------------------------------------------------- adminpack | 1.0 | pg_catalog | administrative functions for PostgreSQL btree_gist | 1.5 | public | support for indexing common datatypes in GiST dblink | 1.2 | public | connect to other PostgreSQL databases from within a database hypopg | 1.1.0 | public | Hypothetical indexes for PostgreSQL ocipg | 1.0 | ocipg | Oracle connection functions. pg_qualstats | 1.0.3 | public | An extension collecting statistics about quals pg_stat_kcache | 2.0.3 | public | Kernel statistics gathering pg_stat_statements | 1.5 | public | track execution statistics of all SQL statements executed pg_trgm | 1.3 | public | text similarity measurement and index searching based on trigrams plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language powa | 3.1.1 | public | PostgreSQL Workload Analyser-core

The question is: Why qualstat did not gathered info? Is there any further required grant?

Thanks again!

rjuju commented 6 years ago

Each time powa-archivist makes a snapshot, it'll reset the pg_qualstats entries, so if there hasn't been any activity since the last snapshot it's normal.

Also, by default pg_qualstats will sample only 1/max_connections queries. If you don't have a lot of queries run, probably pg_qualstats won't have any information.

You can see the actual stored information in the tables powa_qualstats_quals_history and powa_qualstats_quals_history_current.

Also, it seems that you installed all the powa extensions in the "axnrefi6" database, did you configure powa.database to point to it?

Finally, can you show the value of shared_preload_libraries?

rafagalinari commented 6 years ago

Many thanks again.

When you say powa pointing what do you mean? The queries executed during our workload are all appearing in POWA dashboard related to the axnrefi6 database.

-- Our shared_preload_libraries

Add settings for extensions here

shared_preload_libraries = 'pg_stat_statements,powa,pg_stat_kcache,pg_qualstats,hypopg' -- And others POWA related powa.frequency = 5min powa.retention = 7d

powa.coalesce = 1000

pg_qualstats.sample_rate = 1
pg_quallstats.enabled=true pg_qualstats.max=500 pg_qualstats.track_pg_catalog=true pg_qualstats.resolve_oids=true pg_qualstats.track_constants=true track_io_timing = on

And you were right about tables and storing, Thanks for sharing this.

But on this following example:

powa=# select count(1) from powa_qualstats_quals_history where dbid=27140; count

2248

powa=# select count(1) from powa_qualstats_constvalues_history_current where dbid=27140; count

232 (1 row)

powa=# select count(1) from powa_qualstats_constvalues_history_current where dbid=27140 and queryid=1367940420; count

 0

(1 row)

powa=# select count(1) from powa_qualstats_constvalues_history where dbid=27140 and queryid=1367940420; count

 0

This is the query we want and we need to tune. However, when we try to find the Constants used on it (Binds) it did'n find. And thats the part we see on log files related to Sintaxe error (Pushing the Optmizer button), like below:

2018-03-06 11:06:02.734 UTC [46966] ERROR: syntax error at or near ")" at character 288 2018-03-06 11:06:02.734 UTC [46966] STATEMENT: EXPLAIN SELECT ID, CASETYPE, CONTRACT, COUNTRYORGANIZATIONID, CREATEDBY, CREATIONTIMESTAMP, DESCRIPTION, EXPRESSION, KEYNAME, MEMBERID, OWNERROLE, PARTNERROLE, PROCESS, PROCESSSTEP, RULETYPE, STATUS, TASKROLE, LASTUPDATETIMESTAMP, LASTUPDATEDBY, VALUE FROM BRERULE WHERE ((((MEMBERID = ?) OR ((MEMBERID = ?) OR ((MEMBERID = ?) OR ((MEMBERID = ?) OR ((MEMBERID = ?) OR (((((((((MEMBERID = ?) OR (MEMBERID = ?)) OR (MEMBERID = ?)) OR (MEMBERID = ?)) OR (MEMBERID = ?)) OR (MEMBERID = ?)) OR (MEMBERID = ?)) OR (MEMBERID = ?)) OR (MEMBERID = ?))))))) AND (((PROCESS = ?) OR (PROCESS IS NULL)) AND (((PROCESSSTEP = ?) OR (PROCESSSTEP IS NULL)) AND (((OWNERROLE = ?) OR (OWNERROLE IS NULL)) AND (((TASKROLE = ?) OR (TASKROLE IS NULL)) AND ((CASETYPE = ?) OR (CASETYPE IS NULL))))))) AND (STATUS <> ?)) ORDER BY PROCESS ASC, PROCESSSTEP ASC, KEYNAME ASC

Many thanks on helping us understanding POWA.

Regards, Rafael

rafagalinari commented 6 years ago

ADD - That query on sintaxe error s exactly the query I was seeking for its constants on POWA dictionary.

rafagalinari commented 6 years ago

(Really sorry about the comment post. The format it is not working properly)

rjuju commented 6 years ago

So yes, this query has never been sampled by pg_qualstats. However the UI should handle this case and ignore queries that don't have any const recorded.

If you don't have a too high TPS rate, you could set "pg_qualstats.sample_rate = 1" in the postgresql.conf file, and reload the configuration. This way all queries will get sampled and you'll be able to use the wizard. You can set it back to -1 afterwards if sampling all queries causes too much overhead.

However, I think that if you really want to optimize this query, you should rewrite it, since any OR conditions can only be processed as multiple index scans and a BitmapOr of them.

rafagalinari commented 6 years ago

Hi, that is exactly what I am doing on the other ones BUT, the other ones I have bind values. I am more concerned on having samples than using the advisor.

In any case the sample rate was always set to 1. Since from the beginning...

Thanks a lot for your help.,

rjuju commented 6 years ago

Ah yes sorry I missed that.

pg_qualstats don't sample any OR-ed qualifier, since you can't make any assumption of the selectivity in such cases. I'm not sure sure how it'll behave with a mix of OR-ed and AND-ed qualifier as in this query.

If only one set of sample value is enough, pg_qualstats should track one original query string (the first encountered) for each distinct (queryid, userid, dbid) in shared memory (to be precise, the pg_qualstats.max most recent one), and they're not removed on a pg_qualstats_reset() call. You can use "select * from pg_qualstats_example_query(1367940420)" to retrieve it.

I'll try to reproduce this kind of query locally to see how pg_qualstats behave.

rjuju commented 6 years ago

Ok, so I tried to reproduce your table and query, and I do get records in pg_qualstats:

# select * from pg_qualstats_pretty;
 left_schema | left_table | left_column |   operator    | right_schema | right_table | right_column | occurences | execution_count | nbfiltered 
-------------+------------+-------------+---------------+--------------+-------------+--------------+------------+-----------------+------------
 public      | brerule    | casetype    | pg_catalog.=  | <NULL>       | <NULL>      | <NULL>       |          1 |               0 |          0
 public      | brerule    | memberid    | pg_catalog.=  | <NULL>       | <NULL>      | <NULL>       |         14 |               0 |          0
 public      | brerule    | ownerrole   | pg_catalog.=  | <NULL>       | <NULL>      | <NULL>       |          1 |               0 |          0
 public      | brerule    | process     | pg_catalog.=  | <NULL>       | <NULL>      | <NULL>       |          1 |               0 |          0
 public      | brerule    | processstep | pg_catalog.=  | <NULL>       | <NULL>      | <NULL>       |          1 |               0 |          0
 public      | brerule    | status      | pg_catalog.<> | <NULL>       | <NULL>      | <NULL>       |          1 |               0 |          0
 public      | brerule    | taskrole    | pg_catalog.=  | <NULL>       | <NULL>      | <NULL>       |          1 |               0 |          0
(7 rows)

Can you make manual checks to see if you get results or not? If you don't, I'm wondering if it's because of certain executor nodes that may not be handled by pg_qualstats.

rjuju commented 6 years ago

Hi @rafagalinari do you still have this issue?