HypoPG / hypopg

Hypothetical Indexes for PostgreSQL
https://hypopg.readthedocs.io
Other
1.39k stars 59 forks source link

document how to enable hypo_use_real_oids #65

Closed piracio closed 2 years ago

piracio commented 2 years ago

Hi, I am trying to find how to enable the variable hypo_use_real_oids, the documentation mention this variable however it is unclear how to enable it.

rjuju commented 2 years ago

Hi,

hypopg.use_real_oids, like other documented configuration parameters are simply postgres configuration parameters, and can be configured like any other parameter. You can see an example on how to configure this parameter in the regression tests at https://github.com/HypoPG/hypopg/blob/REL1_STABLE/test/sql/hypopg.sql#L106:

SET hypopg.use_real_oids = on;

Do you have any specific reason to prefer using the real oid generator?

Note that since the "normal" hypopg usage is to indirectly load it when using the hypopg_create_index() function (rather than issuing an explicit LOAD 'hypopg'; for instance), the hypopg parameters don't even exist until that function is called. So while the above query would work (as setting a custom parameter is valid), you won't be able to rely on tab completion until hypopg has been loaded.

e-Evolution commented 2 years ago

Hi @rjuju , we tried HyPopg from 1.3.1 version and PG 14.4 , and MacOS aarm and I got this error:

context : we are executed dexter with HyPopg

Processing 31 new query fingerprints Index found: adempiere.ad_column (iskey) Processing 165 new query fingerprints ERROR: PG::InternalError: ERROR: hypopg: not more oid available HINT: Remove hypothetical indexes or enable hypopg.use_real_oids

image

Do you have hint to solve?

rjuju commented 2 years ago

Well, on pg15 there's room for having more than 2500 hypothetical indexes at the same time. It seems that the software you're using has some bruteforce approach on index suggestion, creating a lot of hypothetical indexes. It doesn't sound like a good idea and probably isn't efficient.

If you really want to keep using that tool I guess you don't have other choice than turning on use_real_oids and use it on your primary node.

You could also try pg_qualstats' index advisor, see https://github.com/powa-team/pg_qualstats/#functions:

pg_qualstats_index_advisor(min_filter, min_selectivity, forbidden_am): Perform a global index suggestion. By default, only predicates filtering at least 1000 rows and 30% of the rows in average will be considered, but this can be passed as parameter. You can also provide an array of index access method if you want to avoid some. For instance, on PostgreSQL 9.6 and prior, hash indexes will be ignored as those weren't crash safe yet.

Example:


SELECT v
FROM json_array_elements(
pg_qualstats_index_advisor(min_filter => 50)->'indexes') v
ORDER BY v::text COLLATE "C";
v
---------------------------------------------------------------
"CREATE INDEX ON public.adv USING btree (id1)"
"CREATE INDEX ON public.adv USING btree (val, id1, id2, id3)"
"CREATE INDEX ON public.pgqs USING btree (id)"
(3 rows)

SELECT v FROM json_array_elements( pg_qualstats_index_advisor(min_filter => 50)->'unoptimised') v ORDER BY v::text COLLATE "C"; v

"adv.val ~~* ?" (1 row)



It relies on another extension, but use your real workload to find missing indexes rather than bruteforcing them, and tries to suggest multi-column indexes when possible to avoid creating too many indexes.
e-Evolution commented 2 years ago

@rjuju Thank you , I tried added the hypopg.use_real_oids = on into my postgresql.conf and is working fine.

I wonder what is the best approach , just I saw your answer now , I will try your recommendation.

Thanks you very much, for your quick response.

piracio commented 2 years ago

Thanks, the software that we are using is dexter and probably is doing force brute approach.

rjuju commented 2 years ago

Good news!

I don't really know how dexter is working, but if it only relies on postgres planner using a hypothetical indexes to decide if such an index should be created, it seems dangerous: if there's a similar existing index, there's no guarantee that this one will be picked by postgres rather than the hypothetical index. If anything, postgres more likely to do the opposite, as the real index will probably be bloated, and thus bigger, so more expensive than the hypothetical one (which may also be under estimated).

For pg_qualstats, I wrote some article to explain its approach: https://rjuju.github.io/postgresql/2020/01/06/pg_qualstats-2-global-index-advisor.html which also covers the caveats and limitations.

piracio commented 2 years ago

Thanks for all the information, I am playing with automatic index creation tools as I am workign improving the performance of Adempiere. so far is going OK, now I will deep dive into dexter and the tools that you just provide, thanks again. I will close this question. Saludos, desde New Zealand de un Chileno.