ossc-db / pg_hint_plan

Extension adding support for optimizer hints in PostgreSQL
Other
696 stars 103 forks source link

Parser to read multiple hints at different places in the query #101

Closed WinterUnicorn closed 1 year ago

WinterUnicorn commented 2 years ago

We have several cases where we need hints to be anywhere inside query and at the same time we need analysis to understand that hints are not part of string literals and we need to be able to disable some hints using nested comments.

We have compound queries composed from small queries with generated aliases that must be validly hinted too. The most convenient thing is to write hints directly in these parts of the queries when moving and merge hints to the top of the result query it is a too complex.

The new "hints_anywhere" flag that was introduced in the master branch is not suitable for us because it will disable validations (for example hints inside quoted literals or nested comments) and does not allow parse all hints in the query.

This pull request provides an parsing algorithm that extracts all hints in query (not just the first ones) with the following rules:

The parser does not check the syntax and proceeds with the assumption that current query is valid (In fact requests with invalid syntax do not reach the hint_planner hook). This allows us make only simple checks without deep validation.

This feature can be enabled by GUC pg_hint_plan.enable_state_hint_extractor.

michaelpq commented 1 year ago

This ticket is a new feature. We are prioritizing stability for the next release as there are a few bugs that need to be taken care of, so this will be dealt later, and probably only on master, meaning PostgreSQL 16~.

michaelpq commented 1 year ago

FWIW, while thinking about this patch, I am a bit confused that all this coded logic is actually necessary. This pretty much looks like psql's parsing logic that discards comments located inside query strings, no? Couldn't we use a smarter logic similar to psqlscan.l, then?

michaelpq commented 1 year ago

FWIW, while thinking about this patch, I am a bit confused that all this coded logic is actually necessary. This pretty much looks like psql's parsing logic that discards comments located inside query strings, no? Couldn't we use a smarter logic similar to psqlscan.l, then?

I have been discussing this issue with others, and we find this proposal confusing because it makes impossible the tracking of the relation context if it were to become a thing in this module. Note that Oracle has a different positioning policy, where hints are only allowed after SELECT, UPDATE, INSERT and DELETE so as the context is kept around. We cannot really do that without plugging in more of the PostgreSQL parser here, unfortunately.

Following that, it is pretty clear to me that this ought to be rejected. I have been working on polishing a patch that adds a proper lexer to pg_hint_plan to be able to extract the hints from the queries using the same parsing rules as PostgreSQL itself.