Open michaelpq opened 8 months ago
Thanks for the report @michaelpq
After we spoke offline, I spent time looking further into this. What I observed is the follows:
skipping over setup_scan_method_enforcement https://github.com/ossc-db/pg_hint_plan/blob/master/pg_hint_plan.c#L3955-L3956 does not prevent the setting up of parallel enforcement https://github.com/ossc-db/pg_hint_plan/blob/master/pg_hint_plan.c#L3986 I confirmed with some additional debugging statements.
Also, Testing the repro case on 1.4.1, I discovered some very inconsistent/odd parallel enforcement behavior that 684986a exposes in 1.4.2, but I do not think is the cause of. See the cases below.
postgres=# \dx
List of installed extensions
Name | Version | Schema | Description
--------------+---------+------------+------------------------------
pg_hint_plan | 1.4.1 | hint_plan |
plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
(2 rows)
set parallel_setup_cost=0;
set parallel_tuple_cost=0;
set min_parallel_index_scan_size to 0;
set min_parallel_table_scan_size=0;
set max_parallel_workers_per_gather To 4;
I have a repro of the test case mentioned in the thread.
postgres=# /*+
IndexScan(a bookings_total_amount_idx)
IndexScan(b bookings_tmp_not_exist)
Parallel(a 4 hard)
Parallel(b 2 hard)
*/
EXPLAIN(costs false) SELECT * FROM bookings a,bookings_tmp b
WHERE a.book_ref = b.book_ref and a.total_amount > 28000.00;
QUERY PLAN
-------------------------------------------------------------------------------
Nested Loop
Join Filter: ((a.book_ref)::text = (b.book_ref)::text)
-> Gather
Workers Planned: 4
-> Parallel Index Scan using bookings_total_amount_idx on bookings a
Index Cond: (total_amount > 28000.00)
-> Gather
Workers Planned: 2
-> Parallel Seq Scan on bookings_tmp b
(9 rows)
Here is where things get interesting. If only the Parallel hints are supplied, no parallel hint is enforced.
postgres=# /*+
Parallel(a 4 hard)
Parallel(b 2 hard)
*/
EXPLAIN(costs false) SELECT * FROM bookings a,bookings_tmp b
WHERE a.book_ref = b.book_ref and a.total_amount > 28000.00;
QUERY PLAN
----------------------------------------------------------
Nested Loop
Join Filter: ((a.book_ref)::text = (b.book_ref)::text)
-> Seq Scan on bookings a
Filter: (total_amount > 28000.00)
-> Seq Scan on bookings_tmp b
(5 rows)
I can force the parallel plan by keeping only index scans enabled. But, notice it's not a Parallel Seq scan on "b", but rather a Parallel Index Scan.
set enable_mergejoin = off;
set enable_seqscan = off;
set enable_hashjoin = off;
SET
SET
SET
postgres=# /*+
Parallel(a 6 hard)
Parallel(b 2 hard)
*/
EXPLAIN(costs false) SELECT * FROM bookings a,bookings_tmp b
WHERE a.book_ref = b.book_ref and a.total_amount > 28000.00;
QUERY PLAN
---------------------------------------------------------------------------
Nested Loop
Join Filter: ((a.book_ref)::text = (b.book_ref)::text)
-> Gather
Workers Planned: 6
-> Parallel Index Scan using bookings_pkey on bookings a
Filter: (total_amount > 28000.00)
-> Gather
Workers Planned: 2
-> Parallel Index Scan using bookings_tmp_pkey on bookings_tmp b
(9 rows)
It is clear that parallel enforcement, including the "hard" option, is not showing the expected results even in 1.4.1
thoughts?
Regards,
Sami
For now, I would like to choose a revert of https://github.com/ossc-db/pg_hint_plan/commit/684986acc3b3156aaf22c1571cd44dd34059a346 & friends on stable branches, getting back to the previous-still-somewhat-broken behavior.
With the above said, I do agree to revert the patch on stable branches, since the existing behavior, while not great, is what the users expect.
It is clear that parallel enforcement, including the "hard" option, is not showing the expected results even in 1.4.1.
Yeah. The fact that we would completely discard the Parallel hints when mixing more than one table sounds strange to me.
At this stage, I think that we'd better document all that in the shape of tests where one could look at a regression.diffs to see the difference generated when tweaking the code, or development is not going to scale well. I'm OK if we add comments like "strangely, this ignores both Parallel hints, perhaps these should be pushed under two Gather nodes". Or stuff like that.
I'll go put the stable branches in their previous stable with a revert, as I doubt that we've appreciated all the plan manipulations possible yet. Changing that in a major release is less or an issue to me. At this stage I'd rather prioritize plan stability to not impact users across these minor updated.
I agree. I'll submit a patch for Parallel Hint tests ( including those mentioned above ).
Attached are tests that cover several of the gaps discovered in this discussion:
1/ Ensure that Parallel enforcement does not break when a non-existent index is used in an IndexScan hint. 2/ Demonstrate that Parallel hints cannot be enforced on empty tables.
Thanks for the patch.
The patch you are proposing for the additional tests has no explanation about what's being done, like the interactions between the non-existing indexes and the hints. It may be good to document now with some big XXX that some of these plans are weird because of YYY.
I am not sure that it is a good idea to use and empty the existing tables t5 and t6 for these new tests. Perhaps splitting that into a new file would make more sense?
The patch you are proposing for the additional tests has no explanation about what's being done,
I hesitated doing that as it's not the norm to add much documentation to the tests, but that is not a good reason not to. I will add a better description.
I am not sure that it is a good idea to use and empty the existing tables t5 and t6 for these new tests. Perhaps splitting that into a new file would make more sense?
I am creating new tables for these tests in the sql/ut-init.sql file. I did not think it was worth adding a new init and test files for these few tests.
+CREATE TABLE s1.t5 (LIKE s1.t1 INCLUDING ALL);
+CREATE TABLE s1.t6 (LIKE s1.t1 INCLUDING ALL);
After second thought, showing that the non-existing index does not prevent parallel hint enforcement is not necessary. So, the test now shows the important part. For empty tables, parallel hints can only be enforced on index scans and not sequential scans.
Attached is the revised test.
Regards,
a3646e1b073c still exists on HEAD and now PG17. Now is the time to do a re-evaluation of what has been committed, knowing the reasons why the patch had to be backed out of past stable branches.
Looking at the discussion above, this patch exposed pre-existing odd behavior with parallel enforcement. Especially, if there are no rows in a table, parallel sequential scans cannot be enforced. This scenario is covered by the suggested test case [1]. Do you think there is value in these test?
In the case mentioned at the top of the thread, 1.4.1 is able to perform a parallel sequential scan on an empty table by "accident". Because an IndexScan hint is used on bookings_tmp_not_exist
, the sequential scan is disabled by pg_hint_plan ( enable_seqscan = off ). However, because bookings_tmp_not_exist
is not an existing index, this leaves Postgres with only the sequential scan access method. The use of parallel with the sequential scan is surprising. I was not able to reproduce the same plan by forcing parallelism.
set max_parallel_workers_per_gather To 4;
set parallel_setup_cost=0;
set parallel_tuple_cost=0;
set min_parallel_index_scan_size to 0;
set min_parallel_table_scan_size=0;
EXPLAIN SELECT * FROM bookings a,bookings_tmp b
WHERE a.book_ref = b.book_ref and a.total_amount > 28000.00;
I think this patch is a good improvement and I am not sure if the pre-existing behavior is the behavior that we want to maintain for the next major version release.
A user has reported a regression in the latest release series regarding subject, related to commit 684986acc3b3156aaf22c1571cd44dd34059a346 (applies to all stable branches). Imagine two simple tables:
Now here is an exotic case:
Under 1.4.1, we are getting this plan:
I think that this is not correct, for two reasons:
Now, under 1.4.2, we get this plan:
This is better in the fact that IndexScan(b bookings_tmp_not_exist) is marked as unused. Still, this is worse in the fact that the Parallel hint with the number of workers gets entirely ignored, leading to a regression.
I've been considering a couple of options about what to tweak in the case where an index restriction is applied, but my mind comes down to a few conclusions: