citusdata / citus

Distributed PostgreSQL as an extension
https://www.citusdata.com
GNU Affero General Public License v3.0
10.39k stars 661 forks source link

PG 16 support: Equivalence class changes break some queries #6895

Open onderkalaci opened 1 year ago

onderkalaci commented 1 year ago

As we plan ahead for PG 16 support, I realized that this commit https://github.com/postgres/postgres/commit/2489d76c4906f4461a364ca8ad7e0751ead8aa0d#diff-60dfe720257ddb41ac40a211ea8eb3ce37196ed01ab9102f37e75a75a66b996fR325-R337

Breaks some INSERT .. SELECT queries, such as

 INSERT INTO agg_results_second(user_id, value_2_agg)
SELECT user_id, value_2 FROM users_table as car WHERE
  user_id = 1 AND
   NOT EXISTS (SELECT user_id FROM events_table as foo WHERE user_id=car.user_id);

ERROR:  cannot perform distributed planning for the given modification
DETAIL:  Insert query cannot be executed on all placements for shard 102072

Because now, we cannot deduct that user_ids are always equal due to anti join.

Instead, PG16 decides that all user ids are equalivant to constant 1. That breaks our equivalence checks.

This is not a problem for router queries, as they are still concluded as push-down. However, INSERT .. SELECT does not have router logic, hence equivalence checks are broken.

Though, it is surpising that we can still pushdown:

 explain SELECT user_id, value_2 FROM users_table as car WHERE
   NOT EXISTS (SELECT user_id FROM events_table as foo WHERE user_id=car.user_id);

So, needs to double check what is going on.

onderkalaci commented 1 year ago

could be relevant: https://github.com/citusdata/citus/pull/6772

If the only thing that is broken is single shard INSERT .. SELECT, we could consider getting this in

tejeswarm commented 1 year ago

@onderkalaci Did this change break any of our existing regressions tests? I don't see any failure in the CI run of the PR we are reviewing.

tejeswarm commented 1 year ago

agg_results_second

I just realized we are not running pg-16 tests in the CI

tejeswarm commented 1 year ago

@onderkalaci is this test causing this issue? sql/multi_behavioral_analytics_single_shard_queries.sql

onderkalaci commented 1 year ago

@onderkalaci is this test causing this issue? sql/multi_behavioral_analytics_single_shard_queries.sql

I think at least that one. But @naisila also noted few other tests here: https://github.com/citusdata/citus/issues/7017

See could be related to https://github.com/citusdata/citus/issues/6895 line at the bottom of the issue description, which points to the potential other failures