SQL state [25001]; error code [0]; ERROR: cannot perform query with placements that were modified over multiple connections; nested exception is org.postgresql.util.PSQLException: ERROR: cannot perform query with placements that were modified over multiple connections
case 1: Modify a reference table over connection 1, later modify a distributed table over connection 1 and connection 2 in parallel. Finally, joining these two tables would fail because the executor cannot choose the accurate connection. When it need to execute the join between table_1_2 and ref_table_1, it cannot choose any connections. If it chooses connection 1, the modification to table_1_2 would not be visible. If it chooses the connection 2, the modification to ref_table_1 would not be visible. Instead, the executor bails out with an erro
use-case
Table A (ref), Table B (dist)...
1) connection1: modification Table A (reference)
2) connection1, connection2 (parallel): modification Table B (distributed)
! joining table A and B <-- cannot perform query with placements that were modified over multiple connections error
example
BEGIN;
-- make a modification over connection 1 to worker 1
INSERT INTO ref_table VALUES (1,'istanbul');
-- copy over connections 1 to table_1_1 and connection 2 to table_1_2 on worker 1
\COPY table_1 FROM STDIN WITH CSV
10, 'istanbul'
11, 'istanbul'
\.
-- ref_table is modified over one connection 1
-- table_1 is modified over two connections connection 1 and connection 2
-- so the executor is undecided, it cannot pick either connection 1 or connection 2
-- because table_1_2 is modified over connection 2
-- but ref_table_1 is modified over connection 1
-- so choosing either connection 1 or connection 2 would yield wrong
-- results
SELECT count(*) FROM table_1 JOIN ref_table USING(key);
ERROR: cannot perform query with placements that were modified over multiple connections
END;
background
https://github.com/citusdata/citus/issues/3426
case 1: Modify a reference table over connection 1, later modify a distributed table over connection 1 and connection 2 in parallel. Finally, joining these two tables would fail because the executor cannot choose the accurate connection. When it need to execute the join between table_1_2 and ref_table_1, it cannot choose any connections. If it chooses connection 1, the modification to table_1_2 would not be visible. If it chooses the connection 2, the modification to ref_table_1 would not be visible. Instead, the executor bails out with an erro
use-case
Table
A
(ref), TableB
(dist)...1)
connection1
: modification TableA
(reference) 2)connection1, connection2
(parallel): modification TableB
(distributed)! joining table
A
andB
<-- cannot perform query with placements that were modified over multiple connections errorexample