Closed onderkalaci closed 2 years ago
Thanks, I did not have to create a ticket :)
@onderkalaci I wanted to write the high level steps for how pg13 support was added: 1-> Enable postgres 14 in configure 2-> Copy ruleutils13 as ruleutils14 and apply the relevant changes from postgres14 on top of ruleutils14, (note that this diverges from the postgres ruleutils, see https://github.com/citusdata/citus/issues/3855 3-> Compile citus with postgres 14, to solve the compilation issues:
version_compat.h
as much as possible instead of inlining PG_VERSION_NUM
macros inside the code.4-> After Citus compiles with postgres 14, run the tests and for each failing test:
5-> After all tests pass, for each postgres 14 changelog:
pg14.sql
pg14.sql
.6-> Add postgres 14 test images to our CI
Some issues I hit while testing #5209
[x] Creating multi-column extended statistics crashes
CREATE TABLE t1 (c1 int, c2 int);
SELECT create_distributed_table('t1', 'c1');
CREATE STATISTICS extstat1_data1 ON MOD(c1, 10),
MOD(c2, 10) FROM t1 ;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
Time: 5.304 ms
[x] pg_stat_progress_copy doesn't show tuples_processed when COPY dist/ref table TO (Skip for now, a little involved)
[ ] pg_stat_progress_copy doesn't much when COPY dist/ref table FROM XXX (#5218)
[ ] pg_stat_progress_copy doesn't work well for local execution as we re-start many COPY commands in the same backend where the state is cleared
[x] We do not propagate compressed columns to the shards #5222
CREATE TABLE compress1(c1 TEXT COMPRESSION pglz, c2 TEXT COMPRESSION pglz) ;
SELECT create_distributed_table('compress1', 'c1');
\d+ compress1
Table "public.compress1"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+------+-----------+----------+---------+----------+-------------+--------------+-------------
c1 | text | | | | extended | pglz | |
c2 | text | | | | extended | pglz | |
Access method: heap
-- connect to worker -- check shard \d+ compress1_102113 Table "public.compress1_102113" Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description --------+------+-----------+----------+---------+----------+-------------+--------------+------------- c1 | text | | | | extended | | | c2 | text | | | | extended | | | Access method: heap
- [x] We do not propagate ALTER COLUMN .. SET COMPRESSION #5227
```SQL
ALTER TABLE compress1 ALTER COLUMN c2 SET COMPRESSION pglz ;
ERROR: alter table command is currently unsupported
CREATE TABLE t3 (key int, c1 numeric);
CREATE INDEX idx1_data1 ON t3 USING brin
(c1 numeric_bloom_ops (false_positive_rate = 0.05,
n_distinct_per_range = 100)) ;
select create_distributed_table('t3', 'key');
-- fails
CREATE INDEX idx1_data2 ON t3 USING brin
(c1 numeric_bloom_ops (false_positive_rate = 0.05,
n_distinct_per_range = 100)) ;
ERROR: citus currently doesn't support operator class parameters in indexes
Time: 1.177 ms
CREATE TABLE users_table_part(user_id bigint, value_1 int, value_2 int) PARTITION BY RANGE (value_1);
CREATE TABLE users_table_part_0 PARTITION OF users_table_part FOR VALUES FROM (0) TO (1);
CREATE TABLE users_table_part_1 PARTITION OF users_table_part FOR VALUES FROM (1) TO (2);
SELECT create_distributed_table('users_table_part', 'user_id');
alter table users_table_part detach partition users_table_part_0 concurrently;
ERROR: ALTER TABLE ... DETACH CONCURRENTLY cannot run inside a transaction block
CONTEXT: while executing command on localhost:9702
Time: 10.850 ms
[x] CURRENT_ROLE can be used instead of CURRENT_USER fails weirdly
GRANT CREATE ON SCHEMA sc1 TO CURRENT_ROLE;
ERROR: unexpected role type 1
[x] REINDEX on partitioned tables fail #5229
CREATE TABLE part1(c1 NUMERIC, c2 VARCHAR(10)) PARTITION BY RANGE (c1);
SELECT create_distributed_table('part1', 'c1');
CREATE TABLE part1v1 PARTITION OF part1 FOR VALUES FROM (0) TO (100000) ; CREATE TABLE part1v2 PARTITION OF part1 FOR VALUES FROM (100000) TO (200000) ;
CREATE INDEX idx1_part1 ON part1(c1) ; REINDEX TABLE part1 ;
ERROR: REINDEX TABLE cannot run inside a transaction block CONTEXT: while reindexing partitioned table "public.part1_102942" while executing command on localhost:9701 Time: 12.581 ms
- [x] Recursive CTE deparse problem #5234
```SQL
CREATE TABLE graph0(f INT, t INT, label TEXT);
SELECT create_distributed_table('graph0', 'f');
INSERT INTO graph0 VALUES (1, 2, 'arc 1 -> 2'),
(1, 3, 'arc 1 -> 3'), (2, 3, 'arc 2 -> 3'),
(1, 4, 'arc 1 -> 4'), (4, 5, 'arc 4 -> 5') ;
WITH RECURSIVE search_graph(f, t, label) AS (
SELECT * FROM graph0 g WHERE f = 1
UNION ALL
SELECT g.* FROM graph0 g, search_graph sg WHERE
g.f = sg.t and g.f = 1
) SEARCH DEPTH FIRST BY f, t SET seq
SELECT * FROM search_graph ORDER BY seq ;
ERROR: column reference "seq" is ambiguous
CONTEXT: while executing command on localhost:9701
WITH RECURSIVE search_graph(f, t, label) AS (
SELECT * FROM graph0 g WHERE f = 1
UNION ALL
SELECT g.* FROM graph0 g, search_graph sg WHERE g.f = sg.t and g.f = 1
) SEARCH BREADTH FIRST BY f, t SET seq
SELECT * FROM search_graph ORDER BY seq ;
ERROR: column reference "seq" is ambiguous
CONTEXT: while executing command on localhost:9701
Time: 19.813 ms
set citus.replication_model TO streaming ;
CREATE TABLE test (a int); SELECT create_distributed_table('test', 'a');
create or replace procedure proc_pushdown(dist_key integer, OUT created int4[])
language plpgsql
as $$
DECLARE
res INT := 0;
begin
INSERT INTO test VALUES (dist_key);
SELECT count(*) INTO res FROM test;
RAISE NOTICE 'Res: %', res;
created := created || res;
commit;
end;$$;
SELECT create_distributed_function('proc_pushdown(integer)', 'dist_key', 'test' );
-- make sure that metadata is synced, it may take few seconds SELECT bool_and(hasmetadata) FROM pg_dist_node;
-- pushdown the procedure to the node which does a multi-shard query CALL proc_pushdown(1, '{}'::int4[]); ... DEBUG: pushing down the procedure DEBUG: opening 1 new connections to localhost:9701 DEBUG: established connection to localhost:9701 for session 292 in 6415 microseconds NOTICE: issuing CALL public.proc_pushdown(1) DETAIL: on server onderkalaci@localhost:9701 connectionId: 188 ERROR: procedure public.proc_pushdown(integer) does not exist HINT: No procedure matches the given name and argument types. You might need to add explicit type casts. CONTEXT: while executing command on localhost:9701 ...
create or replace procedure proc_regular(dist_key integer, OUT created int4[])
language plpgsql
as $$
DECLARE
res INT := 0;
begin
INSERT INTO test VALUES (dist_key);
SELECT count(*) INTO res FROM test;
RAISE NOTICE 'Res: %', res;
created := created || res;
commit;
end;$$;
{1} (1 row)
- [ ] Add several regression tests (incomplete list)
- [x] jsonb subscripting: #5232
- [x] join alias
```SQL
CREATE TABLE data1(c1 int, c2 int);
CREATE TABLE data2(c1 int, c2 int);
SELECT create_distributed_table('data1', 'c1');
SELECT create_distributed_table('data2', 'c1');
SELECT d1.c1, x.c1, x.c2 FROM data1 d1 JOIN data2 d2
USING (c1, c2) AS x ;
-- router DISTINCT rollup
select c1, c2
FROM
data2
WHERE c1=1
group by distinct rollup(c1, c2), rollup(c1, c3);
-- multi row INSERTs with DEFAULT used to fail
-- see https://github.com/postgres/postgres/commit/17958972fe3bb03454a4b53756b29d65dc285efa
CREATE TABLE gtest1 (a int PRIMARY KEY, b int GENERATED ALWAYS AS (a * 2) STORED);
select create_distributed_table('gtest1', 'a');
INSERT INTO gtest1 VALUES (4, DEFAULT), (5, DEFAULT);
CREATE TABLE gtest1 (a int PRIMARY KEY, b int GENERATED ALWAYS AS (a * 2) STORED);
select create_distributed_table('gtest1', 'a');
INSERT INTO gtest1 VALUES (4, DEFAULT), (5, DEFAULT);
-- Allow table-qualified variable names in ON CONFLICT ... WHERE.
-- https://github.com/postgres/postgres/commit/6c0373ab77359c94b279c4e67c91aa623841af65
create table insertconflicttest(key int4 PRIMARY KEY, fruit text);
SELECT create_distributed_table('insertconflicttest', 'key');
insert into insertconflicttest as t values (23, 'Blackberry') on conflict (key) where fruit like '%berry' and t.fruit = 'inconsequential' do nothing;
-- Add support for infinity and -infinity values to the numeric data type (Tom Lane)
create table t1(a int, b numeric);
insert into t1 values (1, 'infinity');
select create_distributed_table('t1', 'b');
insert into t1 values (1, 'infinity') RETURNING *;
select count(*), b from t1 group by b1;
I am going through the differences in PG14 grammar, and this is a comment is a WIP. I want us to go through this exhaustive list and make sure we are ok with all the changes.
Please do not update the checklist, as I will constantly add some more items to the list, and I do not want to overwrite any of your changes.
<name>
DETACH PARTITION <partition_name>
[CONCURRENTLY]<name>
DETACH PARTITION <partition_name>
FINALIZE<name>
ALTER [COLUMN] <colname>
SET COMPRESSION <cm>
COMMENT ON <object> IS <text>
may allow more object kinds nowHi @onderkalaci , @hanefi - Did we complete this work? If so, should we close this issue?
This was completed in #5209
As PG 14 is on the horizon, we should make Citus work with PG 14.
As a development strategy, I think we should follow a similar approach to PG 13 integration, where any change is documented with a separate commit and the commit has a reference to the related change in PG: https://github.com/citusdata/citus/pull/3900/commits
So that the review becomes much easier