citusdata / citus

Distributed PostgreSQL as an extension
https://www.citusdata.com
GNU Affero General Public License v3.0

Postgres 12 related issues #2869

Closed by serprex 5 years ago

serprex commented 5 years ago

PRs:

Test new features:

Feature Works? Comment
Inline CTEs Yes used throughout test suite
Foreign keys to partitioned tables ? Hard to figure out how to satisfy both the Postgres FK requirements and the Citus requirements. Citus has some poorly worded error messages here, like "Foreign keys are supported in two cases, either in between two colocated tables including partition column in the same ordinal in the both tables or from distributed to reference tables", and another one implying that foreign keys must be to distributed tables
JSON path operations Yes
Generated columns No create_distributed_table ERROR: cannot use column reference in DEFAULT expression. Created #2879 to add support
Pluggable table storage Rejects Tested with blackhole_am, ERROR: only heap AM is supported. heapam.c in pg throws this. They'd like to eventually make it an assert, so not an error we should rely on in the long run
WHERE clause in COPY FROM No #2878 We execute the COPY ignoring the WHERE. Need to update CheckCopyPermissions. Add whereClause to copyState from BeginCopyFrom
REINDEX CONCURRENTLY Yes Not sure what needs to be checked re locking; only observed being able to query from a distributed table, where without CONCURRENTLY it'd block
index progress Yes Also working during REINDEX CONCURRENTLY
COMMIT AND CHAIN / ROLLBACK AND CHAIN Yes The ROLLBACK variant seems not to work in general, even without citus installed at all
VACUUM TRUNCATE INDEX_CLEANUP SKIP_LOCKED Yes See specific commit in #2844 for commands/vacuum.c
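For reference, the pg12 COPY ... WHERE form tested above looks like this (a sketch; the table and input file path are hypothetical):

```sql
-- pg12 syntax from the "WHERE clause in COPY FROM" row above;
-- the 'test' table and the file path are illustrative
CREATE TABLE test (x int);
COPY test FROM '/tmp/input.csv' WITH (FORMAT csv) WHERE x > 100;
```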
onderkalaci commented 5 years ago

Can we add regression tests (or point to existing regression tests) for the things you've tried to #2844?

serprex commented 5 years ago

#2844 only has the objective to implement pg12 support for existing features. I'll be creating separate issues for each specific feature

onderkalaci commented 5 years ago

Maybe also add VACUUM's new options TRUNCATE/INDEX_CLEANUP so we don't lose track
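For reference, the new pg12 VACUUM options mentioned here (a sketch; the table name is illustrative):

```sql
-- pg12 boolean options: INDEX_CLEANUP and TRUNCATE toggle those vacuum
-- phases, SKIP_LOCKED skips relations that cannot be locked immediately
VACUUM (INDEX_CLEANUP FALSE, TRUNCATE FALSE, SKIP_LOCKED) test;
```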

onderkalaci commented 5 years ago

We should probably add some tests with explicitly MATERIALIZED CTEs, both for router queries, which @serprex worked on in the previous release, and for multi-shard queries

WITH s1 AS MATERIALIZED (SELECT FROM users_table WHERE user_id = 1) SELECT FROM s1;

and

WITH s1 AS MATERIALIZED (SELECT FROM users_table) SELECT FROM s1;

just for completeness

onderkalaci commented 5 years ago

I've gone over the new features; please see my [IMPORTANT] tags below, we should address those.

Most of my comments below are examples which I think would be valuable to have as regression tests. They are just sanity checks that things work fine.

0) WHERE clause in COPY FROM [IMPORTANT]: As I noted in the PR, we should error out for now because it behaves incorrectly at the moment. https://github.com/citusdata/citus/pull/2844/files#r314717100

1) REINDEX CONCURRENTLY

CREATE TABLE test (x INTEGER);
INSERT INTO test SELECT generate_series(1, 1000);
SELECT create_distributed_table('test', 'x');

set citus.log_remote_commands TO ON;
SET client_min_messages TO DEBUG4;

CREATE INDEX i_test ON test (x);

---- [IMPORTANT]: Citus does not propagate REINDEX CONCURRENTLY!
-- this only happens on the coordinator table
-- and we should probably either error out or fully implement it
REINDEX INDEX CONCURRENTLY i_test;

REINDEX TABLE test;
REINDEX SCHEMA public;
REINDEX DATABASE postgres;
REINDEX DATABASE "ec2-user";

2) plan_cache_mode and prepared statements

CREATE TABLE test_table (id int, sum int, avg float);

SELECT create_distributed_table('test_table', 'id');

INSERT INTO test_table SELECT (i * random())::int % 10000, (i * random())::int % 10000, 
                                ((i * random())::int % 10000) * random() FROM generate_series(0, 1000000) i;

PREPARE p1 (INTEGER) AS SELECT sum(sum) FROM test_table WHERE id = $1;

SET plan_cache_mode TO 'auto';

EXECUTE p1(1);
EXECUTE p1(2);
EXECUTE p1(3);
EXECUTE p1(4);
EXECUTE p1(5);
EXECUTE p1(6);
EXECUTE p1(6);

SET plan_cache_mode TO 'force_generic_plan';

-- OOPS FAILURE
-- [IMPORTANT]: Can we not error out here?
EXECUTE p1(1);
ERROR:  could not create distributed plan
DETAIL:  Possibly this is caused by the use of parameters in SQL functions, which is not supported in Citus.
HINT:  Consider using PL/pgSQL functions instead.

SET plan_cache_mode TO 'force_custom_plan';

EXECUTE p1(1);
EXECUTE p1(2);
EXECUTE p1(3);
EXECUTE p1(4);
EXECUTE p1(5);
EXECUTE p1(6);
EXECUTE p1(6);

3) CTE Materialization: I think you should consider adding these tests or others to the regression tests

-- some sample tables
CREATE TABLE single_hash_repartition_first (id int, sum int, avg float);
CREATE TABLE single_hash_repartition_second (id int, sum int, avg float);
CREATE TABLE ref_table (id int, sum int, avg float);

SELECT create_distributed_table('single_hash_repartition_first', 'id');
SELECT create_distributed_table('single_hash_repartition_second', 'id');
SELECT create_reference_table('ref_table');

INSERT INTO single_hash_repartition_first SELECT (i * random())::int % 10000, (i * random())::int % 10000, 
                                ((i * random())::int % 10000) * random() FROM generate_series(0, 1000000) i;

INSERT INTO single_hash_repartition_second SELECT (i * random())::int % 10000, (i * random())::int % 10000, 
                                ((i * random())::int % 10000) * random() FROM generate_series(0, 1000000) i;

INSERT INTO ref_table SELECT (i * random())::int % 10000, (i * random())::int % 10000, 
                                ((i * random())::int % 10000) * random() FROM generate_series(0, 1000000) i;

-- a sample router query with NOT MATERIALIZED
-- which pushes down the filters to the CTE

EXPLAIN ANALYZE WITH cte1 AS NOT MATERIALIZED
(
    SELECT 
        id
    FROM
        single_hash_repartition_first t1
)
SELECT
    count(*)
FROM
    cte1, single_hash_repartition_second
WHERE
    cte1.id = single_hash_repartition_second.id AND single_hash_repartition_second.id = 45;

-- same query without NOT MATERIALIZED, which is already the default
-- which pushes down the filters to the CTE

EXPLAIN ANALYZE WITH cte1 AS
(
    SELECT 
        id
    FROM
        single_hash_repartition_first t1
)
SELECT
    count(*)
FROM
    cte1, single_hash_repartition_second
WHERE
    cte1.id = single_hash_repartition_second.id AND single_hash_repartition_second.id = 45;

-- same query, with MATERIALIZED
-- which prevents pushing down the filters to the CTE
-- and hence becomes a real-time query

EXPLAIN ANALYZE WITH cte1 AS MATERIALIZED
(
    SELECT 
        id
    FROM
        single_hash_repartition_first t1
)
SELECT
    count(*)
FROM
    cte1, single_hash_repartition_second
WHERE
    cte1.id = single_hash_repartition_second.id AND single_hash_repartition_second.id = 45;

-- similar query, with MATERIALIZED
-- now manually add the same filter in the CTE
-- and hence it becomes a router query again
EXPLAIN ANALYZE WITH cte1 AS MATERIALIZED
(
    SELECT 
        id
    FROM
        single_hash_repartition_first t1
    WHERE 
        id = 45
)
SELECT
    count(*)
FROM
    cte1, single_hash_repartition_second
WHERE
    cte1.id = single_hash_repartition_second.id AND single_hash_repartition_second.id = 45;

-- now, run a real-time query with/without MATERIALIZED
-- these are sanity checks, because all of the CTEs are recursively
-- planned and there is no benefit that Citus can get
EXPLAIN ANALYZE WITH cte1 AS MATERIALIZED
(
    SELECT 
        id
    FROM
        single_hash_repartition_first t1
    WHERE 
        sum = 45
)
SELECT
    count(*)
FROM
    cte1, single_hash_repartition_second
WHERE
    cte1.id = single_hash_repartition_second.id AND single_hash_repartition_second.sum = 45;

EXPLAIN ANALYZE WITH cte1 AS NOT MATERIALIZED
(
    SELECT 
        id
    FROM
        single_hash_repartition_first t1
    WHERE 
        sum = 45
)
SELECT
    count(*)
FROM
    cte1, single_hash_repartition_second
WHERE
    cte1.id = single_hash_repartition_second.id AND single_hash_repartition_second.sum = 45;

4) Partitioned tables can now be referenced as foreign keys: can we just add regression tests to make sure that everything works fine?

CREATE TABLE collections_list (
    key bigint,
    ts timestamptz,
    collection_id integer,
    value numeric,
    PRIMARY KEY(key, collection_id)
) PARTITION BY LIST (collection_id);

CREATE TABLE collections_list_0 
    PARTITION OF collections_list (key, ts, collection_id, value)
    FOR VALUES IN ( 0 );

CREATE TABLE collection_users
    (user_id integer, collection_id integer, key bigint);

ALTER TABLE collection_users
    ADD CONSTRAINT collection_users_fkey FOREIGN KEY (key, collection_id) REFERENCES collections_list (key, collection_id);

-- sanity check for postgres
INSERT INTO collections_list VALUES (1,now(), 0, '1.1');
INSERT INTO collection_users VALUES (1, 0, 1);

-- should fail because of fkey
INSERT INTO collection_users VALUES (1, 1000, 1);

SELECT create_distributed_table('collections_list', 'key');
SELECT create_distributed_table('collection_users', 'key');

-- some more tests to ensure fkeys exists

5) COMMIT AND CHAIN and ROLLBACK AND CHAIN: I think we should consider adding the following or similar regression tests

CREATE TABLE test (x INTEGER, y int);
INSERT INTO test (x,y) SELECT i,i from generate_series(1, 1000) i;

SELECT create_distributed_table('test', 'x');

-- single shard queries with CHAIN
BEGIN;
UPDATE test SET y = 15 WHERE x = 1;
COMMIT AND CHAIN;
SELECT * FROM test WHERE x = 1;
COMMIT;

BEGIN;
UPDATE test SET y = 20 WHERE x = 1;
ROLLBACK AND CHAIN;
SELECT * FROM test WHERE x = 1;
COMMIT;

-- multi shard queries with CHAIN
BEGIN;
UPDATE test SET y = 25;
COMMIT AND CHAIN;
SELECT DISTINCT y FROM test;
COMMIT;

BEGIN;
UPDATE test SET y = 30;
ROLLBACK AND CHAIN;
SELECT DISTINCT y FROM test;
COMMIT;

6) Allow RECORD and RECORD[] to be used as column types in a query's column definition list for a table function that is declared to return RECORD: Well, we error out, but I don't think this is easy to fix.

select * from test where row(y) = ANY( ARRAY[row(3)]);
ERROR:  input of anonymous composite types is not implemented
CONTEXT:  while executing command on 10.192.0.200:5432

7) Add CREATE ACCESS METHOD command to create new table types

We error out in CopyLocalDataIntoShards(). I'd prefer to have an explicit check in create_distributed_table(); right now we're relying on an unrelated function to error out, as you've noted above.
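A minimal reproducer for this case, assuming the blackhole_am test extension mentioned earlier is available (table and extension names illustrative):

```sql
-- Sketch assuming blackhole_am (a no-op table access method used for
-- testing pluggable storage) is installed as an extension
CREATE EXTENSION blackhole_am;
CREATE TABLE blackhole_tab (x int) USING blackhole_am;
-- ideally this errors out immediately with a clear message,
-- instead of failing later in CopyLocalDataIntoShards()
SELECT create_distributed_table('blackhole_tab', 'x');
```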

8) Add support function capability to improve optimizer estimates, inlining, and indexing for functions: I couldn't find any examples. But it'd be nice to have one example where we use such functions in both the coordinator plan and the worker plan, just as a sanity check that nothing breaks.
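For reference, the SQL-level shape of the new pg12 SUPPORT clause looks roughly like this (a sketch; the function names are hypothetical, and the support function itself must be implemented in C):

```sql
-- Hypothetical sketch of pg12's SUPPORT clause; my_func_support must be
-- a C-language function taking and returning internal, e.g. in an extension
CREATE FUNCTION my_func_support(internal) RETURNS internal
    AS 'MODULE_PATHNAME', 'my_func_support' LANGUAGE C STRICT;

CREATE FUNCTION my_func(int) RETURNS int
    LANGUAGE sql
    SUPPORT my_func_support
    AS 'SELECT $1 * 2';
```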

9) Allow pg_stat_statements_reset() to be more granular

Once merged to citus-enterprise, make sure that it doesn't break citus_stat_statements.
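For context, pg12's granular reset form (a sketch; the queryid value is hypothetical and would normally be taken from pg_stat_statements):

```sql
-- pg12 adds optional userid, dbid and queryid arguments (0 means "all")
SELECT queryid FROM pg_stat_statements LIMIT 1;  -- pick a query to reset
SELECT pg_stat_statements_reset(0, 0, 42);       -- 42 is a hypothetical queryid
```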

10) GENERATED ALWAYS AS [IMPORTANT]: https://github.com/citusdata/citus/pull/2844/files#r314716706

Can we error out in create_distributed_table()? It is not nice UX to surface something really weird coming from the worker: "cannot use column reference in DEFAULT expression".
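A minimal reproducer for this case (table name illustrative):

```sql
-- pg12 generated column; today create_distributed_table surfaces the
-- worker-side "cannot use column reference in DEFAULT expression" error
CREATE TABLE gen_test (a int, b int GENERATED ALWAYS AS (a * 2) STORED);
SELECT create_distributed_table('gen_test', 'a');
```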

serprex commented 5 years ago
  1. Addressed
  2. I don't think REINDEX is being propagated at all, will look into
  3. Since the default is unchanged I don't think this is an issue. People can also change options like the default table access method to a non-heap AM & then be unable to distribute tables created with the default access method
  4. Added
  5. Added
  6. Added, READ ONLY test might be too flimsy
  7. Error message seems clear enough at least
  8. Addressed with error message. We should be able to support by going through tableam api, but that gets into murky territory with third party code & how deep the assumptions may run
  9. No comment
  10. Don't understand
  11. Confused about link, but I've updated create_distributed_table to error out sooner with a clear message

tl;dr outstanding: 1,8,9 (2 & 6 I believe to be non-issues)

onderkalaci commented 5 years ago
> I don't think REINDEX is being propagated at all, will look into

We should do that, users are excited about REINDEX, so this is probably one of the first things that people try out. As usual, I'm OK with erroring out for now, and implement it later.

> SET plan_cache_mode TO 'force_generic_plan';

I think differently here. Citus knows how to handle both generic and custom plans, so we shouldn't be erroring out. Here are some pointers in the code for you to read about these concepts: 1, 2

I think it is OK to leave this for now, but before 9.0, I think we should support that.

> Optimizer support functions

I can look into that, I'm curious to learn about them anyway.

> Don't understand

On Citus Enterprise, there is a view called citus_stat_statements, which relies on pg_stat_statements. Given that there are some changes to pg_stat_statements_reset(), we should also ensure that citus_stat_statements is not broken with pg 12. This could be as simple as adding a new regression test on citus-enterprise once the changes are merged to citus. My initial understanding is that there is nothing dangerous about the changes, but that should still be verified.

serprex commented 5 years ago

Closing this issue. 8/9 still lack their own issues, but they no longer need a centralized tracking issue