citusdata / pg_shard

ATTENTION: pg_shard is superseded by Citus, its more powerful replacement
https://github.com/citusdata/citus
GNU Lesser General Public License v3.0
1.06k stars 63 forks source link

Support composite types for partition columns #79

Closed jasonmp85 closed 9 years ago

jasonmp85 commented 9 years ago

pg_shard does not automatically create custom types on worker nodes, but if they already exist it can use them in table definitions and queries. Unless the user tries to use them as the partition column. In that case this happens:

CREATE TYPE car_kind AS (make text, model text);
# CREATE TYPE
# Time: 6.044 ms

CREATE TABLE car_preferences (
    person_id integer,
    preference car_kind
);
# CREATE TABLE
# Time: 2.599 ms

SELECT master_create_distributed_table('car_preferences', 'preference');
# ERROR:  42704: data type car_kind has no default operator class for specified partition method
# DETAIL:  Partition column types must have a default operator class defined.
# LOCATION:  SupportFunctionForColumn, create_shards.c:599
# Time: 0.766 ms

Perhaps composite types can be special cased to a hash function of our choosing rather than relying on the default operator class.

onderkalaci commented 9 years ago

@jasonmp85 I did a simple experiment, and executed following after creating the type

SELECT hash_array (ARRAY[('merceders-benz','1979')::car_kind]);

And, the result is

ERROR:  could not identify a hash function for type car_kind

So, I think what we do is correct in terms of finding the hash function. Also, when I check and trace the code from this line, I see that the column has to have a Hash Support Function so that we can hash its value. And what I understand , it is not possible to define Support Function without defining a default opclass. Check section 35.14.3

Our problem is more related to types which already have a Hash Support Function and we cannot insert into it. For example, consider the following type:

-- create a custom type...
CREATE TYPE test_type AS (
    i integer,
    i2 integer
);

-- ... as well as a function to use as its comparator...
CREATE FUNCTION test_type_function(test_type, test_type) RETURNS boolean
AS 'SELECT TRUE;'
LANGUAGE SQL
IMMUTABLE
RETURNS NULL ON NULL INPUT;

-- ... use that function to create a custom operator...
CREATE OPERATOR = (
    LEFTARG = test_type,
    RIGHTARG = test_type,
    PROCEDURE = test_type_function
);

-- ... and create a custom operator family for hash indexes...
CREATE OPERATOR FAMILY test_op_family_3 USING hash;

CREATE FUNCTION DUMMY_HASH(test_type)
RETURNS int LANGUAGE internal AS 'hashtext' IMMUTABLE;

-- ... finally, build an operator class, designate it as the default operator
-- class for the type, but only specify an equality operator. So the type will
-- have a default op class but no hash operator in that class.
CREATE OPERATOR CLASS test_op_family_3_class
DEFAULT FOR TYPE test_type USING hash FAMILY test_op_family_3 AS
OPERATOR 1 =,
FUNCTION 1 DUMMY_HASH(test_type);

So, lets check hash function:

postgres=# SELECT hash_array (ARRAY[(1,2)::test_type]);
 hash_array 
------------
 -723094115
(1 row)

We saw that it has hash function, now create a table distribute it with pg_shard on that column:

postgres=# create table test_table (c1 test_type);
CREATE TABLE
postgres=#  SELECT master_create_distributed_table('test_table', 'c1');
 master_create_distributed_table 
---------------------------------

(1 row)
postgres=# SELECT master_create_worker_shards('test_table', 16, 2);
 master_create_worker_shards 
----------------------------- 
(1 row)

The previous step also works fine, all checks are passed. Now, lets try to insert into the table:

postgres=# insert into test_table VALUES ((1, 2)::test_type);
ERROR:  cannot plan sharded modification containing values which are not constants or constant expressions

I think the main problem is that we need to solve this error. What do you think?

jasonmp85 commented 9 years ago

Forgot to close this as part of last release.