Tencent / TBase

TBase is an enterprise-level distributed HTAP database. Through a single database cluster to provide users with highly consistent distributed database services and high-performance data warehouse services, a set of integrated enterprise-level solutions is formed.
Other
1.38k stars 262 forks source link

Unable to use immutable function in the partial functional index. #128

Open yazun opened 2 years ago

yazun commented 2 years ago

We found that a functional index with an immutable function does not work with copy API. It works while inserting normally. Marking the function as stable enables the index (at the performance price of course).

The snippet belows demonstrates this:


-- create a base table
drop table if exists test_copy;
create table test_copy (i smallint, t int, primary key(t)) distribute by shard(t);

-- create a lookup table
drop table if exists test_copy_fn_lookup ;
create table test_copy_fn_lookup(id int, val smallint, primary key(id) ) distribute by replication;
-- populate lookup
insert into test_copy_fn_lookup select i,i*2 from generate_series(1,100) i;

-- create a immutable lookup function
CREATE OR REPLACE FUNCTION getmainlookup()
 RETURNS smallint
 LANGUAGE sql
 IMMUTABLE
 -- STABLE <--- it would work with STABLE!
 PARALLEL SAFE
AS $function$
select val from test_copy_fn_lookup where id = 10;
$function$

-- create a fn index based on the above fn
create unique index uq_fn_test_copy on test_copy(t) where i = getmainlookup()

Then copy any data into the table:

echo "1,2" | psql -qX  -c "copy test_copy from stdin with csv;"
WARNING:  pgxc_node_begin gxid 43207 is invalid.
ERROR:  Could not begin transaction on data node:dn1.
CONTEXT:  SQL function "getmainlookup" statement 1

As in the comments, changing the function to STABLE will enable the copy, the error is gone but obviously performance drops. The error is visible in 2.2 and 2.3 as well...

JennyJennyChen commented 2 years ago

TBase(PGXL) does not support Index Predicate, because when CopyFrom data, the DN process is waiting for the arrival of data. At this time, it cannot receive other commands from CN. The DN stack is as follows: image

The existence of Index Predicate causes CN to enter the RelationGetIndexPredicate logic during CopyFrom, and then enter ExecRemoteQuery to send "SELECT val FROM test_copy_fn_lookup WHERE (id = 10) " to DN. CN stack information is as follows: image

However, DN is in the process of waiting to receive copy data and cannot process other Query messages sent by CN.

JennyJennyChen commented 2 years ago

Insert into is supported to insert one piece of data at a time, because the copy from logic is not executed, but when insert into multiple pieces, it will be automatically converted to copy from and the same problem will occur

JennyJennyChen commented 2 years ago

functions in index predicate can not be STABLE: image