Open timothy-e opened 8 months ago
This is not about PL/pgSQL versus SQL, but about whether the function is "pure" or not.
If I mark the PL/PGSQL function IMMUTABLE, similar to the SQL function:
CREATE OR REPLACE FUNCTION ret_true() RETURNS boolean AS $$
BEGIN
    RETURN true;
END;
$$ LANGUAGE plpgsql IMMUTABLE;
                    ^^^^^^^^^
then we get the more optimal behavior of a single flush.
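For comparison, here is a minimal sketch of an SQL-language counterpart, plus a catalog query to confirm the volatility each function was actually created with. The name `ret_true_sql` is hypothetical (the original SQL function is not shown in this comment). `pg_proc.provolatile` is the standard PostgreSQL catalog column, where `i` means immutable, `s` stable, and `v` volatile.

```sql
-- Hypothetical SQL-language counterpart of ret_true(); the name is an assumption.
CREATE OR REPLACE FUNCTION ret_true_sql() RETURNS boolean AS $$
    SELECT true;
$$ LANGUAGE sql IMMUTABLE;

-- Confirm the language and declared volatility of both functions.
SELECT p.proname, l.lanname, p.provolatile
FROM pg_proc p
JOIN pg_language l ON l.oid = p.prolang
WHERE p.proname IN ('ret_true', 'ret_true_sql');
```

If both rows report `i`, any remaining difference in flush counts between the two functions should come from something other than the volatility marking.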
Even with IMMUTABLE, this test case shows a high number of flushes:
DROP TABLE IF EXISTS load_temp;
CREATE TABLE load_temp (value text);
INSERT INTO load_temp SELECT 'abcdef'
FROM generate_series(1,1000) a;
CREATE OR REPLACE FUNCTION is_date(input_string VARCHAR, format_string VARCHAR)
RETURNS BOOLEAN AS $$
BEGIN
    PERFORM TO_DATE(input_string, format_string);
    RETURN TRUE;
EXCEPTION WHEN OTHERS THEN
    RETURN FALSE;
END;
$$ LANGUAGE plpgsql IMMUTABLE;
DROP TABLE IF EXISTS test_load;
EXPLAIN (ANALYZE, DIST) CREATE TABLE test_load AS
SELECT
    public.is_date(value,'YYYYMMDD') col1
FROM load_temp;
shows:
Storage Flush Requests: 1001
but if we change:
public.is_date(value,'YYYYMMDD') col1
to say:
public.is_date('a','YYYYMMDD') col1
then the number of flushes drops to:
Storage Flush Requests: 2
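One possible explanation, offered as an assumption rather than a confirmed diagnosis: the PostgreSQL planner pre-evaluates an IMMUTABLE function when all of its arguments are constants, so the call would be folded into a literal at plan time and never run per row. A sketch for checking this against the tables above, using standard `EXPLAIN` options:

```sql
-- Constant arguments: if the call is folded at plan time, the Output line
-- of the plan shows a literal rather than a call to is_date().
EXPLAIN (VERBOSE, COSTS OFF)
SELECT public.is_date('a', 'YYYYMMDD') AS col1 FROM load_temp;

-- Column argument: the call cannot be folded and is evaluated once per row.
EXPLAIN (VERBOSE, COSTS OFF)
SELECT public.is_date(value, 'YYYYMMDD') AS col1 FROM load_temp;
```

If the first plan shows a literal and the second shows the function call, the drop to 2 flushes reflects the function not running per row at all, and the per-row flush behavior of the PL/pgSQL function itself is unchanged.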
Jira Link: DB-8313
Description
Schema:
Using the plpgsql function, we have 20k writes and 20k flushes.
Using the SQL function, we have 20k write requests and 4 flushes:
Issue Type
kind/enhancement