iCyberon / pg_hashids

Short unique id generator for PostgreSQL, using hashids
MIT License
287 stars 25 forks source link

strange behaviour of "UPDATE" with id_encode() #33

Closed dataselfservice closed 1 year ago

dataselfservice commented 1 year ago

Hi,

(NOTE this is also posted here https://github.com/docker-library/postgres/issues/1122#issue-1896632578)

I'm running a kubernetes instance of 15.4-alpine3.18, with modified image IMG_POSTGRES_TAG added with pg_hashids via following Dockerfile:

FROM postgres:15.4-alpine3.18                                                                                                                                                                                                                 

RUN apk add --no-cache --virtual .build-deps build-base postgresql-dev clang15 llvm15 ; \
    wget -qO- https://github.com/iCyberon/pg_hashids/archive/refs/tags/v1.2.1.tar.gz | tar xzf - -C /tmp && \
    make -C /tmp/pg_hashids-1.2.1 && \
    make -C /tmp/pg_hashids-1.2.1 install && \
    rm -rf /tmp/pg_hashids-1.2.1 && \
    apk del .build-deps

Deployed to the cluster with:

helm install postgres --set image.repository=postgres,image.tag=${IMG_POSTGRES_TAG},primary.service.clusterIP=${POSTGRES_CLUSTER_IP},primary.persistence.existingClaim=postgres oci://registry-1.docker.io/bitnamicharts/postgresql

The issue is that a trigger function which generates hashes during insert, miss-behaves and generates a wrong hash ending with a space.

Reproduce it by running the following SQL:

BEGIN;
DROP SCHEMA IF EXISTS test CASCADE;
CREATE SCHEMA test;
CREATE TABLE test.test (
    id bigint NOT NULL,
    data text,
    ids text,
    ids1 text,
    ids2 text
);
CREATE SEQUENCE test.test_id_seq
    START WITH 1
    INCREMENT BY 1
    NO MINVALUE
    NO MAXVALUE
    CACHE 1;
ALTER SEQUENCE test.test_id_seq OWNED BY test.test.id;
ALTER TABLE ONLY test.test
    ALTER COLUMN id SET DEFAULT nextval('test.test_id_seq'::regclass);
CREATE FUNCTION test.myfunc ()
    RETURNS TRIGGER
    LANGUAGE plpgsql
    AS $$
BEGIN
    UPDATE
        test.test
    SET
        ids = id_encode (NEW.id, 'hash1', 5, 'abcdefghijklmnopqrstuvwxyz'),
        ids1 = id_encode (NEW.id, 'hash2', 12, 'abcdefghijklmnopqrstuvwxyz'),
        ids2 = id_encode (NEW.id, 'hash3', 12, 'abcdefghijklmnopqrstuvwxyz')
    WHERE
        id = NEW.id;
    RETURN new;
END;
$$;
CREATE FUNCTION test.myfunc_fix ()
    RETURNS TRIGGER
    LANGUAGE plpgsql
    AS $$
BEGIN
    UPDATE
        test.test
    SET
        ids = id_encode (NEW.id, 'hash1', 5, 'abcdefghijklmnopqrstuvwxyz')
    WHERE
        id = NEW.id;
    UPDATE
        test.test
    SET
        ids1 = id_encode (NEW.id, 'hash2', 12, 'abcdefghijklmnopqrstuvwxyz'),
        ids2 = id_encode (NEW.id, 'hash3', 12, 'abcdefghijklmnopqrstuvwxyz')
    WHERE
        id = NEW.id;
    RETURN new;
END;
$$;
CREATE TRIGGER test_myfunc
    AFTER INSERT ON test.test
    FOR EACH ROW
    EXECUTE FUNCTION test.myfunc ();
\copy test.test to stdout CSV FORCE QUOTE *;
INSERT INTO test.test (data)
    VALUES ('a'),
    ('b');
\copy test.test to stdout CSV FORCE QUOTE *;
DROP TRIGGER test_myfunc ON test.test;
CREATE TRIGGER test_myfunc_fix
    AFTER INSERT ON test.test
    FOR EACH ROW
    EXECUTE FUNCTION test.myfunc_fix ();
\copy test.test to stdout CSV FORCE QUOTE *;
INSERT INTO test.test (data)
    VALUES ('a1'),
    ('b1');
\copy test.test to stdout CSV FORCE QUOTE *;
COMMIT

Run it with: cat test.sql | psq -U testdb.

Output looks like:

WARNING:  database "testdb" has no actual collation version, but a version was recorded
BEGIN
NOTICE:  drop cascades to 3 other objects
DETAIL:  drop cascades to table test.test
drop cascades to function test.myfunc()
drop cascades to function test.myfunc_fix()
DROP SCHEMA
CREATE SCHEMA
CREATE TABLE
CREATE SEQUENCE
ALTER SEQUENCE
ALTER TABLE
CREATE FUNCTION
CREATE FUNCTION
CREATE TRIGGER
INSERT 0 2
"1","a","zmnb ","wzejgmprmnpa","yzvmbrydnogd"
"2","b","pylw ","xabdomywmenl","zmbexnawnpaq"
DROP TRIGGER
CREATE TRIGGER
"1","a","zmnb ","wzejgmprmnpa","yzvmbrydnogd"
"2","b","pylw ","xabdomywmenl","zmbexnawnpaq"
INSERT 0 2
"1","a","zmnb ","wzejgmprmnpa","yzvmbrydnogd"
"2","b","pylw ","xabdomywmenl","zmbexnawnpaq"
"3","a1","zmpdm","jlyezmrwvakd","ybjwprgzndxl"
"4","b1","dmepy","ydqlpvjrvxnz","jwmvprxmnkea"
COMMIT

The issue is the space in the end of "zmnb ". Utilizing myfunc_fix() which basically runs separated UPDATE, works i.e. generates hashes without the trailing space. But I am not at all satisfied and I would like to understand what is the root cause?!?!

I noticed the WARNING: database "testdb" has no actual collation version, but a version was recorded and I am not sure how that is realted (I do not really know what is the impact of that.

Any help or ideas?

Cheers, DataSelfService team

dataselfservice commented 1 year ago

pg_hashids was not udpated to latest. This solved the problem:

RUN apk add --no-cache --virtual .build-deps build-base postgresql-dev clang15 llvm15 ; \
    wget -qO- https://github.com/iCyberon/pg_hashids/archive/cd0e1b31d52b394a0df64079406a14a4f7387cd6.tar.gz | tar xzf - -C /tmp && \
    make -C /tmp/pg_hashids-cd0e1b31d52b394a0df64079406a14a4f7387cd6 && \
    make -C /tmp/pg_hashids-cd0e1b31d52b394a0df64079406a14a4f7387cd6 install && \
    rm -rf /tmp/pg_hashids-cd0e1b31d52b394a0df64079406a14a4f7387cd6 && \
    apk del .build-deps