iCyberon / pg_hashids

Short unique id generator for PostgreSQL, using hashids
MIT License

Truncated hash using varchar type under 10.x #15

Closed andrewgleave closed 4 years ago

andrewgleave commented 6 years ago

I'm working with a codebase that is being upgraded to run on Postgres 10.2, and I'm using a trigger to generate a hash based on the newly inserted row's id, e.g.:

CREATE FUNCTION xy_create_hash() RETURNS trigger AS $$
    BEGIN
        NEW.hash := id_encode(NEW.id, 'mysalt', 6);
        RETURN NEW;
    END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER create_xy_hash BEFORE INSERT ON xy FOR EACH ROW EXECUTE PROCEDURE xy_create_hash();

However, the resulting hash is truncated to four valid characters plus a DLE control character (0x10), which is percent-encoded as %10 when the hash is used as part of a URL:

http://127.0.0.1:3000/foo/w8lV%10/bar

Any idea what could be causing this?
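For reference, the `%10` in that URL percent-decodes to the DLE control character (byte 0x10); a quick standalone Python check (not part of the original report) makes the trailing control byte visible:

```python
from urllib.parse import unquote

# Decode the percent-encoded hash segment from the failing URL above.
decoded = unquote("w8lV%10")
print(repr(decoded))   # four printable characters plus one 0x10 control byte
print(len(decoded))    # 5, not the 6 characters the trigger requested
```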

iCyberon commented 6 years ago

@andrewgleave What's the result when you run id_encode(123, 'mysalt', 6) outside trigger?

iCyberon commented 6 years ago

Just tested your trigger.

CREATE TABLE "xy" (
    "id" serial,
    "hash" text,
    "data" text,
    PRIMARY KEY ("id")
);

CREATE FUNCTION xy_create_hash() RETURNS trigger AS $$
    BEGIN
        NEW.hash := id_encode(NEW.id, 'mysalt', 6);
        RETURN NEW;
    END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER create_xy_hash BEFORE INSERT ON xy FOR EACH ROW EXECUTE PROCEDURE xy_create_hash();

-- 
INSERT INTO xy (data) VALUES('test') RETURNING *;

The result is:

 id | hash   | data
----+--------+------
  3 | rgn2gv | test
andrewgleave commented 6 years ago

Sorry, I should have been clearer. There appears to be no problem if I execute an insert via psql; it only happens when the insert is executed from our system (a Go app), which wraps it in a transaction. The issue doesn't seem to affect Postgres 9.x and only showed up when migrating to 10.2.

It is entirely possible this is nothing to do with this code, but I thought I'd ask in case.

Thanks for your help

andrewgleave commented 6 years ago

Yeah, there is something weird going on, though I'm not sure what or where in the stack. For n == 1 I get a truncated hash; for n > 1 I get a valid 6-char hash. So the initial call is truncated and all subsequent ones are OK.

I don't see how it could be related to this lib, since even the truncated hash seems to get generated. Whether it's an internal pg bug I'm not sure.

package main

import (
    "database/sql"
    "fmt"
    "log"

    _ "github.com/lib/pq" //import pq
)

// create table xy
// (
//   id serial not null,
//   hash character varying(10),
//   constraint xy_pkey primary key (id)
// )
// with (
//   oids=false
// );
//
// CREATE or REPLACE FUNCTION xy_create_hash() RETURNS trigger AS $$
//     BEGIN
//         NEW.hash := id_encode(NEW.id, 'mysalt', 6);
//         RETURN NEW;
//     END;
// $$ LANGUAGE plpgsql;
//
// CREATE TRIGGER create_xy_hash BEFORE INSERT ON xy FOR EACH ROW EXECUTE PROCEDURE xy_create_hash();

func main() {
    con := fmt.Sprintf("user=%s password=%s dbname=%s port=%d sslmode=disable",
        "hash-test", "hash-test", "hash-test", 5432)
    db, err := sql.Open("postgres", con)
    if err != nil {
        log.Fatal(err)
    }

    n := 1 // 1 shows the issue, > 1 is ok
    for i := 0; i < n; i++ {
        var id int64
        var hash string
        // return the generated hash as well, so the truncation is visible directly
        r := db.QueryRow("insert into xy (hash) values ('') returning id, hash;")
        err := r.Scan(&id, &hash)
        if err != nil {
            log.Fatalf("scan fail. err: %v", err)
        }
        log.Printf("id=%d hash=%q", id, hash)
    }
}
andrewgleave commented 6 years ago

Seems it is somewhat related to pg_hashids. If I replace the call to id_encode with some other function (anything else seems fine) the function executes and returns as expected.

For example, replacing id_encode(NEW.id, 'mysalt', 6); with concat('a', 'b', 'cde'); in the trigger.

If I can be of help let me know. I have no experience writing postgres extensions so I'm at a loss as to what I can do next (other than downgrade pg to 9.x).

iCyberon commented 6 years ago

I've never fully tested with 10.X, installing 10.2 now and will get back to you shortly.

iCyberon commented 6 years ago

@andrewgleave Just to confirm, does it work on 10.X when using psql?

andrewgleave commented 6 years ago

No, not always. It appeared to be fine via psql, but I've just tried it again (after leaving the server idle for a while) and it just failed. If I then retry on the same connection, it works (as seems to be the case when used from Go).

iCyberon commented 6 years ago

@andrewgleave can you please change type of hash to text instead of character varying(10)?
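For anyone hitting the same problem, the column type can be changed in place (assuming the table and column names from the example above):

```sql
ALTER TABLE xy ALTER COLUMN hash TYPE text;
```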

iCyberon commented 6 years ago

Btw, I would recommend using text, as there's no performance difference between varchar and text; both of them use varlena under the hood.

There's a great article about char, varchar, and text here: https://www.depesz.com/2010/03/02/charx-vs-varcharx-vs-varchar-vs-text

andrewgleave commented 6 years ago

Ok great. Will try later.

Thanks for the links


andrewgleave commented 6 years ago

Ok, finally got around to updating the test case.

Setting the column to text appears to fix it in as far as I can see. I’ll roll out a migration to the system on Monday and then report back if the issue remains.

Any idea why this bug would occur for varchar types and not text?


iCyberon commented 6 years ago

Honestly, I have no idea why it happens. The extension returns text which should be converted to varchar when assigned to hash. It works correctly under 9.x, which means something got changed in 10.x. I'll keep the issue open until I find the reason.
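One way to exercise that same text-to-varchar assignment path by hand (a manual check, not something from the thread; the salt and length match the earlier trigger example):

```sql
-- id_encode returns text; the explicit cast mirrors the assignment
-- the trigger performs when hash is declared varchar(10)
SELECT id_encode(1, 'mysalt', 6)::varchar(10);
```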

andrewgleave commented 6 years ago

Can confirm altering the column type to text fixes the issue.

Thanks for your help

deadghost commented 6 years ago

Hit the same issue on 10.4. text also resolved the issue.

iCyberon commented 6 years ago

Ah, I need to find the underlying issue. Meanwhile I'll update the readme until it's resolved.

EvanCarroll commented 5 years ago

Has anyone reduced this to a simple test case? It helps if you provide some basic DDL to create the table, show what you're doing to trigger the problem, and what you expect to happen. Having to reverse-engineer the problem from the trigger kind of sucks.

EvanCarroll commented 5 years ago

I can't validate or reproduce this on Pg 10.6. Can you create a script that we can run that shows us whether or not the problem exists and can we close out this issue pending more information?

waljohn commented 5 years ago

OK, here is a way that has so far reproduced the issue on demand.

It also occurs with Postgres 11.3. What's super interesting is that the "corruption" takes place once per connection, not once per transaction.

CREATE TABLE if not exists load_test (
    dtcreated timestamp NOT NULL,
    pk1 bigint generated always as identity, 
         -- ^ therefore, we can easily determine the insertion order.
    hash_id varchar NULL,
    CONSTRAINT load_test_pk1 PRIMARY KEY (pk1)
);

CREATE OR REPLACE FUNCTION load_test_hashid_bit()
    RETURNS trigger
    LANGUAGE plpgsql
AS $$
BEGIN
    new.hash_id := id_encode( extract(epoch from new.dtcreated)::bigint );
    return new;
END;
$$;

create trigger bit_hashid before insert on
    load_test for each row execute procedure load_test_hashid_bit();

insert into load_test (dtcreated) (
    select * from generate_series('2019-01-01 00:00'::timestamp,
    '2019-05-31 12:00', '1 hour')
   );

Result:

      dtcreated      | pk1 | hash_id
---------------------+-----+----------
 2019-01-01 00:00:00 |   1 | gxD9\x10
 2019-01-01 01:00:00 |   2 | gxD9Nvl
 2019-01-01 02:00:00 |   3 | gxD99mj
 2019-01-01 03:00:00 |   4 | gxD9q4G
 2019-01-01 04:00:00 |   5 | gxD9jE6
 2019-01-01 05:00:00 |   6 | gxD9oyZ
 2019-01-01 06:00:00 |   7 | gxD937r
 2019-01-01 07:00:00 |   8 | gxD9zZ3
 2019-01-01 08:00:00 |   9 | gxD9mB9
 2019-01-01 09:00:00 |  10 | gxD9wlD
 2019-01-01 10:00:00 |  11 | gxD9Aok
 2019-01-01 11:00:00 |  12 | gxD9v9Y
 2019-01-01 12:00:00 |  13 | gxD91yl
 2019-01-01 13:00:00 |  14 | gxDWJQj
 2019-01-01 14:00:00 |  15 | gxDWLpG
 2019-01-01 15:00:00 |  16 | gxDWZ16
 2019-01-01 16:00:00 |  17 | gxDWnwZ
 2019-01-01 17:00:00 |  18 | gxDWkor
 2019-01-01 18:00:00 |  19 | gxDW793
 2019-01-01 19:00:00 |  20 | gxDW6K9
 2019-01-01 20:00:00 |  21 | gxDWM6D
 2019-01-01 21:00:00 |  22 | gxDWDnk
 2019-01-01 22:00:00 |  23 | gxDWKYY
 2019-01-01 23:00:00 |  24 | gxDWVwl
 2019-01-02 00:00:00 |  25 | gxDWP0j

No matter how many more rows you insert, only the first insertion is messed up.

However, if you disconnect and reconnect...

insert into load_test (dtcreated) (
    select * from generate_series('2019-02-01 00:00'::timestamp,
    '2019-02-02 00:00', '1 hours')
   );

(disconnect/reconnect)

insert into load_test (dtcreated) (
    select * from generate_series('2019-03-01 00:00'::timestamp,
    '2019-03-02 00:00', '1 hours')
   );
select * from load_test
where pk1 in (1,2,26,27,51,52)
order by 2;

Result: the 1st row (first connection), the 26th (first row from the 2nd connection), and the 51st (first row from the 3rd connection) are corrupt.

      dtcreated      | pk1 | hash_id
---------------------+-----+----------
 2019-01-01 00:00:00 |   1 | gxD9\x10
 2019-01-01 01:00:00 |   2 | gxD9Nvl
 2019-02-01 00:00:00 |  26 | gxyQ\x10
 2019-02-01 01:00:00 |  27 | gxyQBK9
 2019-03-01 00:00:00 |  51 | gxyr\x10
 2019-03-01 01:00:00 |  52 | gxyrDYD

If you substitute beginning and committing transactions for disconnecting and reconnecting, the issue does not occur three times, only the first time.

What's so puzzling about this is that the trigger is for each row not for each statement. Therefore, this should be completely impossible and it makes me wonder if there's a lower-level issue here.
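A corrupted value can be spotted mechanically: hashids output uses only printable alphanumeric characters, so any control byte marks a damaged row. A small Python sketch of that check (the sample values are taken from the result tables above):

```python
def is_corrupt(hash_id: str) -> bool:
    """Return True if the hash contains a control character (ASCII < 0x20)."""
    return any(ord(c) < 0x20 for c in hash_id)

print(is_corrupt("gxD9\x10"))  # first-row value from the table above: corrupted
print(is_corrupt("gxD9Nvl"))   # second-row value: clean
```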

Should not be relevant, but FWIW:

 select * from version();
 PostgreSQL 11.3 on armv7l-unknown-linux-gnueabihf, compiled by gcc (Raspbian 6.3.0-18+rpi1+deb9u1) 6.3.0 20170516, 32-bit

and

select * from pg_extension
where extname='pg_hashids';
-[ RECORD 1 ]--+-----------
extname        | pg_hashids
extowner       | 10
extnamespace   | 2200
extrelocatable | t
extversion     | 1.2.1
extconfig      | ω
extcondition   | ω

All commands were executed directly in psql.

JaredReisinger commented 5 years ago

@iCyberon, I'm pretty sure that #20 was the cause of this one, too, just based on the description.