tembo-io / pgmq

A lightweight message queue. Like AWS SQS and RSMQ but on Postgres.
PostgreSQL License
2.65k stars 72 forks source link

Use `generated always as identity` instead of bigserial? #100

Closed craigpastro closed 11 months ago

craigpastro commented 1 year ago

Here, and in other places, bigserial is used: https://github.com/tembo-io/pgmq/blob/f8ae3e807c5bace58aff7da723fe3bd536aa28db/core/src/query.rs#L41

However, I think it is recommended to use generated always as identity over serial. See https://wiki.postgresql.org/wiki/Don't_Do_This#Don.27t_use_serial for some reasons why.

ddorian commented 1 year ago

I would like to leave the possibility of sending a client-generated id.

1 case is a distributed db like YugabyteDB, where sequence generation is more taxing and not as ordered as a single server (each server takes it's own cache).

2nd case is when using something like pgedge, you have 2+ clusters that replicate between them asynchronously.

In both those cases I would like a way to insert a custom id, and that id will probably just be "snowflake twitter id" which is 64bit and time-sorted, so should have the same functionality underneath.

ddorian commented 1 year ago

Another case (might) be logical replication where sequence-data is not replicated https://www.postgresql.org/docs/current/logical-replication-restrictions.html.

ChuckHend commented 1 year ago

Note: metrics use the sequences nextval to count number of messages that have hit the queue. We'll need to find a different way to get message count -- also, there is an issue with using nextval anyway, because if the .send() is in a transaction and its rolled back, the sequence will still increment. So in those cases, nextval is only an approximation.

ChuckHend commented 12 months ago

@craigpastro , we might be able to use generated as identity without impact to metrics. We still get an auto incrementing sequence created.

pgmq=# create table test (msg_id BIGINT GENERATED ALWAYS AS IDENTITY, name text);
CREATE TABLE
pgmq=# select * from test_msg_id_seq;
 last_value | log_cnt | is_called 
------------+---------+-----------
          1 |       0 | f
pgmq=# insert into test (name) values ('craig');
INSERT 0 1
pgmq=# select * from test_msg_id_seq;
 last_value | log_cnt | is_called 
------------+---------+-----------
          1 |      32 | t
(1 row)
ChuckHend commented 11 months ago

@ddorian - how would you feel about expanding on the queue types that can be created in order to support client-generated message IDs? For example, we have pgmq.create(), pgmq.create_unlogged(), pgmq.create_partitioned()....we could add pgmq.create() with a flag for client-generated IDs, or a pgmq.create_manual_assigned() or something that makes sense.

ddorian commented 11 months ago

I'm sorry but I don't have free time ATM.