yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com
Other
8.64k stars 1.04k forks source link

[YSQL] duplicate key value violation on identity key column #22935

Closed markpyb closed 1 day ago

markpyb commented 1 week ago

Jira Link: DB-11851

Description

When dropping a database with a sequence in it with ysql_sequence_cache_method=server, it is possible that the sequence cache is not invalidated, when the database is re-created with the application object (with a sequence) it leads to a situation where subsequent inserts can error with duplicate key value violation, on the identity key column

this seems present in 2024.1 and also 20.4

I observed it with a more complex test case , to observe the duplicate key value violation

create an app database with a identity key table and a secondary index on it, using server cache mode
run an pgbench with 120 parallel, each sqlfile is insert returning id, update where id
drop app database
re-create app database with the same ddl (apart from im slightly modifying the index each time for performance testing)
sometimes  after like 20 seconds
pgbench: error: client 16 script 0 aborted in command 54 query 0: ERROR:  duplicate key value violates unique constraint ""
pgbench: error: client 100 script 0 aborted in command 54 query 0: ERROR:  duplicate key value violates unique constraint ""
pgbench: error: client 50 script 0 aborted in command 54 query 0: ERROR:  duplicate key value violates unique constraint ""
pgbench: error: client 43 script 0 aborted in command 54 query 0: ERROR:  duplicate key value violates unique constraint ""

but a much simpler repro is possible to observe the behaviour when dropping the database

yugabyted destroy
yugabyted start --tserver_flags=ysql_sequence_cache_method=server
until postgres/bin/pg_isready -h $(hostname) ; do sleep 1 ; done | uniq
ysqlsh -h $(hostname)
\c yugabyte
create database d;
\c d
create sequence s;
select nextval('s');
select nextval('s');
\c yugabyte
drop database d;
create database d;
\c d
create sequence s;
select nextval('s');
yugabyte=# create database d;
CREATE DATABASE
yugabyte=# \c d
You are now connected to database "d" as user "yugabyte".
d=# create sequence s;
CREATE SEQUENCE
d=# select nextval('s');
 nextval
---------
       1
(1 row)

d=# select nextval('s');
 nextval
---------
       2
(1 row)

d=# \c yugabyte
You are now connected to database "yugabyte" as user "yugabyte".
yugabyte=# drop database d;
DROP DATABASE
yugabyte=# create database d;
CREATE DATABASE
yugabyte=# \c d
You are now connected to database "d" as user "yugabyte".
d=# create sequence s;
CREATE SEQUENCE
d=# select nextval('s');
 nextval
---------
       3
(1 row)

Issue Type

kind/bug

Warning: Please confirm that this issue does not contain any sensitive information

andrei-mart commented 1 week ago

It turns out to be a server side cache id conflict. Server uses sequence Oid as a cache id, it is unique in the database, but not in the cluster. But cache entry is for whole node, hence other clients connected to the same node may end up sharing same cache entry, regardless the database they are connected to, if they happen to have same Oid in their respective databases.
Simplest repro, one session connects to freshly created database:

seqtest1=# create sequence foo;
CREATE SEQUENCE
seqtest1=# select oid, relname from pg_class where relkind = 'S';
  oid  | relname 
-------+---------
 16384 | foo
(1 row)

seqtest1=# select nextval('foo');
 nextval 
---------
       1
(1 row)

seqtest1=# select nextval('foo');
 nextval 
---------
       2
(1 row)

seqtest1=# select nextval('foo');
 nextval 
---------
       3
(1 row)

other session, connected to the same node, but other database creates brand new sequence, but it looks used:

seqtest2=# create sequence bar;
CREATE SEQUENCE
seqtest2=# select oid, relname from pg_class where relkind = 'S';
  oid  | relname 
-------+---------
 16384 | bar
(1 row)

seqtest2=# select nextval('bar');
 nextval 
---------
       4
(1 row)

The problem affects clusters with gflag ysql_sequence_cache_method=server. By default ysql_sequence_cache_method=connection.

andrei-mart commented 1 week ago

Related to that, sequence cache may have stale entries from dropped sequences and dropped databases, and they may conflict with active sessions.