yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com
Other
8.95k stars 1.07k forks source link

[YSQL] Wrong error messages format being published for serialization error #17258

Open Karvy-yb opened 1 year ago

Karvy-yb commented 1 year ago

Jira Link: DB-6495

Description

Steps to repro:

  1. Create universe in version 2.18.0.0-b61 with enable_wait_queues and enable_deadlock_detection flag set to true.
  2. Create table: CREATE TABLE tb (k int primary key, v int);
  3. Insert some data to table: INSERT INTO tb VALUES (1,1), (2,2), (3,3);
  4. Transaction-1,2,3,4: BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ
  5. Transaction-1,2,3,4: SELECT * FROM tb WHERE k=1; (Dummy command)
  6. Transaction-1: UPDATE tb SET v=22 WHERE k=1;
  7. Transaction-2: SELECT * FROM tb WHERE k=1 FOR NO KEY UPDATE;
  8. Transaction-3: SELECT * FROM tb WHERE k=1 FOR KEY SHARE;
  9. Transaction-4: SELECT * FROM tb WHERE k=1 FOR UPDATE;
  10. Transaction-1: COMMIT;

Error thrown: ERROR: could not serialize access due to concurrent update Expected error message format:

ERROR: could not serialize access due to concurrent update
DETAIL:  e1f6c6f1-3f2a-400d-bc98-feb2228998d3 conflicts with committed transaction: 5e103c08-5716-4b91-88d0-7c8083a0a136

Issue is observed 2.18 onwards and doesn't exist for branches 2.14 and 2.16.

Warning: Please confirm that this issue does not contain any sensitive information

tvesely commented 11 months ago

The DETAIL field is not filled in because the error message is not generated inside the tserver, but is rather generated from within Postgres.

yugabyte=# SELECT * FROM tb WHERE k=1 FOR NO KEY UPDATE;
ERROR:  could not serialize access due to concurrent update
/home/dreddor/code/yugabyte-db/build/debug-clang16-dynamic-ninja/../../src/yb/yql/pggate/util/ybc_util.cc:346:     @     0x7f1e22027e5b  YBCGetStackTrace
/home/dreddor/code/yugabyte-db/src/postgres/src/backend/utils/error/../../../../../../../src/postgres/src/backend/utils/error/elog.c:1112:     @     0x55e8fb49562d  errmsg
/home/dreddor/code/yugabyte-db/src/postgres/src/backend/executor/../../../../../../src/postgres/src/backend/executor/nodeLockRows.c:269:     @     0x55e8fb04c2c2  ExecLockRows
/home/dreddor/code/yugabyte-db/src/postgres/src/backend/executor/../../../../../../src/postgres/src/backend/executor/execProcnode.c:459:     @     0x55e8fb01cf01  ExecProcNodeFirst
../../../../../../src/postgres/src/include/executor/executor.h:272:                                     @     0x55e8fb015df1  ExecProcNode
/home/dreddor/code/yugabyte-db/src/postgres/src/backend/executor/../../../../../../src/postgres/src/backend/executor/execMain.c:1726:     @     0x55e8fb010620  ExecutePlan
/home/dreddor/code/yugabyte-db/src/postgres/src/backend/executor/../../../../../../src/postgres/src/backend/executor/execMain.c:357:     @     0x55e8fb0104d0  standard_ExecutorRun
/home/dreddor/code/yugabyte-db/src/postgres/contrib/pg_stat_statements/../../../../../src/postgres/contrib/pg_stat_statements/pg_stat_statements.c:1296:     @     0x7f1e1a3fbbdd  pgss_ExecutorRun
/home/dreddor/code/yugabyte-db/src/postgres/contrib/yb_pg_metrics/../../../../../src/postgres/contrib/yb_pg_metrics/yb_pg_metrics.c:508:     @     0x7f1e1a3b7e86  ybpgm_ExecutorRun
/home/dreddor/code/yugabyte-db/src/postgres/src/backend/executor/../../../../../../src/postgres/src/backend/executor/execMain.c:295:     @     0x55e8fb01021a  ExecutorRun
/home/dreddor/code/yugabyte-db/src/postgres/src/backend/tcop/../../../../../../src/postgres/src/backend/tcop/pquery.c:960:     @     0x55e8fb2a2cb4  PortalRunSelect
/home/dreddor/code/yugabyte-db/src/postgres/src/backend/tcop/../../../../../../src/postgres/src/backend/tcop/pquery.c:789:     @     0x55e8fb2a26b8  PortalRun
/home/dreddor/code/yugabyte-db/src/postgres/src/backend/tcop/../../../../../../src/postgres/src/backend/tcop/postgres.c:1220:     @     0x55e8fb29fd52  exec_simple_query
/home/dreddor/code/yugabyte-db/src/postgres/src/backend/tcop/../../../../../../src/postgres/src/backend/tcop/postgres.c:4767:     @     0x55e8fb29d814  yb_exec_simple_query_impl
/home/dreddor/code/yugabyte-db/src/postgres/src/backend/tcop/../../../../../../src/postgres/src/backend/tcop/postgres.c:4735:     @     0x55e8fb29d93e  yb_exec_query_wrapper_one_attempt
/home/dreddor/code/yugabyte-db/src/postgres/src/backend/tcop/../../../../../../src/postgres/src/backend/tcop/postgres.c:4759:     @     0x55e8fb29d7e9  yb_exec_query_wrapper
/home/dreddor/code/yugabyte-db/src/postgres/src/backend/tcop/../../../../../../src/postgres/src/backend/tcop/postgres.c:4782:     @     0x55e8fb298ce4  yb_exec_simple_query
/home/dreddor/code/yugabyte-db/src/postgres/src/backend/tcop/../../../../../../src/postgres/src/backend/tcop/postgres.c:5407:     @     0x55e8fb2977e9  PostgresMain
/home/dreddor/code/yugabyte-db/src/postgres/src/backend/postmaster/../../../../../../src/postgres/src/backend/postmaster/postmaster.c:4678:     @     0x55e8fb1bfd48  BackendRun
/home/dreddor/code/yugabyte-db/src/postgres/src/backend/postmaster/../../../../../../src/postgres/src/backend/postmaster/postmaster.c:4316:     @     0x55e8fb1becf6  BackendStartup
/home/dreddor/code/yugabyte-db/src/postgres/src/backend/postmaster/../../../../../../src/postgres/src/backend/postmaster/postmaster.c:1775:     @     0x55e8fb1bd7ce  ServerLoop
/home/dreddor/code/yugabyte-db/src/postgres/src/backend/postmaster/../../../../../../src/postgres/src/backend/postmaster/postmaster.c:1431:     @     0x55e8fb1ba49a  PostmasterMain
/home/dreddor/code/yugabyte-db/src/postgres/src/backend/main/../../../../../../src/postgres/src/backend/main/main.c:234:     @     0x55e8fb0b3507  PostgresServerProcessMain
    @     0x55e8fb0b3ac1 
../csu/libc-start.c:308:                                                                                @     0x7f1e21bd3082  __libc_start_main
    @     0x55e8fac978ad 

This error message is identical to the one that Postgres would have returned. Is the DETAIL actually needed in these cases?

Karvy-yb commented 11 months ago

This breaks the uniformity of error messages that we are reporting. Also, if that's the case then we won't be able to validate the detail part of error messages in automation as @pkj415 had requested. cc: @m-iancu

Arjun-yb commented 1 month ago

@m-iancu

  1. It is repo-able in master(2.23.1.0-b74) with error message ERROR: could not serialize access due to concurrent update (query layer retry isn't possible because data was already sent, if this is the read committed isolation (or) the first statement in repeatable read/ serializable isolation transaction, consider increasing the tserver gflag ysql_output_buffer_size)
  2. Steps are as described above in the description of the issue.
  3. Test: TestYsqlRLGPWithTransactionTablesAndWaitQueues -- https://bitbucket.org/yugabyte/internal-services/src/6d1afa0cf788db1ded6ea9f6435c2c1cc755ba4c/itest/src/universe_tests/system_tests/test_ysql_row_level_geopartition.py#lines-245
  4. it is deterministic failure.
  5. is it the regression due to any of the fixes/code change ? cc: @kripasreenivasan