yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com
Other
9.05k stars 1.08k forks source link

[YSQL][Upgrade] The catalog snapshot used for this transaction has been invalidated: MISMATCHED_SCHEMA #15768

Open def- opened 1 year ago

def- commented 1 year ago

Jira Link: DB-5120

Description

Sometimes seen during upgrade tests:

testupgrade-aws-rf3-upgrade-2.13.2.0_135: Start
    (     0.551s) User Login : Success
    (     0.141s) Refresh YB Version : Success
    (    90.429s) Setup Provider : Success
    (     0.074s) Updating Health Check Interval to 60000 sec : Success
    (   270.762s) Create universe mkan-isd2978-1fb7caac3b-20230119-073500 : Success
    (    40.174s) Start sample workloads : Success
    (   700.441s) Creating V2 backups in S3 : Success
    (   279.916s) Upgrade Software to 2.16.1.0-b34 : Success
    (   860.991s) Executing yb-admin upgrade_ysql : Success
    (     0.001s) Executing yb-admin upgrade_ysql : >>> Integration Test Failed <<< 
Verifying during upgrade failed:
Query error: The catalog snapshot used for this transaction has been invalidated: MISMATCHED_SCHEMA
CONTEXT:  Catalog Version Mismatch: A DDL occurred while processing this query. Try again.

    (    19.288s) Saved server log files and keys at /share/jenkins/workspace/itest-system-developer/logs/2.16.1.0_testupgrade-aws-rf3-upgrade-2.13.2.0_135_20230119_081809 : Success
    (   125.286s) Destroy universe : Success
    (     0.242s) Check and stop workloads : Success
testupgrade-aws-rf3-upgrade-2.13.2.0_135: End

There are no DDLs running during upgrade, the verify thread only runs SELECT queries. my guess is that the upgrade_ysql is running a DDL and thus causing this failure during upgrade.

def- commented 1 year ago

Also happened on master from testupgrade-k8s-rf3-upgrade-2.12.10.0_41: Start

This looks quite sporadic though, only happened once on master, once on 2.16. I think this might not be a regression, but just chance that we are running into it now.

frozenspider commented 1 year ago

Interesting part:

ERR=Error running upgrade_ysql: Network error (yb/yql/pgwrapper/libpq_utils.cc:284): Unable to upgrade YSQL cluster:
Fetch 'SELECT major, minor FROM pg_catalog.pg_yb_migration  ORDER BY major DESC, minor DESC  LIMIT 1' failed: 7,
message: ERROR:  Query error: The catalog snapshot used for this transaction has been invalidated: MISMATCHED_SCHEMA (pgsql error XX000)
(aux msg ERROR:  Query error: The catalog snapshot used for this transaction has been invalidated: MISMATCHED_SCHEMA)

So this means upgrade's own query fails.

def- commented 1 year ago

My bad, I posted logs from a run where this failure was ignored, as it now is for 5 consecutive tried in upgrade_ysql.

The failure during another query (SELECT ename FROM xcluster_with_bootstrap_cdc.employees where id='user-id-1-1';) had this log:

2023-01-06 07:56:33,349 ssh_utils.py:300 INFO testupgrade-k8s-rf3-upgrade-2.15.3.2_1 CMD=/home/yugabyte/master/bin/yb-admin --timeout_ms 120000 --master_addresses 10.9.202.227:7100,10.9.135.122:7100,10.9.97.111:7100 upgrade_ysql, RETCODE=0, OUT=YSQL successfully upgraded to the latest version
, ERR=
INFO:root:OUT_STR:=YSQL successfully upgraded to the latest version

2023-01-06 07:56:33,446 test_universe_base.py:542 INFO testupgrade-k8s-rf3-upgrade-2.15.3.2_1 OUT_STR:=YSQL successfully upgraded to the latest version

INFO:root:Connection closed to <connection object at 0x7f542823f690; dsn: 'host=10.9.135.122 dbname=mv_upgrade_db2_15_3_2 user=yugabyte port=5433', closed: 1>
2023-01-06 07:56:33,446 database_operations.py:41 INFO verify_during_upgrade Connection closed to <connection object at 0x7f542823f690; dsn: 'host=10.9.135.122 dbname=mv_upgrade_db2_15_3_2 user=yugabyte port=5433', closed: 1>
INFO:root:ERR_STR:=
2023-01-06 07:56:33,452 test_universe_base.py:543 INFO testupgrade-k8s-rf3-upgrade-2.15.3.2_1 ERR_STR:=
ERROR:root:ITEST FAILED testupgrade-k8s-rf3-upgrade-2.15.3.2_1 : AssertionError('Verifying during upgrade failed:\nQuery error: The catalog snapshot used for this transaction has been invalidated: MISMATCHED_SCHEMA\nCONTEXT:  Catalog Version Mismatch: A DDL occurred while processing this query. Try again.\n')
2023-01-06 07:56:33,510 test_base.py:170 ERROR testupgrade-k8s-rf3-upgrade-2.15.3.2_1 ITEST FAILED testupgrade-k8s-rf3-upgrade-2.15.3.2_1 : AssertionError('Verifying during upgrade failed:\nQuery error: The catalog snapshot used for this transaction has been invalidated: MISMATCHED_SCHEMA\nCONTEXT:  Catalog Version Mismatch: A DDL occurred while processing this query. Try again.\n')
INFO:root:Select query: SELECT ename FROM xcluster_with_bootstrap_cdc.employees where id='user-id-1-1';
2023-01-06 07:56:33,511 database_operations.py:433 INFO testycqlxclusterwithbootstrapcdctransactionaltable-aws-rf3 Select query: SELECT ename FROM xcluster_with_bootstrap_cdc.employees where id='user-id-1-1';
kripasreenivasan commented 3 months ago

We observe the error Catalog Version Mismatch: A DDL occurred while processing this query. Try again.

kripasreenivasan commented 3 months ago

@massifcoder-yb , There was a test fix that went in a week back. Vishal will monitor the latest runs and update this ticket with the most recent results in a week's time.