yugabyte / yugabyte-db

YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
https://www.yugabyte.com
Other
8.94k stars 1.06k forks source link

[YSQL][LST] ERROR: Illegal state: Used read time is not set #12464

Open def- opened 2 years ago

def- commented 2 years ago

Jira Link: [DB-288]

Description

This failure happened using the Long-running System Test framework I'm currently working on. Yugabyte-DB is on code state 817b50bd83a8590474b86741b8592504dc6edd7c, Release build on Almalinux 8:

$ git checkout 32a7e77 && python3.9 ./long_system_test.py --runtime=0 --threads=10 --seed=773252 --complexity=full
[...]
2022-05-11 13:14:24,664 worker_3   ERROR    Unexpected query failure: InternalError_
Query: INSERT INTO tg1_3 (c0_jsonb, c1_int8range, c2_int) VALUES ('{"a": 2, "b": ["0", "1", "2", "3", "4", "5", "6", "7"], "c": true}'::jsonb, '(,49)'::INT8RANGE, -77);
  values: None
  runtime: 2022-05-11 13:14:24.661 - 2022-05-11 13:14:24.664
  supports explain: True
  supports rollback: True
  affected rows: None
Action: SingleInsertAction
Error class: InternalError_
Error code: XX000
Error message: ERROR:  Illegal state: Used read time is not set
Transaction isolation level: committed
DB Node: host: localhost, port: 5433
DB Backend PID: 61852

lst_2022-05-11_13:12:54_773252.zip Can be reproduced.

def- commented 2 years ago

There is also a follow-up failure that sometimes happens, can be reproduced on 1167626:

2022-05-11 13:23:55,539 worker_6   ERROR    Unexpected query failure: OperationalError
Query: UPDATE tg1_0 SET c1_numrange = '[-19.54874624280353,56.81990367222838)'::NUMRANGE WHERE FALSE;
  values: None
  runtime: 2022-05-11 13:23:54.392 - 2022-05-11 13:23:55.538
  supports explain: True
  supports rollback: True
  affected rows: 0
Action: SingleUpdateAction
Error class: OperationalError
Error code: None
Error message: ERROR:  Illegal state: Used read time is not set
ERROR:  Internal error: Attempted to rollback to savepoint before creating any savepoints.
ERROR:  Internal error: Attempted to rollback to savepoint before creating any savepoints.
ERROR:  Internal error: Attempted to rollback to savepoint before creating any savepoints.
ERROR:  Internal error: Attempted to rollback to savepoint before creating any savepoints.
PANIC:  ERRORDATA_STACK_SIZE exceeded
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
Transaction isolation level: repeatable read
DB Node: host: localhost, port: 5433
DB Backend PID: 65606
def- commented 2 years ago

This appears to be caused by SET CONFIG yb_debug_report_error_stacktrace TO on; Edit: Apparently not, still happens without it.

def- commented 2 years ago

I found an easy way to reproduce this (and the culprit config values, code state eeb799c354598d7ca7e2d09ff3547b4f85cdc5b1):

yugabyte=# create table collation_test_table_1(id text collate "POSIX", name text collate "POSIX", age int, PRIMARY KEY(id asc));
CREATE TABLE
yugabyte=# set yb_disable_transactional_writes=on;
set ysql_session_max_batch_size=1;
SET
SET
yugabyte=# insert into collation_test_table_1 values ('A','a', 22), ('a','A',50), ('z','z',40), ('Z','Z',30);
ERROR:  Illegal state: Used read time is not set

https://phabricator.dev.yugabyte.com/D13703 in combination with https://github.com/yugabyte/yugabyte-db/issues/11628 @emhna @nathanhjli If this is not expected to work in combination, maybe we should at least have a friendlier error message.

This is also sometimes failing for me during COPY commands:

2022-05-30 21:47:35.505 UTC [253] ERROR:  Illegal state: Used read time is not set
2022-05-30 21:47:35.505 UTC [253] STATEMENT:  COPY collation_test_table_1 FROM '/tmp/data.csv' DELIMITER ',' CSV HEADER

I'm running it like this on build 2.13.3.0-b118:

+    session.execute("set yb_disable_transactional_writes=on")
+    session.execute("set ysql_session_max_batch_size=1")
+    session.execute("set ysql_max_in_flight_ops=1000000")
+    ysql_copy_to(session, "/tmp/data.csv", table)
+    ysql_copy_from(session, "/tmp/data.csv", table)
+    session.execute("select * from pg_stat_progress_copy")
+    session.execute("set yb_disable_transactional_writes=off")
+    session.execute("set ysql_session_max_batch_size=0")  # use gflag default
+    session.execute("set ysql_max_in_flight_ops=10000")

Note that in LST I did not have ysql_session_max_batch_size and ysql_max_in_flight_ops set, but only yb_disable_transactional_writes. So this can still fail just by disabling transactional writes, but easier to reproduce with smaller batch sizes.

lnguyen-yugabyte commented 2 years ago

Specific failure raised: https://github.com/yugabyte/yugabyte-db/blob/master/src/yb/tserver/pg_client_session.cc#L1042

Status PgClientSession::CheckPlainSessionReadTime() {
  auto session = Session(PgClientSessionKind::kPlain);
  if (!session->read_point()->GetReadTime()) {
    ReadHybridTime used_read_time;
    {
      std::lock_guard<simple_spinlock> guard(plain_session_used_read_time_.lock);
      used_read_time = plain_session_used_read_time_.value;
    }
    >>>>> RSTATUS_DCHECK(used_read_time, IllegalState, "Used read time is not set");  // Failure here
    session->SetReadPoint(used_read_time);
    VLOG_WITH_PREFIX(3)
        << "Update read time from used read time: " << session->read_point()->GetReadTime();
  }
  return Status::OK();
}
tedyu commented 2 years ago
+    session.execute("select * from pg_stat_progress_copy")

COPY progress hasn't been checked in yet. I wonder where the above came from.

def- commented 2 years ago

I noticed that as well and removed that again. Comes from https://phabricator.dev.yugabyte.com/D17260

def- commented 2 years ago

@pkj415 Will check if this is fixed and reopen if it happens again.

def- commented 2 years ago

Still seems to happend @pkj415

def- commented 2 years ago

This is easily reproducible in LST runs during upgrade: https://phabricator.dev.yugabyte.com/D17456

2022-09-05 20:50:57,610 worker_3   ERROR    Unexpected query failure: InternalError_
Query: UPDATE tg0_3 SET c3_text = '6' WHERE TRUE;
  values: None
  runtime: 2022-09-05 20:50:57.592 - 2022-09-05 20:50:57.609
  supports explain: True
  supports rollback: True
  affected rows: None
Action: SingleUpdateAction
Error class: InternalError_
Error code: XX000
Error message: ERROR:  Illegal state: Used read time is not set
Transaction isolation level: repeatable read
DB Node: host: 10.9.128.138, port: 5433
DB Backend PID: 18074

Happened here in upgrade from 2.13.2.0-b135 to 2.15.3.0-b133. Note that it happens here even without yb_disable_transactional_writes.

kmuthukk commented 1 year ago

Possibly related to #16446 ?