cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

roachtest: schemachange/random-load failed #81094

Closed cockroach-teamcity closed 2 years ago

cockroach-teamcity commented 2 years ago

roachtest.schemachange/random-load failed with artifacts on master @ 98bdf3241028c9b1bdff429fb455e61870adc9d0:

          |  "workerId": 0,
          |  "clientTimestamp": "11:59:55.748064",
          |  "ops": [
          |   "BEGIN",
          |   "ALTER DATABASE schemachange SURVIVE REGION FAILURE"
          |  ],
          |  "expectedExecErrors": "22023,42602",
          |  "expectedCommitErrors": "",
          |  "message": "ROLLBACK; Successfully got expected execution error. Dumping state before death:\nExpected errors: 22023,42602===========================Executed queries for generating errors: QUERY [SELECT region FROM [SHOW REGIONS FROM DATABASE]] : \nQUERY [SHOW DATABASE] :schemachange\n===========================Previous statements [ALTER DATABASE schemachange SURVIVE REGION FAILURE]: ERROR: database must have associated regions before a survival goal can be set (SQLSTATE 42602)"
          | }
          | {
          |  "workerId": 0,
          |  "clientTimestamp": "11:59:55.778202",
          |  "ops": [
          |   "BEGIN",
          |   "ALTER TABLE schema921.table1029 SET LOCALITY REGIONAL BY ROW"
          |  ],
          |  "expectedExecErrors": "42P01",
          |  "expectedCommitErrors": "",
          |  "message": "ROLLBACK; Successfully got expected execution error. Dumping state before death:\nExpected errors: 42P01===========================Executed queries for generating errors: QUERY [\"SELECT EXISTS (\\n\\tSELECT table_name\\n    FROM information_schema.tables \\n   WHERE table_schema = $1\\n     AND table_name = $2\\n   )\" [\"schema921\" \"table1029\"]] :false\n===========================Previous statements [ALTER TABLE schema921.table1029 SET LOCALITY REGIONAL BY ROW]: ERROR: relation \"schema921.table1029\" does not exist (SQLSTATE 42P01)"
          | }
          | {
          |  "workerId": 0,
          |  "clientTimestamp": "11:59:55.797545",
          |  "ops": [
          |   "BEGIN",
          |   "ALTER DATABASE schemachange ADD REGION \"europe-west2\""
          |  ],
          |  "expectedExecErrors": "42P12",
          |  "expectedCommitErrors": "",
          |  "message": "ROLLBACK; Successfully got expected execution error. Dumping state before death:\nExpected errors: 42P12===========================Executed queries for generating errors: QUERY [SELECT region FROM [SHOW REGIONS FROM CLUSTER]] : us-east1,us-west1,europe-west2,\nQUERY [SELECT region FROM [SHOW REGIONS FROM DATABASE]] : \nQUERY [SHOW DATABASE] :schemachange\n===========================Previous statements [ALTER DATABASE schemachange ADD REGION \"europe-west2\"]: ERROR: cannot add region \"europe-west2\" to database schemachange (SQLSTATE 42P12)"
          | }
          | {
          |  "workerId": 0,
          |  "clientTimestamp": "11:59:54.887021",
          |  "ops": [
          |   "BEGIN",
          |   "ALTER DATABASE schemachange SURVIVE REGION FAILURE"
          |  ],
          |  "expectedExecErrors": "22023,42602",
          |  "expectedCommitErrors": "",
          |  "message": "ROLLBACK; Successfully got expected execution error. Dumping state before death:\nExpected errors: 22023,42602===========================Executed queries for generating errors: QUERY [SELECT region FROM [SHOW REGIONS FROM DATABASE]] : \nQUERY [SHOW DATABASE] :schemachange\n===========================Previous statements [ALTER DATABASE schemachange SURVIVE REGION FAILURE]: ERROR: database must have associated regions before a survival goal can be set (SQLSTATE 42602)"
          | }
        Wraps: (4) COMMAND_PROBLEM
        Wraps: (5) Node 1. Command with error:
          | ``````
          | ./workload run schemachange --verbose=1 --tolerate-errors=false  --histograms=perf/stats.json --max-ops 5000 --concurrency 20 --txn-log /mnt/data1/cockroach/transactions.json
          | ``````
        Wraps: (6) exit status 1
        Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *cluster.WithCommandDetails (4) errors.Cmd (5) *hintdetail.withDetail (6) *exec.ExitError
Help

See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md)

See: [How To Investigate \(internal\)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)

Same failure on other branches

- #70016 roachtest: schemachange/random-load failed [C-test-failure O-roachtest O-robot T-sql-schema branch-release-21.2]
- #61698 roachtest: schemachange/random-load failed [C-test-failure O-roachtest O-robot T-sql-schema branch-release-21.1]

/cc @cockroachdb/sql-schema


Jira issue: CRDB-15149

cockroach-teamcity commented 2 years ago

roachtest.schemachange/random-load failed with artifacts on master @ bc1ee7c7c276984fce8ff5ba4fcfcdff335dde50:

          | QUERY ["SELECT (((col2061_2066, col2061_2063))::STRING) IS NULL AS c FROM ( VALUES(0.10249499732686213:::FLOAT8,ARRAY[]:::FLOAT8[],ARRAY[3955667066:::OID,3583998912:::OID,3246013206:::OID,3929745433:::OID],e'\\U00002603':::STRING)) AS t(col2061_2063,col2061_2064,col2061_2065,col2061_2066);" []] :false
          | QUERY ["SELECT array[(CASE WHEN col2061_2064 IS NULL THEN '':::STRING ELSE e'\\U00002603':::STRING END)::STRING::STRING] AS c FROM ( VALUES(0.10249499732686213:::FLOAT8,ARRAY[]:::FLOAT8[],ARRAY[3955667066:::OID,3583998912:::OID,3246013206:::OID,3929745433:::OID])) AS t(col2061_2063,col2061_2064,col2061_2065);" []] :[["☃"]]
          | QUERY ["SELECT (((col2061_2063))::STRING) IS NULL AS c FROM ( VALUES(0.10249499732686213:::FLOAT8,ARRAY[]:::FLOAT8[],ARRAY[3955667066:::OID,3583998912:::OID,3246013206:::OID,3929745433:::OID],e'\\U00002603':::STRING)) AS t(col2061_2063,col2061_2064,col2061_2065,col2061_2066);" []] :false
          | QUERY ["\n    WITH tab_json AS (\n                    SELECT crdb_internal.pb_to_json(\n                            'desc',\n                            descriptor\n                           )->'table' AS t\n                      FROM system.descriptor\n                     WHERE id = $1::REGCLASS\n                  ),\n         columns_json AS (\n                        SELECT json_array_elements(t->'columns') AS c FROM tab_json\n                      ),\n         columns AS (\n                    SELECT (c->>'id')::INT8 AS col_id,\n                           IF(\n                            (c->'inaccessible')::BOOL,\n                            c->>'computeExpr',\n                            c->>'name'\n                           ) AS expr\n                      FROM columns_json\n                 ),\n         indexes_json AS (\n                         SELECT json_array_elements(t->'indexes') AS idx\n                           FROM tab_json\n                         UNION ALL SELECT t->'primaryIndex' FROM tab_json\n                      ),\n         unique_indexes AS (\n                            SELECT idx->'name' AS name,\n                                   json_array_elements(\n                                    idx->'keyColumnIds'\n                                   )::STRING::INT8 AS col_id\n                              FROM indexes_json\n                             WHERE (idx->'unique')::BOOL\n                        ),\n         index_exprs AS (\n                        SELECT name, expr\n                          FROM unique_indexes AS idx\n                               INNER JOIN columns AS c ON idx.col_id = c.col_id\n                     )\n  SELECT ARRAY['(' || array_to_string(array_agg(expr), ', ') || ')'] AS final_expr\n    FROM index_exprs\n   WHERE expr != 'rowid'\nGROUP BY name;\n" ["schema1165.table2061"]] :[["(col2061_2063)"] ["(col2061_2063)"]]
          | QUERY ["SELECT array[(CASE WHEN col2061_2064 IS NULL THEN '':::STRING ELSE e'\\U00002603':::STRING END)::STRING::STRING] AS c FROM ( VALUES(0.10249499732686213:::FLOAT8,ARRAY[]:::FLOAT8[],ARRAY[3955667066:::OID,3583998912:::OID,3246013206:::OID,3929745433:::OID])) AS t(col2061_2063,col2061_2064,col2061_2065);" []] :[["☃"]]: scanBool: "SELECT EXISTS ( SELECT * FROM schema1165.table2061 WHERE (col2061_2063)= ( SELECT  (col2061_2063) FROM (VALUES( 0.10249499732686213:::FLOAT8) ) AS T(col2061_2063) ) )" []: ERROR: internal error: in-between filters didn't yield a constraint (SQLSTATE XX000)
          |
          | stdout:
          | <... some data truncated by circular buffer; go to artifacts for details ...>
          | mestamp": "12:07:05.424975",
          |  "ops": [
          |   "BEGIN",
          |   "CREATE TABLE schema2219.table2364 AS SELECT schema1639.table1807.col542_545, schema1165.table2041.col360_377 FROM schema1639.table1807, schema1165.table2041",
          |   "ALTER DATABASE schemachange ADD REGION \"europe-west2\""
          |  ],
          |  "expectedExecErrors": "42P12",
          |  "expectedCommitErrors": "",
          |  "message": "ROLLBACK; Successfully got expected execution error. Dumping state before death:\nExpected errors: 42P12===========================Executed queries for generating errors: QUERY [SELECT region FROM [SHOW REGIONS FROM CLUSTER]] : europe-west2,us-east1,us-west1,\nQUERY [SELECT region FROM [SHOW REGIONS FROM DATABASE]] : \nQUERY [SHOW DATABASE] :schemachange\n===========================Previous statements [CREATE TABLE schema2219.table2364 AS SELECT schema1639.table1807.col542_545, schema1165.table2041.col360_377 FROM schema1639.table1807, schema1165.table2041 ALTER DATABASE schemachange ADD REGION \"europe-west2\"]: ERROR: cannot add region \"europe-west2\" to database schemachange (SQLSTATE 42P12)"
          | }
          | {
          |  "workerId": 0,
          |  "clientTimestamp": "12:07:06.371935",
          |  "ops": [
          |   "BEGIN",
          |   "CREATE TABLE schema2254.table2332 (col2332_2336 FLOAT4, col2332_2337 NAME, col2332_2338 REGPROC NULL, col2332_2339 TIME NOT NULL, col2332_2340 REGCLASS NULL, col2332_2341 JSONB, col2332_2342 STRING NULL, col2332_2343 REGPROCEDURE NOT NULL, col2332_2344 REGCLASS, col2332_2345 VARCHAR NOT NULL, col2332_2346 REGTYPE NOT NULL, col2332_2347 REGTYPE NULL, col2332_2348 FLOAT4 AS (col2332_2336 + (-0.2431679666042328):::FLOAT8) STORED, col2332_2349 STRING NOT NULL AS (lower(col2332_2345)) STORED, col2332_2350 STRING NULL AS (CASE WHEN col2332_2338 IS NULL THEN e'I\\x02h-':::STRING ELSE e'7_c\\x1b\\x1a0IH':::STRING END) STORED, col2332_2351 STRING AS (lower(CAST(col2332_2341 AS STRING))) STORED, col2332_2352 STRING NOT NULL AS (CASE WHEN col2332_2344 IS NULL THEN NULL ELSE NULL END) VIRTUAL, PRIMARY KEY (col2332_2339 DESC, col2332_2352 ASC), UNIQUE (col2332_2344))",
          |   "DROP TABLE public.table542 CASCADE",
          |   "COMMIT"
          |  ],
          |  "expectedExecErrors": "",
          |  "expectedCommitErrors": "",
          |  "message": "TXN RETRY ERROR; ERROR: restart transaction: TransactionRetryWithProtoRefreshError: TransactionRetryError: retry txn (RETRY_SERIALIZABLE - failed preemptive refresh due to a conflict: intent on key /Table/3/1/832/2/1): \"sql txn\" meta={id=357963ba key=/NamespaceTable/30/1/104/799/\"table2332\"/4/1 pri=0.00725053 epo=0 ts=1652357227.745103811,2 min=1652357226.371880018,0 seq=21} lock=true stat=PENDING rts=1652357226.582554822,2 wto=false gul=1652357226.871880018,0 (SQLSTATE 40001)"
          | }
          | {
          |  "workerId": 0,
          |  "clientTimestamp": "12:07:06.582662",
          |  "ops": [
          |   "BEGIN",
          |   "SELECT 'validating all objects', crdb_internal.validate_multi_region_zone_configs()",
          |   "CREATE TABLE schema1165.table2061 AS SELECT public.table1540.col152_165, public.table1540.col152_161, public.table1540.col152_170 FROM public.table1540"
          |  ],
          |  "expectedExecErrors": "42P07",
          |  "expectedCommitErrors": "",
          |  "message": "ROLLBACK; Successfully got expected execution error. Dumping state before death:\nExpected errors: 42P07===========================Executed queries for generating errors: QUERY [\"SELECT EXISTS (\\n\\tSELECT table_name\\n    FROM information_schema.tables \\n   WHERE table_schema = $1\\n     AND table_name = $2\\n   )\" [\"public\" \"table1540\"]] :true\nQUERY [\"SELECT EXISTS (\\n\\tSELECT schema_name\\n\\t\\tFROM information_schema.schemata\\n   WHERE schema_name = $1\\n\\t)\" [\"schema1165\"]] :true\nQUERY [\"SELECT EXISTS (\\n\\tSELECT table_name\\n    FROM information_schema.tables \\n   WHERE table_schema = $1\\n     AND table_name = $2\\n   )\" [\"schema1165\" \"table2061\"]] :true\n===========================Previous statements [SELECT 'validating all objects', crdb_internal.validate_multi_region_zone_configs() CREATE TABLE schema1165.table2061 AS SELECT public.table1540.col152_165, public.table1540.col152_161, public.table1540.col152_170 FROM public.table1540]: ERROR: relation \"schemachange.schema1165.table2061\" already exists (SQLSTATE 42P07)"
          | }
        Wraps: (4) COMMAND_PROBLEM
        Wraps: (5) Node 1. Command with error:
          | ``````
          | ./workload run schemachange --verbose=1 --tolerate-errors=false  --histograms=perf/stats.json --max-ops 5000 --concurrency 20 --txn-log /mnt/data1/cockroach/transactions.json
          | ``````
        Wraps: (6) exit status 1
        Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *cluster.WithCommandDetails (4) errors.Cmd (5) *hintdetail.withDetail (6) *exec.ExitError

ajwerner commented 2 years ago

Closing in the hope that this is sufficiently rare and that #80820, which is the cause of both failures, is fixed. If not, and it pops back up, we'll do something like https://github.com/cockroachdb/cockroach/pull/81291
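
For anyone triaging similar dumps: each logged entry above carries `expectedExecErrors` and `expectedCommitErrors` as comma-separated SQLSTATE codes, and the `message` field embeds the code that was actually returned. A minimal sketch, not part of the workload itself, that flags entries whose returned code was not expected might look like the following. The function name `unexpected_sqlstates` and the `tolerated` parameter are hypothetical, and treating `40001` (transaction retry) as tolerated is an assumption based on the `TXN RETRY ERROR` entry in the log.

```python
import json
import re

# Matches the "(SQLSTATE 42P12)" suffix that appears in the logged messages.
SQLSTATE_RE = re.compile(r"\(SQLSTATE ([0-9A-Z]{5})\)")


def unexpected_sqlstates(entry, tolerated=("40001",)):
    """Return SQLSTATE codes found in entry["message"] that were not expected.

    `tolerated` is an assumption of this sketch: retry errors (40001) are
    treated as benign because the log labels them "TXN RETRY ERROR".
    """
    expected = set(tolerated)
    for field in ("expectedExecErrors", "expectedCommitErrors"):
        # Both fields are comma-separated strings; empty means "none expected".
        expected.update(code for code in entry.get(field, "").split(",") if code)
    seen = SQLSTATE_RE.findall(entry.get("message", ""))
    return [code for code in seen if code not in expected]


# Abbreviated copy of one logged entry (message shortened for readability).
expected_entry = json.loads(r'''
{
  "workerId": 0,
  "ops": ["BEGIN", "ALTER DATABASE schemachange ADD REGION \"europe-west2\""],
  "expectedExecErrors": "42P12",
  "expectedCommitErrors": "",
  "message": "ERROR: cannot add region \"europe-west2\" to database schemachange (SQLSTATE 42P12)"
}
''')

# Hypothetical entry shaped like the internal error in the second report.
internal_entry = {
    "expectedExecErrors": "",
    "expectedCommitErrors": "",
    "message": "ERROR: internal error: in-between filters didn't yield a constraint (SQLSTATE XX000)",
}

print(unexpected_sqlstates(expected_entry))  # nothing unexpected
print(unexpected_sqlstates(internal_entry))  # flags XX000
```

Applied to a full transaction log, a filter like this would skip the many "Successfully got expected execution error" entries and surface only cases such as the `XX000` internal error.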