Closed bllewell closed 4 years ago
Very nice and thorough description @bllewell ! Crash stacktrace I got when trying case number 3:
TRAP: FailedAssertion("!(cacheId >= 0 && cacheId < (USERMAPPINGUSERSERVER + 1) && ((const void*)(SysCache[cacheId]) != ((void*)0)))", File: "../../../../../../../src/postgres/src/backend/utils/cache/syscache.c", Line: 1463)
*** Aborted at 1587722423 (unix time) try "date -d @1587722423" if you are using GNU date ***
PC: @ 0x7fff5eda32c6 __pthread_kill
*** SIGABRT (@0x7fff5eda32c6) received by PID 5508 (TID 0x7000041f3000) stack trace: ***
@ 0x7fff5ee4db5d _sigtramp
@ 0xffffffff (unknown)
@ 0x7fff5ed0d6a6 abort
@ 0x10f117a0e ExceptionalCondition
@ 0x10f112d17 SearchSysCache1
@ 0x10f0488bd format_type_extended
@ 0x10f02a6f0 array_cat
@ 0x10f166347 evalExpr
@ 0x10f16619d YbgEvalExpr
@ 0x10e21d3bc yb::docdb::DocPgEvalExpr()
@ 0x10e251737 yb::docdb::DocExprExecutor::EvalTSCall()
@ 0x1141f4d3c yb::QLExprExecutor::EvalExpr()
@ 0x10e251a9d yb::QLExprExecutor::EvalExpr()
@ 0x10e298d01 yb::docdb::PgsqlWriteOperation::ApplyUpdate()
@ 0x10e298023 yb::docdb::PgsqlWriteOperation::Apply()
@ 0x10e204199 yb::docdb::ExecuteDocWriteOperation()
@ 0x10cd2c8d8 yb::tablet::Tablet::StartDocWriteOperation()
@ 0x10cd313c6 yb::tablet::Tablet::KeyValueBatchFromPgsqlWriteBatch()
@ 0x10cd31bac yb::tablet::Tablet::AcquireLocksAndPerformDocOperations()
@ 0x10cda70ab yb::tablet::TabletPeer::WriteAsync()
@ 0x10bf4da2e yb::tserver::TabletServiceImpl::Write()
@ 0x1104c2994 yb::tserver::TabletServerServiceIf::Handle()
@ 0x1138b23f3 yb::rpc::ServicePoolImpl::Handle()
@ 0x11380a6a9 yb::rpc::InboundCall::InboundCallTask::Run()
@ 0x1138cdeff yb::rpc::(anonymous namespace)::Worker::Execute()
@ 0x114f3a783 yb::Thread::SuperviseThread()
@ 0x7fff5ee562eb _pthread_body
@ 0x7fff5ee59249 _pthread_start
@ 0x7fff5ee5540d thread_start
As such, this is likely a bug in expression pushdown framework (#3297)
Here is a trivial workaround for the || operator. It capitalizes on what "Testcase No, 1" shows.
drop table if exists t cascade;
create table t(k int primary key, arr int[]);
insert into t(k, arr)
values (1, '{3, 4, 5}'::int[]);
select arr as "old value of arr" from t where k = 1;
with v as (
select '{1, 2}'::int[]||arr||6::int as new_arr from t where k = 1)
update t
set arr = (select new_arr from v)
where k = 1;
select arr as "new value of arr" from t where k = 1;
Notice that table t
does have a primary key constraint. I ran this ten times in rapid succession. It produced the expected result, first this:
old value of arr
------------------
{3,4,5}
and then this:
new value of arr
------------------
{1,2,3,4,5,6}
every time.
The same workaround is effective for array_remove()
:
create table t(k int primary key, arr int[]);
insert into t(k, arr)
values (1, '{1, 2, 2, 2, 5, 6}'::int[]);
select arr as "old value of arr" from t where k = 1;
with v as (
select array_remove(arr, 2) as new_arr from t where k = 1)
update t
set arr = (select new_arr from v)
where k = 1;
select arr as "new value of arr" from t where k = 1;
Notice that, here too, table t
does have a primary key constraint. I ran this ten times in rapid succession. It produced the expected result, first this:
old value of arr
------------------
{1,2,2,2,5,6}
and then this:
new value of arr
------------------
{1,5,6}
every time.
It works, too, for array_replace
:
drop table if exists t cascade;
drop type if exists rt cascade;
create type rt as (f1 int, f2 text);
create table t(k int primary key, arr rt[]);
insert into t(k, arr)
values (1, '{"(1,rabbit)","(2,hare)","(3,squirrel)","(4,horse)"}'::rt[]);
select arr as "old value of arr" from t where k = 1;
with v as (
select array_replace(arr, '(3,squirrel)', '(3,bobcat)')
as new_arr from t where k = 1)
update t
set arr = (select new_arr from v)
where k = 1;
select arr as "new value of arr" from t where k = 1;
Notice that, here too, table t
does have a primary key constraint. I ran this ten times in rapid succession. It produced the expected result, first this:
old value of arr
------------------------------------------------------
{"(1,rabbit)","(2,hare)","(3,squirrel)","(4,horse)"}
and then this:
new value of arr
----------------------------------------------------
{"(1,rabbit)","(2,hare)","(3,bobcat)","(4,horse)"}
every time.
@bllewell Should be fixed by 7ec4a77. I tested all examples above and they all work now.
Added 15-May-2020.
I ran each of the testcases in Version 2.1.6.0, MacOS, (downloaded from the Internet). Each ran without error.
Using YB version 2.1.4.0 on MacOS Mojave Version 10.14.6; RF=1, single node.
Summary
When each of these:
is used like this:
then errors like this occur:
Following this, the server has crashed. We need to do
yb-ctl restart
before normal service is resumed. All sorts of other errors are possible, as shown in each of the following specific testcases. Not all of them require arestart
to recover.Each of the testcases runs without error on vanilla PG (Version 11.2).
In all cases, it's enough to declare
table t
without a primary key to make the error disappear. And, when this is done, the outcomes are identical to those that the same scripts produce on vanilla PG.Testcase no. 1: array_replace() with array of row type values
These tests run withour error:
But this test consistently errors:
It produces an error like this:
And now
yb-ctl status
shows this:Testcase no. 2: array_replace() with text[] array
These tests run withour error:
But this test consistently errors.
Sometimes, the
update
reports this nonsense error:And sometimes the
update
goes silenty and theselect
causes this:Or sometimes
for type 0
.And sometimes the
select
causes this:Note: Try with this:
Here, value of
arr
together with the value of the second "replace with" parameter mean thatarray_replace()
actually have cannot have an effect. But still we see errors.Testcase no. 3: the || operator with int[] array
This test consistently fails:
It cases an error like this:
requiring a
restart
.Testcase no. 4: array_cat() with int[] array
This test consistently fails:
It cases an error like this:
requiring a
restart
.Testcase no. 5: array_append() with int[] array
This test consistently fails:
It cases an error like this:
requiring a
restart
.Testcase no. 6: array_prepend() with int[] array
This test consistently fails:
It cases an error like this:
requiring a
restart
.Once or twice, I've seen this different error immediately following the
update
attempt:Testcase no. 7: array_remove() with int[] array
This test fails with different outcomes:
Usually, I see this:
But, once or twice, I've seen this:
This doesn't crash the server.