nocaway opened this issue 3 years ago
All of the tests that are described in these comments, and that are implemented in the attached .zip files, were run after starting my YB cluster with this YSQL configuration parameter setting:
ysql_suppress_unsupported_error=true
The default for this parameter is false. And when the default setting is used, every 0A000 occurrence that in my tests is a warning (and that I suppress by setting client_min_messages to error) is a genuine error: any statement that causes 0A000 rolls back and has no effect.
NOTE: the list that starts this report is wrong. Here are the statements that don't draw a warning (when client_min_messages is set to warning):
fetch from cur;
fetch next from cur;
fetch forward from cur;
fetch :N from cur;
fetch forward :N from cur;
where :N is an integer that's greater than zero.
This issue needs some testcases. In particular, it doesn't mention limitations in PL/pgSQL cursor functionality that stem from the underlying SQL functionality limitations.
I just tested everything using YB-2.17.0.0. My tests are attached.
It actually turns out that every operation that's listed above as "not yet supported" runs quietly without error and produces the expected result as long as you do this:
set client_min_messages = error;
This suggests that, with another round of careful testing, this warning, and others like it for other variants of fetch, can simply be removed:
0A000: FETCH BACKWARD not supported yet
Similarly, move works fine when warnings are suppressed:
set client_min_messages = error;
create function s.f(n in int)
returns int
set search_path = pg_catalog, pg_temp
language plpgsql
as $body$
declare
cur refcursor := 'cur';
vv int;
begin
open cur for select v from s.t order by v;
move absolute n in cur;
fetch cur into vv;
return vv;
end;
$body$;
select s.f(4);
Notice that, when client_min_messages is warning, even create function, when the body has a move statement, draws the warning:
0A000: MOVE not supported yet
You get it again at runtime. But you also get the right result.
The following shows the testcase concept. However, the real testcase (attached) uses fine-grained control over client_min_messages to set it to error only when this is needed.
Connect as an ordinary user to a database on which that user has the create privilege. First create and populate a suitable test table.
-- Notice that the entire testcase runs with this setting.
set client_min_messages = error;
drop schema if exists s cascade;
create schema s;
create table s.t(k serial primary key, v int not null);
insert into s.t(v) select g.v from generate_series(5, 25) as g(v);
Next, create a procedure that declares a session-duration cursor. (There are no explicit PL/pgSQL statements that allow this. But dynamic SQL manages it without a problem.) The advantage of using a session-duration cursor for these tests is simply that you can create it once and forget about it without having to do everything within an explicit transaction. (An explicit transaction makes ad hoc queries tedious because even a trivial typo aborts the transaction so that you have to start again from scratch.)
create procedure s.declare_cur()
set search_path = pg_catalog, pg_temp
security definer
language plpgsql
as $body$
begin
begin
execute 'close cur;';
exception
when invalid_cursor_name then null;
end;
execute
'declare cur scroll cursor with hold '||
'for select v::text from s.t where v between 11 and 20 order by v;';
end;
$body$;
Next, create a function that encapsulates the various fetch operations, using (according to the lang formal argument) either raw SQL (via the execute statement) or the dedicated PL/pgSQL fetch statement variants.
create function s.fetched_row(lang in text, kind in text, cur in refcursor)
returns text
set search_path = pg_catalog, pg_temp
security definer
language plpgsql
as $body$
declare
v text;
begin
case lang
when 'sql' then
execute format('fetch %s from cur', kind) into v;
when 'plpgsql' then
case kind
when 'first' then
fetch first from cur into v;
when 'next' then
fetch next from cur into v;
when 'absolute 5' then
fetch absolute 5 from cur into v;
when 'relative 3' then
fetch relative 3 from cur into v;
when 'relative 0' then
fetch relative 0 from cur into v;
when 'relative -5' then
fetch relative -5 from cur into v;
when 'last' then
fetch last from cur into v;
when 'prior' then
fetch prior from cur into v;
end case;
end case;
return case
when v is null then null
else rpad(kind, 13)||': '||v
end;
end;
$body$;
Next, create a function that encapsulates the various PL/pgSQL move statement variants, matching the variants one-to-one with the fetch variants.
create function s.move_report(kind in text, cur in refcursor)
returns text
set search_path = pg_catalog, pg_temp
security definer
language plpgsql
as $body$
begin
case kind
when 'first' then
move first in cur;
when 'next' then
move next in cur;
when 'absolute 5' then
move absolute 5 in cur;
when 'relative 3' then
move relative 3 in cur;
when 'relative 0' then
move relative 0 in cur;
when 'relative -5' then
move relative -5 in cur;
when 'last' then
move last in cur;
when 'prior' then
move prior in cur;
end case;
return kind;
end;
$body$;
Finally, create a driver table function that exercises all the variants according to its lang formal argument. When lang is sql, it exercises all of the SQL fetch variants but, of course, there's no "move". When lang is plpgsql, it exercises all of the PL/pgSQL fetch variants and all of the PL/pgSQL move variants.
create function s.cur_rows_sql(lang in text, cur in refcursor)
returns table(z text)
set search_path = pg_catalog, pg_temp
security definer
language plpgsql
as $body$
declare
v text;
kind_first constant text not null := 'first';
kind_next constant text not null := 'next';
kind_abs_5 constant text not null := 'absolute 5';
kind_rel_3 constant text not null := 'relative 3';
kind_rel_0 constant text not null := 'relative 0';
kind_rel_neg_5 constant text not null := 'relative -5';
kind_last constant text not null := 'last';
kind_prior constant text not null := 'prior';
begin
z := s.fetched_row(lang, kind_first, cur); return next;
z := '-----'; return next;
loop
z := s.fetched_row(lang, kind_next, cur);
exit when z is null;
/* */ return next;
end loop;
z := '-----'; return next;
z := s.fetched_row(lang, kind_abs_5, cur); return next;
z := s.fetched_row(lang, kind_rel_3, cur); return next;
z := s.fetched_row(lang, kind_rel_0, cur); return next;
z := s.fetched_row(lang, kind_rel_0, cur); return next;
z := s.fetched_row(lang, kind_last, cur); return next;
z := s.fetched_row(lang, kind_rel_neg_5, cur); return next;
z := '-----'; return next;
loop
z := s.fetched_row(lang, kind_prior, cur);
exit when z is null;
/* */ return next;
end loop;
if lang = 'plpgsql' then
z := '----- "move" tests (plpgsql only)'; return next;
begin
z := s.move_report(kind_first, cur); return next;
z := s.fetched_row(lang, kind_next, cur); return next;
z := s.move_report(kind_abs_5, cur); return next;
z := s.fetched_row(lang, kind_next, cur); return next;
z := s.move_report(kind_rel_3, cur); return next;
z := s.fetched_row(lang, kind_next, cur); return next;
z := s.move_report(kind_last, cur); return next;
z := s.move_report(kind_rel_neg_5, cur); return next;
z := s.fetched_row(lang, kind_next, cur); return next;
end;
end if;
end;
$body$;
Run it first in YB and then in vanilla PG. Spool the output (using the "\o" meta-command), using suitable names like yb.txt and pg.txt, so that you can diff them.
First, create the session-duration cursor and execute the few SQL fetch variants that get many rows each. (These cannot be used from PL/pgSQL.)
call s.declare_cur();
fetch 3 from cur;
fetch forward 2 from cur;
fetch forward all from cur;
fetch backward 4 from cur;
fetch backward all from cur;
Now invoke the s.cur_rows_sql() encapsulation using both its sql mode and its plpgsql mode:
select z from s.cur_rows_sql('sql', 'cur');
select z from s.cur_rows_sql('plpgsql', 'cur');
Diffing the two spool files shows that they are identical.
This is presented as a single file, issue-6514.sql inside the attached issue-6514.zip. It's totally self-contained and you can simply start it at the ysqlsh or the psql prompt—with one caveat: it starts with this:
drop schema if exists s cascade;
So, of course, you'll have to improvise a bit if the database that you connect to happens already to have a schema called s that's owned by a role other than the one as which you connect. Notice that the script uses this device to spool to a file that reflects whether you run the test in YB or PG:
select (version() like '%YB%')::text as is_yb
\gset
\if :is_yb
\o yb.txt
\else
\o pg.txt
\endif
The attached issue-6514.zip also contains the spool files yb.txt and pg.txt that were generated just before making the .zip.
Here, the code decorates each output line with an asterisk if it required warnings to be suppressed. This means that diff will indeed show differences because, of course, you don't get any warnings in PG and so there's no need to suppress them. However, it's very easy to read past these asterisks and see that the actual returned values are identical in YB and PG.
What follows provides some background for understanding how things are supposed to work.
nocaway asked me a question in a YB-internal Slack exchange:
He didn't ask a second question. But he should have:
I can't find explicit answers to these questions in the PG doc. But the answers are implied by a general understanding of how query execution works and of how SQL statements define outcomes in the face of concurrent activity (where the effect of one session's changes affects the set of rows another session's SQL statement sees).
Here's a simple mental model for what it means to move and fetch, forwards and backwards at will in or from a cursor. Use this table as the target for the cursor's defining subquery:
create table s.t(k int primary key, v int not null);
select setseed(0.0::float8);
with s(v) as (select generate_series(1, 10))
insert into s.t(k, v)
select v, round(100*random()) from s;
Now inspect the seventh row in the result set for an unrestricted subquery with no order by against s.t. (Assume that you can be sure that no other sessions can change the contents of s.t during the test.) It doesn't matter that, as the rules of SQL promise, you cannot in general predict which row it will be. Even so, within the tiny timespans that matter in this test, repeating this query, time and again, will always get the same result—within a particular database in a particular PG or YB cluster.
with c(rnk, v) as (
select row_number() over(), v from s.t)
select rnk, v from c where rnk = 7;
The random() function always generates the same sequence of values in successive invocations that follow invoking setseed() with the same actual argument. This is the result (using YB):
rnk | v
-----+----
7 | 80
(It's different, of course, in PG because of the different physical storage systems that YB and PG use.)
Now declare a cursor and also "snapshot" the result set of its defining subquery into the table pg_temp.cursor_simulation:
create table pg_temp.cursor_simulation(rnk int not null, v int not null);
start transaction;
declare cur scroll cursor without hold for
select row_number() over() as rnk, v from s.t;
insert into pg_temp.cursor_simulation(rnk, v)
select row_number() over(), v from s.t;
Once again, we can be sure that the subquery that defines both the cursor and the content of the pg_temp.cursor_simulation table will produce the same result set each of the two times that it's executed. Inspect the seventh row here too:
fetch absolute 7 from cur;
select rnk, v from pg_temp.cursor_simulation where rnk = 7;
The result is the same for both methods—and is the same as the free-standing subquery got immediately before start transaction.
Finally, traverse the entire result set, in both directions, using first the cursor and then the pg_temp.cursor_simulation table:
move last in cur;
move next in cur;
fetch backward all from cur;
move prior in cur;
fetch forward all from cur;
and:
select rnk, v from pg_temp.cursor_simulation order by rnk desc;
select rnk, v from pg_temp.cursor_simulation order by rnk asc;
The results are the same for both approaches, thus:
rnk | v
-----+----
10 | 78
9 | 77
8 | 39
7 | 80
6 | 55
5 | 28
4 | 34
3 | 20
2 | 84
1 | 91
and:
rnk | v
-----+----
1 | 91
2 | 84
3 | 20
4 | 34
5 | 28
6 | 55
7 | 80
8 | 39
9 | 77
10 | 78
In summary, this experiment informs a simple mental model:
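In outline: the cursor behaves like a snapshot of its defining subquery's result set, taken when the cursor is declared, together with a movable current position. The following Python sketch is illustrative only (it is not how PostgreSQL implements cursors, and it covers only non-negative absolute positions); the ten values reuse the result set shown above.

```python
# Illustrative model only (not PostgreSQL internals): a scrollable cursor
# is a frozen snapshot of the result set plus a current position.
# Position 0 is "before the first row"; len(rows) + 1 is "after the last".

class CursorModel:
    def __init__(self, rows):
        self.rows = list(rows)   # snapshot taken at "declare" time
        self.pos = 0             # 0 = before the first row

    def fetch_absolute(self, n):
        # Clamp the position just outside the data at either end.
        self.pos = max(0, min(n, len(self.rows) + 1))
        return self._current()

    def fetch_relative(self, n):
        return self.fetch_absolute(self.pos + n)

    def fetch_next(self):
        return self.fetch_relative(1)

    def fetch_prior(self):
        return self.fetch_relative(-1)

    def _current(self):
        if 1 <= self.pos <= len(self.rows):
            return self.rows[self.pos - 1]
        return None              # fetching past either end finds no row

# The ten values match the traversal results shown above.
cur = CursorModel([91, 84, 20, 34, 28, 55, 80, 39, 77, 78])
print(cur.fetch_absolute(7))   # 80, matching "fetch absolute 7 from cur"
print(cur.fetch_next())        # 39
print(cur.fetch_prior())       # 80
```

In this model, move is simply a fetch whose return value is discarded; and a fetch next after move absolute 0 returns the first row because position 0 means "just before where the data starts".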
Ensure that you issue rollback to finish the previous test. Set up for the next test as for the previous one. But make the choice of isolation level explicit.
start transaction isolation level read committed;
declare cur scroll cursor without hold for
select row_number() over() as rnk, v from s.t;
Now do this variant on move and fetch from the previous test:
move absolute 0 in cur;
fetch forward all from cur;
This is the result:
rnk | v
-----+----
1 | 91
2 | 84
3 | 20
4 | 34
5 | 28
6 | 55
7 | 80
8 | 39
9 | 77
10 | 78
Before going further, make sure that the session where you do this is the only session that's connected to the present database. Then connect to the same database as the same user in a second session and do this:
update s.t set v = v + 1000;
Make sure that, in this second session, the plain \set metacommand shows this usual setting:
AUTOCOMMIT = 'on'
Back in the first session, repeat the move and fetch from the cursor:
move absolute 0 in cur;
fetch forward all from cur;
If you do this test using PG, then you'll see that the results haven't changed—even though a second session has committed changes to the table that the cursor's subquery references and the first session is in flight in a read committed transaction. Emphasize the point by doing this (still within the same ongoing transaction):
select k, v from s.t order by k;
This does show the committed changes that the second session made, consistent with the proper semantics of the read committed isolation level.
k | v
----+------
1 | 1084
2 | 1039
3 | 1078
4 | 1080
5 | 1091
6 | 1020
7 | 1034
8 | 1077
9 | 1028
10 | 1055
So far, the results are identical, when you do the steps in this exact order, in both PG and in YB. However (because of a YB bug—see below) you'll see the intended behavior only in PG if you do the second session's update statement before the first move and fetch in the first session.
This is the intention:
Now stress test this rule by changing the order of the steps, thus:
When you do this in PG, the outcome is exactly the same as when you use the first order. But when you do it in YB, the move and fetch get a different outcome, thus:
rnk | v
-----+------
1 | 1091
2 | 1084
3 | 1020
4 | 1034
5 | 1028
6 | 1055
7 | 1080
8 | 1039
9 | 1077
10 | 1078
Now repeat the update statement in the second session and come back to the first session and repeat the move and fetch. Now the results are unchanged. In summary:
We must regard the PG behavior as tautologically correct. And we must therefore conclude this:
Of course, for the use cases for which YB is intended, concurrent data changes within the relevant tables are the common case.
The execution plan for the cursor's defining subquery can sometimes be executed correctly in both the forwards and the backwards directions. It's easy to picture this:
Here, the scheme that supports a forwards sequential scan over the table's rows can just as well support a backwards scan. Many execution plans are reversible in this way. (Experts on the pgsql-general email list have suggested that this is the common case.) However, not all execution plans are reversible. Here is a simple but compelling example:
create function s.series()
returns table(v int)
set search_path = pg_catalog, pg_temp
language plpgsql
as $body$
begin
raise info 'series() invoked';
for v in (select generate_series(1, 11)) loop
return next;
end loop;
end;
$body$;
create function s.filter(v in int)
returns boolean
set search_path = pg_catalog, pg_temp
language plpgsql
as $body$
begin
raise info '%', 'filter('||to_char(v, '99')||') invoked';
return (v < 10) and (v != all(array[1, 2, 4, 7]));
end;
$body$;
Now do this at the psql (or ysqlsh) prompt:
start transaction;
declare cur scroll cursor without hold for
select v from s.series() where s.filter(v);
It finishes silently and without error. Notice that the function s.series() is opaque to the planner. It simply returns its result set in the order that each next row is computed. And it's impossible for the planner to execute it any other way. This means that, in order to support the operations like fetch prior that declaring the cursor as scrollable allows, the results (at least the unfiltered results from the generate_series() function) must be cached. I can't find this detail in the PG doc. But PG expert Laurenz Albe mentions it in his blog post WITH HOLD cursors and transactions in PostgreSQL.
Start with this to see the tests that are made on the output of generate_series() to restrict it to the result set that the cursor's subquery defines:
move absolute 0 in cur;
fetch all from cur;
The move absolute 0 statement produces no SQL output and no raise info output. (This is because its meaning, "go to just before where the data starts", is fixed.) The fetch all statement produces this SQL output:
3
5
6
8
9
And raise info reports this (the <-- annotations were added manually):
series() invoked
filter( 1) invoked
filter( 2) invoked
filter( 3) invoked <--
filter( 4) invoked
filter( 5) invoked <--
filter( 6) invoked <--
filter( 7) invoked
filter( 8) invoked <--
filter( 9) invoked <--
filter( 10) invoked
filter( 11) invoked
Now do this:
move relative -2 in cur;
It produces no SQL output. And raise info reports this:
filter( 11) invoked
filter( 10) invoked
filter( 9) invoked
filter( 8) invoked
Notice that this time, we do not see series() invoked in the raise info output. This, and the results from the previous move and fetch, together tell us that the unfiltered output from generate_series() is cached, and that it's now held in a structure that has frozen the order (whatever it was) in which the rows were produced. And it tells us that the where clause is applied anew to the cached raw data on every fetch or move invocation. If you create this function:
create function select_list_fn(i in int)
returns int
set search_path = pg_catalog, pg_temp
language plpgsql
as $body$
begin
raise info '%', 'select_list_fn('||to_char(i, '99')||') invoked';
return i;
end;
$body$;
and invoke it in the select list, then you'll see that it, too, is applied after the fact to the cached raw data.
Given these facts (that just the raw data is cached and that select list and where clause functions are applied to the cached data at fetch time), I cannot see how to design an experiment that would show whether or not ordinary table data is, or is not, cached on the first fetch.
This detail seems to be undocumented and also seems to be pointlessly inefficient. (Contrast it with what happens, see below, when the cursor is declared as holdable.) I asked about it on the pgsql-general list HERE. But I didn't get a useful reply.
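The observed without hold behavior can be summarized in a small Python model. This is a sketch only, assuming exactly the behavior demonstrated by the raise info output above (the set-returning function's raw rows are cached once, while the where-clause filter is re-applied on every scan); the class and function names are invented for illustration.

```python
# Illustrative model (not PostgreSQL internals) of the WITHOUT HOLD
# behavior observed above: the raw rows from the set-returning function
# are cached on first use, but the where-clause filter is re-applied to
# the cached rows on every scan.

calls = {"series": 0, "filter": 0}

def series():
    # Stands in for s.series(), i.e. generate_series(1, 11).
    calls["series"] += 1
    return list(range(1, 12))

def keep(v):
    # Stands in for s.filter(v).
    calls["filter"] += 1
    return v < 10 and v not in (1, 2, 4, 7)

class WithoutHoldCursor:
    def __init__(self):
        self.raw = None                  # raw rows, cached on first use

    def fetch_all(self):
        if self.raw is None:
            self.raw = series()          # the source runs exactly once
        return [v for v in self.raw if keep(v)]  # the filter runs every time

cur = WithoutHoldCursor()
print(cur.fetch_all())   # [3, 5, 6, 8, 9]
cur.fetch_all()          # series() is NOT invoked again...
print(calls)             # ...but keep() has now run 22 times:
                         # {'series': 1, 'filter': 22}
```

Scanning twice invokes series() once but keep() twenty-two times, mirroring the repeated "filter(...) invoked" lines seen in the raise info output.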
Here, it doesn't matter whether or not the execution plan is reversible because, by definition of holdable, the result set is materialized at cursor declaration time. Rollback to finish the example from the previous section and do this:
declare cur scroll cursor with hold for
select v from s.series() where s.filter(v);
We see no SQL output. But we do see raise info output already now, thus:
INFO: series() invoked
INFO: filter( 1) invoked
INFO: filter( 2) invoked
INFO: filter( 3) invoked
INFO: filter( 4) invoked
INFO: filter( 5) invoked
INFO: filter( 6) invoked
INFO: filter( 7) invoked
INFO: filter( 8) invoked
INFO: filter( 9) invoked
INFO: filter( 10) invoked
INFO: filter( 11) invoked
Now use the same move and fetch that we used in the previous section:
move absolute 0 in cur;
fetch all from cur;
It produces the same SQL result as in the without hold test, as expected:
3
5
6
8
9
But we see no raise info output. This tells us that, in contrast to how it's done for a without hold cursor, here (for a with hold cursor) it's the already-filtered result set that's cached (at the moment that the cursor is declared). If you create the function select_list_fn(i in int) and include it in the cursor's subquery, then you'll see that this, too, is taken into account at cursor declaration time to define the values that are cached.
There seems to be a very strange and arbitrary difference between how caching is done for the with hold cursor and the without hold cursor. There seems to be no rationale for deferring the use of select list and where clause functions at caching time and then applying them after the fact to the cached data. This difference in approach harms the performance of the without hold case relative to the with hold case but makes no useful semantic difference. Apparently, the without hold cursor is more commonly used than the with hold cursor. (Of course, you would see differences in output if the filter() function or the select_list_fn() function were not immutable. But using such a function in a SQL statement is anyway unsafe.)
Finally, because the entire result set for the cursor's subquery is materialized at declaration time, it is bound to be properly immune to the effects of changes made to the contents of the tables that the subquery references during the cursor's lifetime.
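For contrast, the with hold behavior can be modeled the same way (again a sketch under the assumptions stated above, not PostgreSQL internals; the names are invented): the filtered result set is built once, at declaration time, and later fetches just read the materialized list.

```python
# Illustrative model (not PostgreSQL internals) of the WITH HOLD behavior:
# the already-filtered result set is materialized when the cursor is
# declared, so later fetches touch neither the source nor the filter.

calls = {"series": 0, "filter": 0}

def series():
    # Stands in for s.series(), i.e. generate_series(1, 11).
    calls["series"] += 1
    return list(range(1, 12))

def keep(v):
    # Stands in for s.filter(v).
    calls["filter"] += 1
    return v < 10 and v not in (1, 2, 4, 7)

class WithHoldCursor:
    def __init__(self):
        # Materialized at declaration time: the filter runs exactly once.
        self.rows = [v for v in series() if keep(v)]

    def fetch_all(self):
        return list(self.rows)           # no re-evaluation on fetch

cur = WithHoldCursor()   # series() and keep() run here, at "declare" time
print(cur.fetch_all())   # [3, 5, 6, 8, 9]
cur.fetch_all()          # no further calls to series() or keep()
print(calls)             # {'series': 1, 'filter': 11}
```

This matches what the experiment shows: all the raise info output appears at declare time, and none appears at fetch time.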
The attached issue-6514-2.zip contains a self-contained testcase that explores the degrees of freedom suggested by:
the proposal from nocaway in a Slack exchange to use a table thus:
create table s.h_k__v(
h int not null, k int not null, v int not null,
primary key(h, k));
instead of the table (paraphrased here):
create table s.k__v(
k int not null, v int not null,
primary key(k));
that the testcase issue-6514.zip used.
The ideas explained above in the section "What's expected when you scroll forwards and backwards in a cursor? And what's expected when the population of the result set that it defines is changed during the cursor's lifetime?".
This led to a complicated testcase:
This brings the total to thirty-five different tests. Moreover, some of the tests required interleaving table updates from a second session. So (without adding the complication of, say, a Python driver program) this implies following the steps manually in two concurrent sessions.
Here are the high-level conclusions:
You get wrong results for all cursor definitions when the target table is updated between declaring the cursor and first fetching from it. And this happens both when the updates are issued in the same session that does the declare and fetch; and when the updates are issued from a second concurrent session. I made a small testcase that demonstrates the wrong results even when only supported fetch variants are used and I filed a new issue with that testcase:
Issue #16832 Wrong "INSENSITIVE" semantics for "FETCH FROM CURSOR"
You get two kinds of wrong results (on top of what the first bullet states) for this (but only this) cursor-defining subquery:
select ... from h_k__v where h = 1 order by k
Here, you always see the wrong rows when particular move and fetch sequences are used; and you see that updates done from a second session even after move and fetch have been used a few times, affect the values in the fetched rows—but only when you use a without hold cursor.
NOTE: nocaway also proposed testing with cursor subqueries that have no order by. Such cursor declarations are included in the kit but are not used in the tests that are described. You can try them yourself by hand. But you can't mechanically compare the YB results with the PG results because differences in the storage mechanisms bring, with no ordering, different orderings in the two envs. Therefore, when the tests sample just the first few, the middle few, and the last few rows, all the values that you see are different. Even so, it's easy to conclude, by manual inspection:
that these two defining subqueries behave the same as each other:
select ... from k__v order by k
select ... from k__v
and do not show the second kind of wrong results.
and that these two defining subqueries behave the same as each other:
select ... from h_k__v where h = 1 order by k
select ... from h_k__v where h = 1
and do show the second kind of wrong results.
Without tracing what's going on in the underlying implementation, these results seem to be bizarrely random. It's impossible to tell, therefore, if yet more tests would reveal yet more errors.
However, the high-level conclusion is clear. Until the root causes are identified and fixed:
And even this conclusion is subject to the caveat that YB is vulnerable to DML from other sessions in the race condition window between declaring a cursor and first fetching from it—while PG is not vulnerable in this way.
To run the testcase (for the case that no concurrent table changes are made) simply define the psql variables db and u:
\set db < a convenient scratch database >
\set u < a user that has "create" on the database you chose >
and invoke 0.sql at the psql or ysqlsh prompt. You'll have to read this whole comment section to learn how to run the testcase in all of its variants.
issue-6514-2.zip contains these files:
0.sql
1-cr-tables.sql
2-cr-proc-open-cursor.sql
3-cr-fn-rows-from-cursor.sql
4-cr-rule-off.sql
5-run-test.sql
6-run-all-tests.sql
This has the two create table statements for the tables k__v and h_k__v shown above and these create procedure statements to populate and to update them:
create procedure s.populate_tab_k__v(no_of_nows in int)
set search_path = pg_catalog, pg_temp
language plpgsql
as $body$
begin
truncate table s.k__v;
with s(v) as (select generate_series(1, no_of_nows))
insert into s.k__v(k, v)
select v, (no_of_nows*10 + v) from s;
end;
$body$;
create procedure s.populate_tab_h_k__v(no_of_nows in int)
set search_path = pg_catalog, pg_temp
language plpgsql
as $body$
begin
-- Symmetrical with the code for table "s.k__v".
...
insert into s.h_k__v(h, k, v)
select 1, v, (no_of_nows*10 + v) from s;
end;
$body$;
create procedure s.update_tables(do_it in boolean)
set search_path = pg_catalog, pg_temp
language plpgsql
as $body$
begin
if do_it then
update s.k__v set v = v + 90;
update s.h_k__v set v = v + 90;
end if;
end;
$body$;
This creates a procedure to declare the cursor cur in nine different ways:
create procedure s.open_cursor(mode in int)
set search_path = pg_catalog, pg_temp
language plpgsql
as $body$
declare
prelude constant text not null := 'declare cur scroll cursor without hold for select ';
s_list constant text not null := 's.fmt(k)||s.fmt(v)';
begin
begin
execute 'close cur';
exception when invalid_cursor_name then null;
end;
case mode
...
when 4 then execute
prelude||s_list||
' from s.h_k__v where h = 1 order by k';
...
when 9 then execute
replace(prelude, 'without', 'with')||s_list||
' from s.h_k__v where h = 1 order by k';
end case;
end;
$body$;
It also creates the trivial fmt() helper function:
create function s.fmt(i in int)
returns text
set search_path = pg_catalog, pg_temp
language plpgsql
as $body$
begin
return to_char(i, '9999999');
end;
$body$;
And it creates a trivial function to display the columns is_holdable and statement for the cursor cur from the pg_cursors catalog view:
create function s.defining_subquery()
returns text
set search_path = pg_catalog, pg_temp
language plpgsql
as $body$
declare
with_hold boolean not null := false;
stmt text not null := '';
begin
select is_holdable, statement
into strict with_hold, stmt
from pg_cursors where name = 'cur';
-- Improve the readability.
declare
holdable constant text not null := case with_hold
...
end;
subq constant text not null := replace(...stmt...);
begin
return holdable||subq;
end;
end;
$body$;
This is used to annotate the spooled testcase output. Here is its output for the seven modes that the mechanical test uses. (Whitespace has been added by hand.)
mode
----
2 not holdable select ... from k__v order by k
4 not holdable select ... from h_k__v where h = 1 order by k
5 not holdable select ... from h_k__v where h = 1 order by v
6 not holdable select ... from h_k__v order by k
7 not holdable select ... from h_k__v order by v
8 holdable select ... from k__v order by k
9 holdable select ... from h_k__v where h = 1 order by k
This creates the table function s.rows_from_cursor(). It performs these operations:
Forwards and backwards at the start
------------------------------------
fetch first from cur
fetch next from cur
fetch prior from cur
-- Check the count.
------------------------------------
move absolute 0 in cur
move forward all in cur
get diagnostics n = row_count
assert n = cnt
Backwards and forwards at the end
------------------------------------
fetch prior from cur
fetch prior from cur
fetch next from cur
Forwards and backwards in the middle
------------------------------------
move absolute 0 in cur
move relative cnt/2 in cur
fetch relative 0 from cur
fetch next from cur
fetch prior from cur
Backwards and forwards in the middle
------------------------------------
fetch relative 0 from cur
fetch prior from cur
fetch next from cur
Forwards and backwards at the start
------------------------------------
fetch first from cur
fetch next from cur
fetch prior from cur
The return value after each is compared with the expected value. The expected value is defined, tautologically, as what the test produces in vanilla PG. There, the set of results that the schedule of fetches produces is the same in all thirty-five cases (each of the seven cursor declarations that the mechanical test uses combined with each of the five concurrent update variants). Sometimes, in YB, the fetch finds no row (because of the bugginess that the testcase reveals). This is detected as a null return value. And this is transformed, using coalesce(), into <no row> for readability. Only when the return value differs from the expected value does the table function return it. This makes the buggy cases very easy to spot.
This creates the trivial function s.rule_off() to add captions like this to the spooled output. Here's an example that shows a buggy outcome:
================================================================================
not holdable: select ... from s.h_k__v where h = 1 order by k
================================================================================
--- First attempt ------------------------------------------
Forwards and backwards at the start
3 100003
Backwards and forwards at the end
<no row>
<no row>
<no row>
Forwards and backwards in the middle
5002 105002
5003 105003
5004 105004
Backwards and forwards in the middle
5006 105006
5007 105007
5008 105008
Forwards and backwards at the start
3 100003
--- Second attempt -----------------------------------------
Forwards and backwards at the start
3 100003
...
Here it is:
\c :db :u
set client_min_messages = error;
call s.populate_tab_k__v(10000);
call s.populate_tab_h_k__v(10000);
\if :without_hold
start transaction;
call s.open_cursor(:open_cursor_mode);
select s.rule_off(s.defining_subquery());
/*
In this session OR in concurrent session.
\c :db :u
*/;
call s.update_tables(:update_before_first_cursor_use);
select s.rule_off('First attempt', 2);
select s.rows_from_cursor();
select s.rule_off('Second attempt', 2);
select s.rows_from_cursor();
/*
In this session OR in concurrent session.
\c :db :u
*/;
call s.update_tables(:update_after_first_cursor_use);
select s.rule_off('Third attempt', 2);
select s.rows_from_cursor();
rollback;
\else
start transaction;
call s.open_cursor(9);
select s.rule_off(s.defining_subquery());
/*
In this session OR in concurrent session:
\c :db :u
*/;
call s.update_tables(:update_before_first_cursor_use);
commit;
select s.rule_off('First attempt', 2);
select s.rows_from_cursor();
/*
In this session OR in concurrent session:
\c :db :u
*/;
call s.update_tables(:update_after_first_cursor_use);
select s.rule_off('Second attempt', 2);
select s.rows_from_cursor();
\endif
Notice the psql variables whose values are accessed using :without_hold, :open_cursor_mode, :update_before_first_cursor_use, and :update_after_first_cursor_use. Three of the concurrent update scenarios can be run mechanically. In these cases for concurrent updates:
the testing can be done simply by invoking 5-run-test.sql from the driver script 6-run-all-tests.sql (below) and by setting update_before_first_cursor_use and update_after_first_cursor_use before each run.
But the remaining two cases for concurrent updates:
must be stepped through manually by setting the psql variable open_cursor_mode by hand, switching to the second session at the right moment to issue call s.update_tables(true), and then switching back to the first session to finish the sequence. Notice that you must choose manually which leg of the \if :without_hold meta-command test to run to match the value that you set for open_cursor_mode. Values 2, 4, 5, 6, and 7 call for setting without_hold to true. And values 8 and 9 call for setting it to false.
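To make the manual procedure concrete, here is a sketch of the sequence for one of the hand-stepped cases (open_cursor_mode 4, so the without_hold leg), assuming that the s.* procedures and functions from the attached scripts are already installed:

```sql
-- Session 1 (assumes the s.* procedures from the attached scripts exist).
start transaction;
call s.open_cursor(4);
select s.rule_off(s.defining_subquery());

-- Now switch to Session 2 (a separate psql connection to the same database)
-- and do the concurrent update there:
--   call s.update_tables(true);

-- Back in Session 1, finish the sequence:
select s.rule_off('First attempt', 2);
select s.rows_from_cursor();
rollback;
```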
Here it is:
\c :db :u
set client_min_messages = error;
-- Set these in turn to "false, false", "true, false", or "false, true".
\set update_before_first_cursor_use false
\set update_after_first_cursor_use false
select (version() like '%YB%')::text as is_yb
\gset
\t on
\if :is_yb
\o output/yb.txt
\else
\o output/pg.txt
\endif
\set without_hold true
\set open_cursor_mode 2
\ir 5-run-test.sql
\set open_cursor_mode 4
\ir 5-run-test.sql
\set open_cursor_mode 5
\ir 5-run-test.sql
\set open_cursor_mode 6
\ir 5-run-test.sql
\set open_cursor_mode 7
\ir 5-run-test.sql
\set without_hold false
\set open_cursor_mode 8
\ir 5-run-test.sql
\set open_cursor_mode 9
\ir 5-run-test.sql
\o
\t off
Here they are:
| Cursor's subquery | Holdable | No updates | Updates in same session before first fetch | Updates in same session after first fetch | Updates in other session before first fetch | Updates in other session after first fetch |
|---|---|---|---|---|---|---|
| select … from k__v order by k | No | OK | Buggy | OK | Buggy | OK |
| select … from h_k__v where h = 1 order by k | No | Buggy | Double buggy | Buggy | Double buggy | Double buggy |
| select … from h_k__v where h = 1 order by v | No | OK | Buggy | OK | Buggy | OK |
| select … from h_k__v order by k | No | OK | Buggy | OK | Buggy | OK |
| select … from h_k__v order by v | No | OK | Buggy | OK | Buggy | OK |
| select … from k__v order by k | Yes | OK | Buggy | OK | Buggy | OK |
| select … from h_k__v where h = 1 order by k | Yes | OK | Buggy | OK | Buggy | OK |
For all tests except for this one:
mode
----
4 not holdable select ... from h_k__v where h = 1 order by k
"Buggy" means that the cursor sees the effect of updates that are made to the table it queries, whether these are made in the same session or in a second session, between the invocation of open_cursor() and the first invocation of rows_from_cursor().
And for the tests for this exceptional holdability and subquery, plain "Buggy" means that any fetch after scrolling backwards sees the wrong row or no row at all; and "Double buggy" means what plain "Buggy" means together with seeing the effect of changes made to the underlying table during the cursor's lifetime. Notice that here, and only here, this departure from the proper behavior (MVCC as of the moment that the cursor is opened) occurs in one extra case: when the updates are done in a second session even after having already invoked rows_from_cursor() a few times.
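As a reminder of what "proper behavior" means here, this minimal, self-contained sketch (using a hypothetical table s.mvcc_demo, not one of the attached test tables) shows what vanilla PG guarantees for a cursor that's already open when the underlying table changes:

```sql
-- Vanilla PG: a cursor's results reflect the state of the database
-- as of the moment that the cursor was opened.
create table s.mvcc_demo(k int primary key, v int not null);
insert into s.mvcc_demo(k, v) values (1, 10), (2, 20);

start transaction;
declare c scroll cursor without hold for
  select k, v from s.mvcc_demo order by k;
fetch first from c;   -- 1 | 10

-- In a second session: update s.mvcc_demo set v = v + 90;

fetch first from c;   -- still 1 | 10 in vanilla PG: the update is invisible
rollback;
```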
It's useful to look at move and fetch outcomes for the problem cursor definition. Before starting, make sure that you're not in an ongoing transaction:
rollback;
(If client_min_messages happens to be set to error, then rollback will always finish silently, whether or not you're in an ongoing transaction, and will leave you with no ongoing transaction.) Now set up like this. (Only basic, top-level, SQL statements, with the exception of one anonymous block, are used in this section.)
truncate table s.h_k__v;
with s(v) as (select generate_series(1, 100))
insert into s.h_k__v(h, k, v)
select 1, v, (1000 + v) from s;
start transaction;
declare cur scroll cursor without hold for
select k, v from s.h_k__v where h = 1 order by k;
First, confirm that none of these statements draws a warning and that each produces the correct results:
set client_min_messages = warning;
fetch from cur;
fetch next from cur;
fetch forward from cur;
fetch 3 from cur;
fetch forward 3 from cur;
Here are the nine results, in order:
1 | 1001
2 | 1002
3 | 1003
4 | 1004
5 | 1005
6 | 1006
7 | 1007
8 | 1008
9 | 1009
All other fetch variants, and all of the move variants, draw the 0A000 warning; for example:
fetch relative 0 from cur;
Moreover, it gets the wrong answer, thus:
11 | 1011
And simply repeating it (which you might think would get the same wrong answer), gets a new wrong answer:
13 | 100013
Now suppress the warnings and try some other examples:
rollback;
set client_min_messages = error;
start transaction;
declare cur scroll cursor without hold for
select k, v from s.h_k__v where h = 1 order by k;
fetch first from cur;
fetch next from cur;
fetch prior from cur;
Here are the three results, in order:
1 | 1001
2 | 1002
3 | 1003
The first two results are correct; but the third is wrong.
This means that the fact that fetch prior draws just a warning is (arguably) itself a bug. It should (arguably) cause a straight error—and produce no result.
You can repeat this sequence of three fetch statements, still in the same ongoing transaction, as many times as you want. The results are always the same. This seems to suggest that fetch first always gets the right result and that it's therefore a bug that it draws the warning:
0A000: FETCH FIRST not supported yet
Now use the popular technique that, in PG, famously gets you the correct count and correct rows, at the moment that the cursor is declared, even in the presence of heavy concurrent changes to the table from other concurrent sessions:
do $body$
declare
cur constant refcursor not null := 'cur';
n int not null := 0;
begin
move absolute 0 in cur;
move forward all in cur;
get diagnostics n = row_count;
assert n = 100, 'Bad count';
end;
$body$;
It finishes silently, time and again. This seems to suggest that move absolute 0 always gets the right result and that it's therefore a bug that it draws the (wrongly worded) warning:
0A000: FETCH ABSOLUTE not supported yet
It's then followed by this:
0A000: MOVE not supported yet
Now look at the effect of concurrent table changes from another session:
rollback;
set client_min_messages = error;
drop function if exists s.f(int) cascade;
create function s.f(i in int)
returns text
set search_path = pg_catalog, pg_temp
language sql
as $body$
select i::text;
$body$;
truncate table s.h_k__v;
with s(v) as (select generate_series(1, 100))
insert into s.h_k__v(h, k, v)
select 1, v, (1000 + v) from s;
\t on
start transaction;
declare cur scroll cursor without hold for
select s.f(k), v
from s.h_k__v
where h = 1
order by k
;
fetch first from cur;
fetch next from cur;
fetch prior from cur;
/*
DO BY HAND
/* In second session*/ update s.h_k__v set v = v + 90;
fetch first from cur;
*/;
It produces these results. First:
1 | 1001
2 | 1002
3 | 1003
And then, after updating the table in the second session:
1 | 1091
So this test has shown the unique "double-buggy" outcome with just a few top-level fetch statements and without the possible complication of invoking the move and fetch operations using PL/pgSQL.
Notice, however, that the manifestation of the final wrong value for v (1091 instead of 1001) is sensitive to what seem to be irrelevant changes. It goes away if you make any one of these changes, by itself, with respect to the code as presented:
Notice that you can remove "order by k" altogether. Because of the table's definition and how YB works under the covers, this doesn't change the ordering of the results. This change preserves the buggy behavior!
Notice that all of these tests produce only correct results using vanilla PG.
Cursor functionality in YB, as of this comment's timestamp, is seen to be massively buggy when you suppress warnings and test the unsupported operations. But when you avoid these, you lose almost all of the functionality that makes the feature interesting. The PG community regards the functionality as very useful. For example, PG expert Laurenz Albe has published these two recent blog posts on the topic:
Both of these posts use operations that draw the "unsupported" warning in YB and that, in certain tests, produce wrong results. He also uses the functionality in code that he's published for migrating data from foreign sources into PG. See HERE.
The fact that my tests show many cases of bizarre, but reproducible, wrong behavior makes me suspect that there must be some gross bugs in YB's implementation—so gross that the causes ought to be identifiable by single-stepping through the internal implementation for these tests.
I have a hunch that the reversibility property for the candidate plans for the subquery that defines a cursor might be one of the culprits. And maybe the code, adopted from PG, that intends to determine the reversibility property for a plan hasn't been properly modified to accommodate the fact that YB's storage model is so very different from PG's. This hunch is strengthened by the fact that the result set is always cached for a with hold cursor and that, here, the random navigation that you do with move and positional fetch never gets wrong results. (The fact that the MVCC moment is advanced in YB, w.r.t. PG, from the moment that declare is executed seems to be a separable problem.)
It would be interesting to intervene in the reversibility test so that it always says "not reversible" and then to try all my testcases.
Look at the one row in the table of results above that has cells that say "Double buggy". This is the full select that defines the cursor for that row:
select s.fmt(k)||s.fmt(v)
from s.h_k__v
where h = 1 order by k
I added a new version of the procedure s.open_cursor() to examine this more closely. I put it in this file:
investigating-effect-of-selecting-row-number/2s-cr-proc-open-cursor.sql
This is its content:
drop procedure if exists s.open_cursor(int) cascade;
create procedure s.open_cursor(mode in int)
set search_path = pg_catalog, pg_temp
language plpgsql
as $body$
declare
prelude constant text not null := 'declare cur scroll cursor without hold for select ';
s_list_1 constant text not null := 's.fmt(0)|| s.fmt(k)||s.fmt(v)';
s_list_2 constant text not null := 's.fmt((row_number() over())::int)||s.fmt(k)||s.fmt(v)';
begin
begin
execute 'close cur';
exception when invalid_cursor_name then null;
end;
case mode
when 4 then execute
prelude||s_list_1||' from s.h_k__v where h = 1 order by k';
when 99 then execute
prelude||s_list_2||' from s.h_k__v where h = 1 order by k';
end case;
end;
$body$;
These, then, are the defining select statements for the two modes:
-- mode => 4
select s.fmt(0)||s.fmt(k)||s.fmt(v)
from s.h_k__v
where h = 1 order by k
and:
-- mode => 99
select s.fmt((row_number() over())::int)||s.fmt(k)||s.fmt(v)
from s.h_k__v
where h = 1 order by k
I added a corresponding new version of the script to run the tests to focus on just the two modes 4 and 99 in this file:
investigating-effect-of-selecting-row-number/6s-run-all-tests.sql
This is its content:
\set db d0
\set u d0$u0
:c
\c :db :u
set client_min_messages = error;
\set VERBOSITY verbose
set search_path = pg_catalog, pg_temp;
--------------------------------------------------------------------------------
\ir 2s-cr-proc-open-cursor.sql
\set update_before_first_cursor_use false
\set update_after_first_cursor_use false
select (version() like '%YB%')::text as is_yb
\gset
\t on
\if :is_yb
\o output/yb.txt
\else
\o output/pg.txt
\endif
\set without_hold true
\set open_cursor_mode 4
\ir ../5-run-test.sql
\set open_cursor_mode 99
\ir ../5-run-test.sql
\o
\t off
I didn't bother with the other combinations for update_before_first_cursor_use and update_after_first_cursor_use.
Simply cd to the investigating-effect-of-selecting-row-number directory (in the directory where you unzipped issue-6514-2.zip) and create the two files 2s-cr-proc-open-cursor.sql and 6s-run-all-tests.sql there using the content shown above. Then create a directory output as a peer to these two scripts and start 6s-run-all-tests.sql first using PG and then using YB. This is what I saw:
This observation strengthens the theory that the YB bugginess stems from this:
Maybe the plan could be executed in the backwards direction, but this is simply implemented wrong. Or maybe it can't be reversed, and so the code that detects reversibility needs to detect such cases.
As I reason it, the plan for a select that has row_number() in its select list can't be reversed because the successive values for row_number() can only emerge as the rows are produced one-by-one. And when the plan starts to produce rows, it has no idea how many there'll be. So it can't count downwards from the total number of rows.
I'm guessing that we still use the code that we inherited from PG that detects when row_number() over() is present and returns "non-reversible". Because this causes the result set (in the forwards direction) to be cached, the net effect is bound to be correct for backwards operations.
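A minimal sketch (hypothetical table, not one of the attached test tables) shows the intended net effect in vanilla PG: even though the row_number() plan can't be run backwards, backward scrolling still works because the forward results are cached and replayed:

```sql
-- Vanilla PG: scrolling backwards over a row_number() query works because
-- the executor materializes the forward results rather than reversing the plan.
create table s.rn_demo(v int primary key);
insert into s.rn_demo(v) select generate_series(1, 5);

start transaction;
declare c scroll cursor without hold for
  select row_number() over(order by v) as pos, v from s.rn_demo order by v;
fetch forward all from c;   -- pos lines up with v, 1 through 5
fetch backward all from c;  -- the same rows replayed in reverse order
rollback;
```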
This account describes the overall design of some tests that explore if/when the result set that a cursor defines is cached and how the possible caching affects the actual values that move and fetch produce when volatile functions are used in the subquery that defines that cursor.
It also draws some conclusions from the results.
The attached issue-6514-3.zip contains these three .sql scripts:
The 0-*.sql script is the master script. It invokes the 1-*.sql and 2-*.sql scripts. The scripts produce both raise info output—from the functions s.f1(), s.f2(), and s.series()—and ordinary .sql output. Commands like this, at the O/S prompt (connecting both to YB and to PG) are used to produce spool files that interleave the raise info output with the SQL output:
psql -h u -p 5433 -d ... -U ... -f 0-result-set-caching-tescase.sql &> yb.txt
Reference copies of the two spool files, yb-0.txt and pg-0.txt, are also included in issue-6514-3.zip.
These tests use three helper functions, thus:
s.f1() (volatile) — This uses a user-defined run-time parameter my.var to hold (the text typecast of) an int value as its state. Each successive invocation increments the state by 1 and returns that new int value. This function is sometimes used in the select list of the subquery that defines a cursor.
s.f2() (volatile) — This is a variant of s.f1(). It holds and increments state in the same way. But it uses a case expression to compute the return value thus:
return
case
when v between 1 and 5 then 1
when v between 6 and 10 then 2
when v between 11 and 15 then 3
when v between 16 and 20 then 4
when v between 21 and 25 then 5
else 42
end;
This function is sometimes used in the where clause of the subquery that defines a cursor.
s.series() (immutable) — This is simply a wrapper for generate_series(1, 5).
Each of these functions uses raise info to report when it's invoked.
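For reference, here is a sketch of how s.f1() and s.series() might be written. This is my reconstruction from the descriptions above (the attached scripts contain the actual code); the state-holding scheme uses current_setting() and set_config() on the user-defined run-time parameter my.var:

```sql
-- A sketch of s.f1(): volatile, holding its int state (as text) in my.var.
create function s.f1()
  returns int
  language plpgsql
  volatile
as $body$
declare
  v constant int :=
    coalesce(nullif(current_setting('my.var', true), ''), '0')::int + 1;
begin
  perform set_config('my.var', v::text, false);
  raise info 'f1() invoked';
  return v;
end;
$body$;

-- A sketch of s.series(): an immutable wrapper for generate_series(1, 5).
create function s.series()
  returns table(v int)
  language plpgsql
  immutable
as $body$
begin
  raise info 'series() invoked';
  return query select g.v from generate_series(1, 5) as g(v);
end;
$body$;
```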
The test declares a cursor in various different ways to explore these degrees of freedom:
Without hold or with hold.
The from list for the cursor's defining subquery uses the table s.t (with an order by) or the function s.series().
Including s.f1() in the select list or not.
Including the restriction where v = s.f2() or not.
Including row_number() over(order by v) in the select list or not.
These five binary degrees of freedom bring 32 different combinations. But, here, only six of these are included in the test to keep it manageably small. This is enough to show the general pattern:
First step through the testcase by hand, observing how the raise info output interleaves with the SQL output. It's enough to do this using just YB.
Then do this, at the O/S prompt, to get both the raise info output and the SQL output into the same spool file:
# YB
psql -h u -p 5433 -d d0 -U d0\$u0 -f 0-result-set-caching-tescase.sql &> yb.txt
# Vanilla PG
psql -h u -p 5432 -d d0 -U d0\$u0 -f 0-result-set-caching-tescase.sql &> pg.txt
And then diff yb.txt and pg.txt. You'll see that they are identical. The attached issue-6514-3.zip contains reference copies of these as yb-0.txt and pg-0.txt.
Read the script to see the detail. This is the outline.
drop schema if exists s cascade;
create schema s;
create function s.caption(t in text, main in boolean = false)
returns table(z text)
...
language plpgsql
as $body$
...
$body$;
create function s.f1()
returns int
...
language plpgsql
as $body$
...
$body$;
create function s.f2()
returns int
...
language plpgsql
as $body$
...
$body$;
create function s.series()
returns table(v int)
...
language plpgsql
as $body$
...
$body$;
create table s.t(k serial primary key, v int not null);
select s.caption('Populating table s.t,', true);
insert into s.t(v) select v from s.series();
/*
Sanity test. In general, you can't rely on the order of evaluation
of the select list items. But the results of these little tests seem
to be repeatable.
set my.var = 0;
select s.f1() as f1, s.f1() as f2, s.f1() as f3, s.f1() as f4, s.f1() as f5;
set my.var = 0;
select s.f2() as f2, s.f2() as f2, s.f2() as f3, s.f2() as f4, s.f2() as f5;
*/;
The s.f1() test gets this result on the first invocation of the select:
1 | 2 | 3 | 4 | 5
And you see "INFO: f1() invoked" five times. Successive repeats of the same select return ever-increasing values from f1() so that repeating the select for the second time produces this:
6 | 7 | 8 | 9 | 10
The s.f2() test gets this result on the first invocation of the select, and you see "INFO: f2() invoked" five times:
1 | 1 | 1 | 1 | 1
The first repeat of the same select returns this:
2 | 2 | 2 | 2 | 2
And so it goes on. The between ranges in f2() were chosen so that it will return the same value for each row when it's called thus:
select s.f2() from s.series();
The script precedes each substantive fetch or move statement with s.caption() to help the readability of the spooled output, thus:
select s.caption('fetch first'); fetch first from cur;
select s.caption('fetch first'); fetch first from cur;
select s.caption('move relative 3'); move relative 3 from cur;
select s.caption('fetch forward all'); fetch forward all from cur;
select s.caption('fetch prior'); fetch prior from cur;
select s.caption('fetch backward all'); fetch backward all from cur;
select s.caption('fetch first'); fetch first from cur;
Notice that the very same \ir 2-fetch-and-move.sql is used in all tests.
set my.var = 0;
start transaction;
select s.caption('declare');
declare cur scroll cursor without hold for
select s.f1(), v from s.t order by v;
\ir 2-fetch-and-move.sql;
rollback;
These are the SQL results:
1 | 1
2 | 1
6 | 5
7 | 5
8 | 4
9 | 3
10 | 2
11 | 1
12 | 1
Look at one of the two identical spool files to see how the raise info output interleaves with the SQL output. It always comes before the SQL output.
The outcomes are consistent with an implementation where the results are delivered simply by executing the plan explicitly in the forward or backward direction as requested, including therefore evaluating s.f1(), for every single fetch or move execution.
NOTE: These observations are also consistent with an implementation that caches the raw result set from scanning the table and that invokes f1() whenever the cache is accessed. The test cannot distinguish between the two alternative hypotheses.
This is identical to Test 1 except for this difference:
set my.var = 0;
start transaction;
select s.caption('declare');
declare cur scroll cursor with hold for
select s.f1(), v from s.t order by v;
select s.caption('commit');
commit;
\ir 2-fetch-and-move.sql;
close cur;
In other words, without hold is replaced by with hold and the explicitly started transaction that declares the cursor is immediately committed. Then each fetch is done in ordinary autocommit mode. And then the cursor is explicitly closed with close cur.
Here, we see INFO: f1() invoked exactly five times, immediately following commit.
After that, we see no more raise info output. These are the SQL results:
1 | 1
1 | 1
5 | 5
5 | 5
4 | 4
3 | 3
2 | 2
1 | 1
1 | 1
The values of v are the same as in Test 1. But the values reported for f1() are different. Now they are neatly paired with the values for v.
These results are consistent with an implementation where the entire result set is cached, when the transaction that creates the with hold cursor issues commit. Then all cursor accesses using fetch or move access the cached result set. Notice that (in contrast to what Test 3 shows) the result of invoking f1() in the select list is also cached.
set my.var = 0;
start transaction;
select s.caption('declare');
declare cur scroll cursor without hold for
select s.f1(), v from s.series();
\ir 2-fetch-and-move.sql;
rollback;
In other words, it is identical to Test 1 except for the single difference that this:
from s.t order by v
is replaced with this:
from s.series()
The thinking is that, because the immutable user-defined function s.series() is opaque, the execution plan cannot be run backwards—and so at least the raw results must be cached. (The order by is dispensed with in Test 3 because the results from s.series() are guaranteed to be produced in the order that they are computed.) The fact that s.series() starts with raise info 'series() invoked' allows this hypothesis to be tested.
The SQL output for Test 3 is identical to that for Test 1 (as it must be for correctness). But the raise info output differs in just one way. Test 3's output starts with "INFO: series() invoked" while Test 1's output, of course, does not. This message is never seen again. (The "INFO: f1() invoked" output is identical in both tests.)
This tells us that, at least in this test, the raw results from executing s.series() are cached while the function f1() is invoked when a result is fetched from the cache.
This is identical to Test 1 except that it extends the select list with row_number(). The idea is that the plan for such a query cannot be reversed.
set my.var = 0;
start transaction;
select s.caption('declare');
declare cur scroll cursor without hold for
select row_number() over(order by v) as pos, s.f1(), v from s.t order by v;
\ir 2-fetch-and-move.sql;
rollback;
These are the SQL results:
1 | 1 | 1
1 | 1 | 1
5 | 5 | 5
5 | 5 | 5
4 | 4 | 4
3 | 3 | 3
2 | 2 | 2
1 | 1 | 1
1 | 1 | 1
These results are, for the select list items s.f1() and v, identical to those for Test 2. And the results for row_number() line up with those for v, just as the order by demands. When you look at the spool file, you'll see that the INFO: f1() invoked output is seen only when a row that has not so far been fetched or moved over is fetched or moved over for the first time. These observations are consistent with the hypothesis that a row is added to the result cache on demand when it is first produced. Moreover, the cached row value includes the effect of the select list function so that it is not re-invoked when the cache is accessed.
This is the same as Test 1 except that it removes s.f1() from the select list and adds the restriction where v = s.f2().
set my.var = 0;
start transaction;
select s.caption('declare');
declare cur scroll cursor without hold for
select v from s.t where v = s.f2() order by v;
\ir 2-fetch-and-move.sql;
rollback;
These are the SQL results:
1
1
1
1
When you look at the spool file, you'll see that the INFO: f2() invoked output is seen five times before the very first fetch first result is produced. Recall that f2() returns the value 1 for each of its first five invocations—so, from one point of view, just a single invocation of f2() would be sufficient to produce the first result. The fact that f2() is invoked five times before the first result is seen is consistent with the hypothesis that analysis of the select statement, or of its plan, produced the decision that the entire, already restricted, result set should be cached when the very first fetch or move is done. Of course, then, the result set has just the single row for v = 1. Any subsequent fetch will either get that row—or get no row at all if the current position is that of the single result.
This is the same as Test 5 except that it uses from s.series() without order by instead of accessing the table s.t using an order by.
start transaction;
select s.caption('declare');
declare cur scroll cursor without hold for
select v from s.series() where v = s.f2();
\ir 2-fetch-and-move.sql;
rollback;
These are the SQL results:
1
1
2
3
Notably, they're different from those from Test 5 because of caching differences.
When you look at the spool file, you'll see that first INFO: series() invoked is seen (and never seen again). Then the INFO: f2() invoked output is seen a single time and then the first SQL result is seen. Then the INFO: f2() invoked output is seen a single time again and then the second SQL result is seen. The second SQL result happens to be the same as the first SQL result because s.f2() produces the same result on its second invocation as on its first invocation. And so it goes on.
The critical point here is that, though the row source delivers the same values for v in Test 6 and Test 5, the final SQL results are different. These results are consistent with the hypothesis that analysis of the select statement or of its plan produced the decision that the raw row source results should be cached when the very first fetch or move is done and that, thereafter, every fetch or move applies the where v = s.f2() restriction to the results that are retrieved from the cache. The deliberately contrived pattern of f2()'s volatile production of return values explains the rest of what the spool file shows.
Because f1() and f2() were designed deliberately to be volatile, the different ways that caching is done lead to genuine semantic differences in the results. The PG doc (in the Cursors subsection of the PL/pgSQL chapter HERE) has this caution:
use NO SCROLL with a query that involves volatile functions. The implementation of SCROLL assumes that re-reading the query's output will give consistent results, which a volatile function might not do.
This caveat, together with the observed results, leads to this conclusion:
The user has no reliable way to predict, in general, when and how result-set caching will be done. For example: the raw values from the row source might be cached; or the results after the use of select list or where clause functions might be cached. The results from a scrollable cursor whose defining subquery involves a volatile function must therefore be regarded as unpredictable and, because of that, unreliable.
This, in turn, suggests that YB need not try to mimic PG's whimsical caching rules. On the other hand, if this outcome always, or even just sometimes, emerges from the direct re-use of the PG code, then this cannot be considered to be harmful.
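The PG doc's caution is easy to demonstrate with a self-contained sketch that uses random() as the volatile function (a hypothetical example, not one of the attached tests):

```sql
-- Re-reading the output of a scrollable cursor whose defining subquery
-- invokes a volatile function may, or may not, reproduce the first pass.
start transaction;
declare c scroll cursor without hold for
  select g.v, random() as r from generate_series(1, 3) as g(v);
fetch forward all from c;
fetch backward all from c;  -- the values of r may differ from the first pass
rollback;
-- Declaring the cursor "no scroll" instead disallows the backward fetch
-- and so removes the hazard.
```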
Try this in top-level SQL, by way of revision:
drop schema if exists s cascade;
create schema s;
create table s.t(k serial primary key, v int not null);
insert into s.t(v) select generate_series(0, 99, 5);
start transaction;
declare cur cursor without hold for
select t.k, t.v
from s.t
where t.k between 6 and 11
order by t.k;
fetch first from cur;
rollback;
It causes the 0A000 error:
FETCH FIRST not supported yet
Replace fetch first from cur; with fetch last from cur; — the outcome is the same error but now with this message:
FETCH LAST not supported yet
This is the hint for both:
See https://github.com/yugabyte/yugabyte-db/issues/6514. React with thumbs up to raise its priority
Now try the same thing using a PL/pgSQL function:
create function s.f(k_lo in int, k_hi in int)
returns table(k int, v int)
set search_path = pg_catalog, pg_temp
language plpgsql
as $body$
declare
cur refcursor not null := 'cur';
begin
open cur for (
select t.k, t.v
from s.t
where t.k between k_lo and k_hi
order by t.k);
fetch first from cur into k, v; return next;
fetch last from cur into k, v; return next;
close cur;
end;
$body$;
select k, v from s.f(6, 11);
It all works without error and produces this result:
k | v
----+----
6 | 25
11 | 50
This inconsistency in behavior is beyond belief.
Jira Link: DB-4854
The following FETCH options are not yet supported.