Open sunshowers opened 9 months ago
Hello, I am Blathers. I am here to help you get the issue triaged.
Hoot - a bug! Though bugs are the bane of my existence, rest assured the wretched thing will get the best of care here.
I have CC'd a few people who may be able to assist you:
If we have not gotten back to your issue within a few business days, you can try the following:
:owl: Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
We hit this because with the Rust Diesel ORM, the only way to express such a query appears to be with style 1.
For anyone searching for this in the future -- turns out this is not quite correct because Diesel's FilterDsl
trait is implemented for these kinds of statements, even though it doesn't show up in the documentation.
Interestingly, Postgres itself has the same issue, where the WHERE clause is silently ignored with the above queries. However, Postgres does error out for non-existent columns.
Thanks for opening an issue!
(Though the Postgres docs say in style 1, the WHERE only takes an index_predicate, which I'm not sure what the difference is)
My understanding of this WHERE
before the DO UPDATE
is that it is not the same as a normal WHERE
, and is only used to pick a partial unique index to use as the arbiter index. (I think it's supposed to match the WHERE
that would be in the CREATE UNIQUE INDEX ... (...) WHERE ...
statement of a partial unique index.)
Looking at the plan of your insert, we can see that the ON CONFLICT (id) WHERE data = 2
is causing us to pick the primary index as the arbiter, and we're ignoring the WHERE data = 2
since there is no partial unique index with this predicate.
demo@127.0.0.1:26257/demoapp/defaultdb> EXPLAIN INSERT INTO foo (id, data) VALUES ('fe0abe6b-f07a-43c4-a82f-9258a6ed89cc', 1) ON CONFLICT (id) WHERE data = 2 DO UPDATE SET data = 3;
info
----------------------------------------------------------------------------------------------------------------
distribution: local
vectorized: true
• upsert
│ into: foo(id, data)
│ auto commit
│ arbiter indexes: foo_pkey <- picking primary index as arbiter
│
└── • render
│
└── • render
│
└── • cross join (left outer)
│ estimated row count: 1
│
├── • values
│ size: 2 columns, 1 row
│
└── • scan
estimated row count: 1 (100% of the table; stats collected 7 minutes ago)
table: foo@foo_pkey
spans: [/'fe0abe6b-f07a-43c4-a82f-9258a6ed89cc' - /'fe0abe6b-f07a-43c4-a82f-9258a6ed89cc']
locking strength: for update
If we add WHERE foo.data = 2
as a normal WHERE
clause at the end, it does show up in the plan:
demo@127.0.0.1:26257/demoapp/defaultdb> EXPLAIN INSERT INTO foo (id, data) VALUES ('fe0abe6b-f07a-43c4-a82f-9258a6ed89cc', 1) ON CONFLICT (id) WHERE data = 2 DO UPDATE SET data = 3 WHERE foo.data = 2;
info
--------------------------------------------------------------------------------------------------------------------
distribution: local
vectorized: true
• upsert
│ into: foo(id, data)
│ auto commit
│ arbiter indexes: foo_pkey
│
└── • render
│
└── • render
│
└── • filter
│ estimated row count: 1
│ filter: (id IS NULL) OR (data = 2) <- filter on WHERE foo.data = 2 predicate
│
└── • cross join (left outer)
│ estimated row count: 1
│
├── • values
│ size: 2 columns, 1 row
│
└── • scan
estimated row count: 1 (100% of the table; stats collected 9 minutes ago)
table: foo@foo_pkey
spans: [/'fe0abe6b-f07a-43c4-a82f-9258a6ed89cc' - /'fe0abe6b-f07a-43c4-a82f-9258a6ed89cc']
If we want to see the first WHERE
in action, we need to add a partial unique index that matches the predicate, and we need to change the row to match that predicate:
CREATE UNIQUE INDEX ON foo (data) WHERE data = 2;
UPDATE foo SET data = 2 WHERE id = 'fe0abe6b-f07a-43c4-a82f-9258a6ed89cc';
Now if we change the row we're inserting to have data=2, and change ON CONFLICT (id) WHERE data = 2
to ON CONFLICT (data) WHERE data = 2
we pick the partial unique index as the arbiter:
demo@127.0.0.1:26257/demoapp/defaultdb> EXPLAIN INSERT INTO foo (id, data) VALUES ('fe0abe6b-f07a-43c4-a82f-9258a6ed89cc', 2) ON CONFLICT (data) WHERE data = 2 DO UPDATE SET data = 3;
info
--------------------------------------------------------------------------------------------------
distribution: local
vectorized: true
• upsert
│ into: foo(id, data)
│ auto commit
│ arbiter indexes: foo_data_key <- using the partial unique index as arbiter
│
└── • render
│
└── • render
│
└── • cross join (left outer)
│ estimated row count: 1
│
├── • values
│ size: 2 columns, 1 row
│
└── • scan
estimated row count: 0 (<0.01% of the table; stats collected 52 seconds ago)
table: foo@foo_data_key (partial index)
spans: FULL SCAN
locking strength: for update
(23 rows)
demo@127.0.0.1:26257/demoapp/defaultdb> INSERT INTO foo (id, data) VALUES ('fe0abe6b-f07a-43c4-a82f-9258a6ed89cc', 2) ON CONFLICT (data) WHERE data = 2 DO UPDATE SET data = 3;
INSERT 0 1
demo@127.0.0.1:26257/demoapp/defaultdb> SELECT * FROM foo;
id | data
---------------------------------------+-------
fe0abe6b-f07a-43c4-a82f-9258a6ed89cc | 3
(1 row)
It does seem like we should throw an error if we're not finding a partial unique index matching the predicate as an arbiter.
Thank you for explaining the difference!
Based on what you said, yeah, I think throwing an error would be ideal. (I wonder how many bugs around this are lurking in various systems.)
It does seem like we should throw an error if we're not finding a partial unique index matching the predicate as an arbiter.
I believe the lack of an error is the intended behavior. The reason CRDB and Postgres don't throw errors is because the primary index is a valid arbiter that can detect conflicts for the INSERT .. ON CONFLICT .. WHERE
statement. A unique, non-partial index holds columns unique over all subsets of rows, so it can be used as an arbiter for an INSERT .. ON CONFLICT ..
with any WHERE
clause (even WHERE false
!).
So this is something to keep in mind with an INSERT .. ON CONFLICT .. WHERE
statement. It won't look for arbiter indexes with partial index predicates that exactly match the insert's WHERE
clause. Instead, it will look for arbiter indexes with partial index predicates that are implied by the insert's WHERE
clause, or, in other words, that are guaranteed to contain all the rows matching the insert's WHERE
clause. In this case, the primary index is essentially a "partial index" with a predicate WHERE true
which is implied by all possible expressions.
I do agree that we should error in the case where a column in the insert's WHERE
clause does not exist. That's a bug.
Quick follow-up:
The Postgres docs make the behavior clear:
index_predicate — Used to allow inference of partial unique indexes. Any indexes that satisfy the predicate (which need not actually be partial indexes) can be inferred. Follows CREATE INDEX format. SELECT privilege on any column appearing within index_predicate is required.
Our docs are not clear on this. In fact, I don't even see a mention of this ON CONFLICT .. WHERE
variant of INSERT
. I've created a docs issue for this: https://github.com/cockroachdb/docs/issues/18292
It won't look for arbiter indexes with partial index predicates that exactly match the insert's WHERE clause. Instead, it will look for arbiter indexes with partial index predicates that are implied by the insert's WHERE clause, or, in other words, that are guaranteed to contain all the rows matching the insert's WHERE clause. In this cause, the primary index is essentially a "partial index" with a predicate WHERE true which is implied by all possible expressions.
Ah, ok, this explains why we weren't picking an index created with CREATE UNIQUE INDEX ON foo (id) WHERE data = 2
in my experiments! Thanks!
Ah, ok, this explains why we weren't picking an index created with CREATE UNIQUE INDEX ON foo (id) WHERE data = 2 in my experiments! Thanks!
Exactly. There's a special case "fast path" where if we have a non-partial index for the ON CONFLICT
columns then we use only that index as the arbiter because it's all we need to catch conflicts across all subsets of rows. If we have only partial unique indexes, we use all that match to ensure that we catch all conflicts across all subsets that are held to uniqueness of the columns. That's explained more here: https://github.com/cockroachdb/cockroach/blob/fb58047a7315330ed844340b6889ab42445eb03e/pkg/sql/opt/optbuilder/mutation_builder_arbiter.go#L129-L153
I'm definitely not an expert around Postgres, but from my perspective, not erroring out or even producing a warning for an ignored clause is pretty rough. It's also been completely new and surprising to my coworkers. (But I'll defer to y'all of course.)
I don't thinks it's accurate to say that the WHERE
clause is ignored, but I agree this is dreadfully confusing. The WHERE
clause here is never used to filter rows. It's only used to allow the system to infer arbiter indexes which can detect conflicts. This example shows how it's used:
CREATE TABLE t (
i INT,
b BOOL
);
CREATE UNIQUE INDEX i1 ON t(i) WHERE b;
CREATE UNIQUE INDEX i2 ON t(i) WHERE NOT b;
INSERT INTO t VALUES (1, true);
-- This INSERT succeeds. With the "WHERE b" clause the system infers i1 as an
-- arbiter to detect and avoid conflicts, so the conflict of (1, true) is
-- detected and an UPDATE is performed instead of an INSERT. The (1, false) row
-- does not conflict, so an INSERT is performed.
INSERT INTO t VALUES (1, true), (1, false) ON CONFLICT (i) WHERE b DO UPDATE SET i = 11;
SELECT * FROM t;
-- i | b
-- -----+----
-- 11 | t
-- 1 | f
-- (2 rows)
-- If we perform a similar INSERT again, it fails. The conflict of (1, false) is
-- not detected because "WHERE b" does not allow i2 to be used as an arbiter
-- ("WHERE b" does not imply "WHERE NOT b"). As a result, the system attempts to
-- INSERT (1, false), which fails.
INSERT INTO t VALUES (1, true), (1, false) ON CONFLICT (i) WHERE b DO UPDATE SET i = 111;
-- ERROR: duplicate key value violates unique constraint "i2"
-- SQLSTATE: 23505
-- DETAIL: Key (i)=(1) already exists.
-- CONSTRAINT: i2
There's added strangeness when you use this WHERE
clause and a non-partial unique index exists:
DROP TABLE IF EXISTS t;
CREATE TABLE t (
i INT,
b BOOL
);
CREATE UNIQUE INDEX i1 ON t(i);
CREATE UNIQUE INDEX i2 ON t(i) WHERE b;
INSERT INTO t VALUES (1, false);
-- This INSERT succeeds. The system infers the non-partial unique index i1 as
-- the arbiter, so a conflict is detected and an UPDATE is performed instead of
-- an INSERT. We might reasonably expect this to only detect conflicts when the
-- INSERT's value for b is true and for the INSERT to fail.
INSERT INTO t VALUES (1, false) ON CONFLICT (i) WHERE b DO UPDATE SET i = 11;
SELECT * FROM t;
-- i | b
-- ----+---
-- 11 | f
-- (1 row)
I can't provide a very reasonable argument for why the INSERT
in the example above succeeds. The strongest argument I can see is that:
WHERE
clause in the INSERT
implies the unique index's WHERE
clause.i1
can be thought of as a partial index with the predicate WHERE true
.WHERE b
, imply WHERE true
.i1
can be inferred as an arbiter index.This matches the quote from Postgres's documentation in my earlier response:
index_predicate — Used to allow inference of partial unique indexes. Any indexes that satisfy the predicate (which need not actually be partial indexes) can be inferred.
But this justification sounds like an implementation detail. It doesn't appear to match the declarative nature of SQL. It feels more like we're telling the system how to do something than telling the system what the logical outcome should be.
Perhaps this behavior is more accidental than intentional, and Postgres (and now CockroachDB) are bound to maintain it. I can imagine that with the addition of partial indexes there was a new need to be able to differentiate between conflicts of multiple partial indexes. This method of inference with a WHERE
clause was probably chosen because it closely matches how the system determines if a partial index can be used to satisfy a query. And the behavior in the presence on non-partial index is perhaps an unintended side-effect.
We could add a WARNING
notice if an INSERT .. ON CONFLICT
has a WHERE
clause and a non-partial arbiter is selected. I think in that case it's reasonable to assume that the user's expected behavior and the statement's actual behavior may not match.
Having spent more time experimenting, I agree with what you said. I think, at the very least, producing a warning if ON CONFLICT ... WHERE
is specified with a full index would be great.
FWIW I wrote up my findings here. We've made it so that we no longer permit ON CONFLICT ... WHERE
generally, and for partial indexes we use WHERE false
to make crdb automatically determine the right bounds. We don't have any need for more specific bounds than the one specified on the index, so it works out.
Thank you for sharing your notes. I learned quite a few things!
It's interesting that the WHERE false
trick does not work in Postgres. My guess is that their "implication prover" doesn't understand that false
implies everything because it's useless in practice when proving whether or not a partial index can be used in a SELECT
. If the filter is false
then there's no need to scan any index - the result set is always empty.
I do have some bad news, unfortunately. The WHERE false
trick will not work in CockroachDB if:
INSERT
has ON CONFLICT .. DO UPDATE
form (DO NOTHING
is fine).You'll run into the limitation described in https://github.com/cockroachdb/cockroach/issues/53170 and the insert will result in an error. For example:
CREATE TABLE t (
i INT,
j INT,
b BOOL
);
CREATE UNIQUE INDEX i1 ON t(i, j) WHERE NOT b;
CREATE UNIQUE INDEX i2 ON t(j, i) WHERE b;
INSERT INTO t VALUES (1, 2) ON CONFLICT (i, j) WHERE false DO UPDATE SET i = 11;
-- ERROR: unimplemented: there are multiple unique or exclusion constraints matching the ON CONFLICT specification
-- SQLSTATE: 0A000
-- HINT: You have attempted to use a feature that is not yet implemented.
-- See: https://go.crdb.dev/issue-v/53170/v23.1
-- If we add a non-partial unique index, the INSERT succeeds because the
-- non-partial index is selected as the sole arbiter.
CREATE UNIQUE INDEX i0 ON t(j, i);
INSERT INTO t VALUES (1, 2) ON CONFLICT (i, j) WHERE false DO UPDATE SET i = 11;
-- INSERT 0 1
There is some comfort in the fact that you'll always run into this error given the conditions above, regardless of the insert values. So I think you'd likely catch this during development or testing, as long as you have coverage of these types of INSERT
s.
Thank you so much for the further investigation -- this is great additional context that I should add to those notes.
I do consider the ON CONFLICT ... WHERE false
thing, while clever, to be a hack that I'd like to get rid of long-term. A warning (maybe being able to opt-in to making it an error) if a ON CONFLICT ... WHERE
clause is passed in with a full index would probably get us most of the way there. There could also be some kind of basic relevance check for partial indexes, though the exact details may have to be worked out.
We do test all of our SQL flows against a real CockroachDB instance, so hopefully future changes in the semantics will be caught early.
Out of curiosity, do you use a SAT or SMT solver for implication proving?
I've adjusted the title of this issue to be focused on a warning or guard against misuse with ON CONFLICT
and partial indexes so that all the context in this thread remains. I've created a new issue to track the bug that allows invalid references in WHERE
expressions: #119389.
Out of curiosity, do you use a SAT or SMT solver for implication proving?
No, we do something simpler. You can find more details here.
I have little knowledge of SAT and SMT solvers, but I believe that they are more robust than our implication proving and have greater time/space complexity. We can tolerate the occasional false negative—in the worst case this results in a suboptimal query plan, not incorrect results. Speed is critical because we must prove implication N = num_partial_indexes_in_referenced_tables * num_query_plan_transformations_that_could_use_a_partial_index
times during optimization-time of a query.
Thanks for the reference! That was an interesting read.
While SAT is theoretically NP-complete, and many of the theories of SMT solvers are undecidable -- for typical queries, they've been heavily optimized and tend to be reasonably fast. They also have a notion of "resource limits" (abstract instruction limits), so you can timeout (and produce a negative) if a query takes too long.
This is based on the SMT solver I've used the most, Z3. On my machine (Ryzen 7950x, Linux 6.6.10), this simple script using Z3Py:
#!/usr/bin/env python
from z3 import *
import time
start = time.monotonic()
x = Int('x')
y = Int('y')
s = Solver()
s.add(Not(Implies(And(x > 5, y > 2), x > 10)))
r = s.check()
print(r)
print(s.statistics())
end = time.monotonic()
print("Time:", end - start)
shows:
% ./foo.py
sat
(:elim-unconstrained 2
:max-memory 29.21
:memory 20.51
:num-allocs 1057327
:rlimit-count 313
:sat-decisions 4
:sat-mk-clause-2ary 2
:sat-mk-var 4
:solve-eqs-elim-vars 1
:solve-eqs-steps 1
:time 0.01)
Time: 0.016529261134564877
So that took around 16ms to execute (and some of that would just be Python overhead).
Anyway, this is out of scope for the immediate issue, just thought it might be interesting. And thanks again for being open to suggestions here, it's really appreciated.
Describe the problem
Postgres offers two ways to express conditional upserts with
INSERT INTO
:ON CONFLICT ... WHERE ... DO UPDATE
ON CONFLICT ... DO UPDATE ... WHERE
(Though the Postgres docs say in style 1, the
WHERE
only takes anindex_predicate
, which I'm not sure what the difference is)Most users use style 2, which works fine. However, with 1, the condition in the
WHERE
clause is silently dropped.We hit this because with the Rust Diesel ORM, the only way to express such a query appears to be with style 1.
We're tracking this in https://github.com/oxidecomputer/omicron/issues/5047.
To Reproduce
Start up a single-node cluster:
Then, connect to it with
cockroach sql
in another terminal and issue the following commands:The output of the
SELECT
command is:It is even possible to pass in a column that does not exist:
Expected behavior
I expected to either see
data = 1
for the row, or for the query to be rejected.Environment:
cockroach sql
, as documented aboveJira issue: CRDB-36038