Open danhhz opened 7 years ago
Docs todo: Add note to UPSERT
docs stating that, if you specify all columns and there are no secondary indexes, UPSERT
will be faster than the equivalent INSERT .. ON CONFLICT
(won't read before writing).
Note: This may be interesting to users simulating kv storage via a 2-column SQL table.
Docs note added with https://github.com/cockroachdb/docs/pull/1269.
Hey Matt, this seems relevant to earlier discussion.
Oh nice. Yes I will take this.
Hey @RaduBerinde, this seems optimizer-related. Does it make sense to keep open?
Yeah, this is still true. These two statements are equivalent but have different plans:
> create table abc (a int primary key, b int, c int);
> explain upsert into abc values (1,1,1),(2,2,2);
tree | field | description
------------------------+-------------+--------------------
| distributed | false
| vectorized | false
count | |
└── upsert | |
│ | into | abc(a, b, c)
│ | strategy | opt upserter
│ | auto commit |
└── render | |
└── values | |
| size | 3 columns, 2 rows
(10 rows)
Time: 488.44µs
> explain insert into abc values (1,1,1),(2,2,2) on conflict(a) do update set (b,c)=(excluded.b,excluded.c);
tree | field | description
-----------------------------+--------------------+--------------------
| distributed | false
| vectorized | false
count | |
└── upsert | |
│ | into | abc(a, b, c)
│ | strategy | opt upserter
│ | auto commit |
└── render | |
└── hash-join | |
│ | type | left outer
│ | equality | (column1) = (a)
│ | right cols are key |
├── values | |
│ | size | 3 columns, 2 rows
└── scan | |
| table | abc@primary
| spans | ALL
Supporting an INSERT ON CONFLICT DO NOTHING might make a good starter or intern project. The semantics are similar to UPSERT, but instead of updating the existing row when there is a conflict, INSERT ON CONFLICT DO NOTHING succeeds without performing any writes. To implement this I think we could do something like: CPut with a nil expected value, and if the key already existed swallow the error.
We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB!
Supporting an INSERT ON CONFLICT DO NOTHING might make a good starter or intern project. The semantics are similar to UPSERT, but instead of updating the existing row when there is a conflict, INSERT ON CONFLICT DO NOTHING succeeds without performing any writes. To implement this I think we could do something like: CPut with a nil expected value, and if the key already existed swallow the error.
This does seem like a great first part to bite off. Not promising, but we can at least consider this during 24.1 planning.
marcus=# CREATE TABLE t (k INT PRIMARY KEY, a INT);
CREATE TABLE
marcus=# INSERT INTO t VALUES (1, 10);
INSERT 0 1
marcus=# INSERT INTO t VALUES (2, 20), (1, 11) ON CONFLICT DO NOTHING;
INSERT 0 1
marcus=# SELECT * FROM t;
k | a
---+----
1 | 10
2 | 20
(2 rows)
I think my prior comment was a bit naive. This will be complicated for multi-row inserts on tables with multiple indexes because we'd have to keep track of which KVs correspond to which insert rows, and retroactively delete keys for a row if one of the keys failed to CPut. I'm uneasy about the complexity of that.
Maybe for starters we can limit to single-row inserts and/or tables with only a primary index.
The fast path is currently only enabled when the UPSERT alias is explicitly selected by the user. It's possible to fast path some queries of the form INSERT ... ON CONFLICT, but the utility is low and there are lots of edge cases (that caused real correctness bugs #13437 #13962). As a result, we've decided to remove this until after 1.0 and re-enable it then.
TODO(dan): Once it's merged, link to the PR that turned this off so the code can be resurrected.
Jira issue: CRDB-6104
gz#19028