Open olavloite opened 3 years ago
Thanks for pointing this out. We are looking into the issue.
@Vizerai Any updates? We are running into this as well
I'm facing the same issue. any update?
We're hitting the same issue and it makes it really hard to work with & trust the emulator in our tests.
The Cloud Spanner Emulator team is currently working on the fix of this feature, and they reprioritized it internally to try to to get an initial implementation for this feature in Q4 2022 (approximate ETA, it can take a bit longer), and possibly add it to the next emulator release. More updates on how to use this feature in the emulator will be shared once the fix is released. Thank you for your understanding.
You can keep track of this fix in the Public Issue Tracker page. If you click on the "star", you'll receive an email as soon as there's any change published.
Hi, looks like this been fixed now: https://issuetracker.google.com/issues/231624384#comment6 ~Verified it, and it works for me now. Can probably close this issue?~ Nevermind, it's just a workaround that disables the check altogether.
Added a comment in the issue tracker too: https://issuetracker.google.com/issues/231624384#comment7
Hi, looks like this been fixed now: https://issuetracker.google.com/issues/231624384#comment6 ~Verified it, and it works for me now. Can probably close this issue?~ Nevermind, it's just a workaround that disables the check altogether.
Added a comment in the issue tracker too: https://issuetracker.google.com/issues/231624384#comment7
Can't go back and try the workaround (https://issuetracker.google.com/issues/231624384#comment5). Afraid of missing so many features and fixes.
Is there any ETA on this?
Okay, I've figured that out.
The root cause of the issue is the fact that internal reads made by ZetaSQL engine during DML query evaluation used the same ReadWriteTransaction::Read
method which throws when there is an attempt to read a column with pending commit timestamp.
I came up with a fix for this in our fork here: commit. It may not cover all the possible cases but it now allows execution of multiple DML queries (inserts/updates) with pending commit timestamp within the same transaction.
If folks here are interested, we have also built the emulator with this fix (and some others), here are the release notes with a docker container (linux/amd64
only) link in Docker Hub: https://github.com/nutshelllabs/cloud-spanner-emulator/releases/tag/v1.5.14.5
Not sure who I need to tag here from Google team (@arunmoezhi ?), but I can open a PR (even though you're not accepting contributions directly) with the same commit separately if you guys are interested in incorporating (or in getting an inspiration of) the change into the future release
Not sure who I need to tag here from Google team (@arunmoezhi ?), but I can open a PR (even though you're not accepting contributions directly) with the same commit separately if you guys are interested in incorporating (or in getting an inspiration of) the change into the future release
@kberezin-nshl Thanks for the suggested fix. I'll relay it to the people working on this part of the emulator. One of the reasons that we cannot accept contributions directly, is that the code here on GitHub is just an export from an internal repository. But the commit itself should be a good source of inspiration.
@kberezin-nshl Just a quick response from the development team working on this feature: The suggested fix would disable all checks for reading columns that might have been updated with pending_commit_timestamp()
. That means that the emulator would be more permissive than Cloud Spanner. It is of course a fine workaround for the specific problem, but you should be aware that with this patch you could potentially have tests that succeed on the emulator, but that fail on Cloud Spanner.
@olavloite thank you for the feedback.
The suggested fix would disable all checks for reading columns that might have been updated with pending_commit_timestamp()
That is not exactly true. The suggested fix disables them only during DML statements execution (INSERT or UPDATE queries). SELECT queries and READ requests will still do the checks.
That means that the emulator would be more permissive than Cloud Spanner.
That might be the case, yes, but I think I have checked most cases locally and, even though you potentially can use column which have pending_commit_timestamp()
set in your DML queries, you still won't be able to actually read the results back (by using THEN RETURN
statement or via direct read). Obviously, I am not 100% confident what the real Spanner's behavior is in all circumstances, but we felt like it was good enough for us.
Here are the tests I ran against real Spanner and my patched version (obviously, within the same transaction):
INSERT INTO test(id, modified_timestamp, another_timestamp)
VALUES (2, PENDING_COMMIT_TIMESTAMP(), CURRENT_TIMESTAMP());
-- this is ok (even though modified_timestamp is set to pending)
UPDATE test SET another_timestamp = modified_timestamp WHERE id = 2;
-- this will obviously fail
SELECT * FROM test WHERE id = 2;
-- this will fail too, if you try to read the returned cursor
INSERT INTO test(id, modified_timestamp)
VALUES (999, PENDING_COMMIT_TIMESTAMP())
THEN RETURN modified_timestamp;
Is there an ETA for when a fix for this might be released?
@olavloite hey, did dev team have a chance to reconsider inclusion of this fix internally?
@olavloite thanks for bringing in the update to cover this issue in 1.5.17
, I tested that locally and I can confirm it fixes it for multiple INSERT
s in a transaction, but unfortunately it still fails if you have multiple UPDATE
s.
Pseudo-code:
// This transaction below works now
spanner.executeInTransaction(
"""
INSERT INTO test_with_timestamp(id, data, modified_timestamp)
VALUES(100, 'foo', PENDING_COMMIT_TIMESTAMP())
THEN RETURN data;
INSERT INTO test_with_timestamp(id, data, modified_timestamp)
VALUES(200, 'foo', PENDING_COMMIT_TIMESTAMP())
THEN RETURN data;
"""
);
// but this does not
spanner.executeInTransaction(
"""
-- this statement is OK
UPDATE test_with_timestamp
SET data = 'bar', modified_timestamp = PENDING_COMMIT_TIMESTAMP()
WHERE id = 100
THEN RETURN data;
-- statement below fails
UPDATE test_with_timestamp
SET data = 'bar', modified_timestamp = PENDING_COMMIT_TIMESTAMP()
WHERE id = 200
THEN RETURN data;
"""
);
@kberezin-nshl can you share the schema you are testing with? Your examples execute successfully for me on 1.5.17 using the following definition of test_with_timestamp
,
CREATE TABLE test_with_timestamp(
id int64,
data string(max),
modified_timestamp timestamp OPTIONS (allow_commit_timestamp = TRUE)
) PRIMARY KEY(id);
Do you perhaps have an index on modified_timestamp
? If so, an error would be expected from the second UPDATE
. This is intentional and consistent with Spanner.
@c2nes hey, sorry for the late reply. I had to play around with that to figure out what causing the failure on my side, because indeed my schema was slightly different.
Basically the root cause is an additional STORED
column which references any other column.
So this schema should fail:
CREATE TABLE test_with_timestamp(
id int64,
data string(max),
modified_timestamp timestamp OPTIONS (allow_commit_timestamp = TRUE),
original INT64 DEFAULT (100),
original_plus_one INT64 AS (original + 1) STORED, -- <<-- this fails
one_plus_one INT64 AS (1 + 1) STORED, -- <<-- this does not
) PRIMARY KEY(id);
as soon as I removed STORED
column, I can confirm that multiple updates in a row execute successfully.
Thanks @kberezin-nshl I'll take a look
@kberezin-nshl thanks for you patience. I was able to reproduce your issue and have implemented a fix which should be included in the next emulator release.
@kberezin-nshl the latest 1.5.19 emulator release includes the relevant fix. Please let me know if you still see any commit timestamp related issues.
@c2nes great! I couldn't find any issues with the latest releases, thank you!
I'm still hitting this issue with the latest release except with indexes.
The create table works fine:
CREATE TABLE Tasks (
TaskId STRING(36) NOT NULL DEFAULT (GENERATE_UUID()),
CreatedAt TIMESTAMP NOT NULL OPTIONS (allow_commit_timestamp = true),
UpdatedAt TIMESTAMP NOT NULL OPTIONS (allow_commit_timestamp = true),
ExecuteAt TIMESTAMP NOT NULL,
AttemptedAt TIMESTAMP,
ExpiresAt TIMESTAMP AS (
IF
(AttemptedAt IS NOT NULL,
IF
(State = "TASK_STATE_EXECUTED", TIMESTAMP_ADD(AttemptedAt, INTERVAL 7 DAY),
IF
(State = "TASK_STATE_FAILED", TIMESTAMP_ADD(AttemptedAt, INTERVAL 90 DAY), NULL)), NULL)) STORED,
Executor JSON,
SpanContext JSON,
State STRING(MAX) NOT NULL DEFAULT("TASK_STATE_PENDING"),
ExternalId String(Max),
Attributes JSON)
PRIMARY KEY (TaskId),
ROW DELETION POLICY (OLDER_THAN(ExpiresAt, INTERVAL 0 DAY));
As soon as I try to create an index I get Column CreatedAt cannot be accessed because it, or its associated index, has a pending CommitTimestamp
CREATE INDEX TasksByExecuteAt
ON Tasks(TaskId, ExecuteAt DESC) STORING (
CreatedAt,
UpdatedAt,
AttemptedAt,
ExpiresAt,
Executor,
SpanContext,
ExternalId,
State,
Attributes);
I have a table with a
TIMESTAMP
column withOPTIONS (allow_commit_timestamp=true)
. If I try to execute twoINSERT
statements on this table with aPENDING_COMMIT_TIMESTAMP()
value for the commit timestamp column, the second statement will fail with aINVALID_ARGUMENT: Column LastModified cannot be accessed because it, or its associated index, has a pending CommitTimestamp
error message.Example code that reproduces the problem:
The same behavior is also observed if the two statements are sent as one
BatchDML
request.The same operation does work on a real Spanner instance.