Transactional/ACID semantics

I have a general question in regard to:

Natively supports multiple storage drivers, including Google Cloud Storage, local and network filesystems, in-memory storage.

Support for read/writeback caching and transactions, with strong atomicity, isolation, consistency, and durability (ACID) guarantees.

and this sentence in the Blog:

Safety of parallel operations when many machines are accessing the same dataset is achieved through the use of optimistic concurrency, which maintains compatibility with diverse underlying storage layers (including Cloud storage platforms, such as GCS, as well as local filesystems) without significantly impacting performance. TensorStore also provides strong ACID guarantees for all individual operations executing within a single runtime.

I created a dummy dataset with the zarr + S3 drivers:

2024-04-04 15:33:22        230 ts/yang-test-dataset/.zarray
2024-04-04 16:43:54      48573 ts/yang-test-dataset/0.0.0
2024-04-04 16:43:54      48573 ts/yang-test-dataset/0.0.1
2024-04-04 16:43:54      48573 ts/yang-test-dataset/0.0.2
2024-04-04 16:43:54      48573 ts/yang-test-dataset/0.0.3
2024-04-04 16:43:54      48573 ts/yang-test-dataset/0.0.4
2024-04-04 16:43:54      48573 ts/yang-test-dataset/0.0.5
2024-04-04 16:43:54      48573 ts/yang-test-dataset/0.0.6
2024-04-04 16:43:54      48573 ts/yang-test-dataset/0.0.7
2024-04-04 16:43:54      48573 ts/yang-test-dataset/0.0.8
2024-04-04 16:43:54      48573 ts/yang-test-dataset/0.0.9

and then created a situation where the next write to chunk 0.0.3 would fail. Running under a transaction

with ts.Transaction() as txn:
    result = ds.with_transaction(txn)[80:82, 99:102, :] = [[[1],[2],[3]], [[4], [5], [6]]]

would throw

Traceback (most recent call last):
  File "/home/yang.yang/workspaces/tensorstore/.yang/foo.py", line 33, in <module>
    with ts.Transaction() as txn:
ValueError: PERMISSION_DENIED: Error writing "ts/yang-test-dataset/0.0.3": HTTP response code: 403 with body: <?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>SK6BWG5ESTC2NVJ6</RequestId><HostId>5z3/QZmVne5TyFJUH0A0swSAtyyhsl47I/z7AjULiGmsj1QAtf3JEA6d/TAuWH/ts1xCHJmVucM=</HostId></Error> [source locations='tensorstore/kvstore/s3/s3_key_value_store.cc:777\ntensorstore/kvstore/kvstore.cc:373'

but the S3 bucket after this operation looks like this:

2024-04-04 15:33:22        230 ts/yang-test-dataset/.zarray
2024-04-04 17:14:57      48573 ts/yang-test-dataset/0.0.0
2024-04-04 17:14:58      48573 ts/yang-test-dataset/0.0.1
2024-04-04 17:14:58      48573 ts/yang-test-dataset/0.0.2
2024-04-04 16:43:54      48573 ts/yang-test-dataset/0.0.3  <--- not updated
2024-04-04 17:14:57      48573 ts/yang-test-dataset/0.0.4
2024-04-04 17:14:58      48573 ts/yang-test-dataset/0.0.5
2024-04-04 17:14:57      48573 ts/yang-test-dataset/0.0.6
2024-04-04 17:14:58      48573 ts/yang-test-dataset/0.0.7
2024-04-04 17:14:57      48573 ts/yang-test-dataset/0.0.8
2024-04-04 17:14:57      48573 ts/yang-test-dataset/0.0.9

So from the perspective of an observer (who may eventually want to load this dataset again), the operation does not appear to be transactional. So when the blog says transactional with a single runtime, do you mean that the process's view of ds when the context manager exits is transactional, but otherwise make no guarantees about the state of the underlying storage?

If one sets

with ts.Transaction(atomic=True) as txn:
    ...

then if a write would span multiple chunks, I see an error

ValueError: Cannot read/write "ts/yang-test-dataset/.zarray" and read/write "ts/yang-test-dataset/0.0.0" as single atomic transaction [source locations='tensorstore/internal/cache/kvs_backed_cache.h:221\ntensorstore/internal/cache/async_cache.cc:660\ntensorstore/internal/cache/async_cache.h:383\ntensorstore/internal/cache/chunk_cache.cc:438\ntensorstore/internal/grid_partition.cc:246\ntensorstore/internal/grid_partition.cc:246\ntensorstore/internal/grid_partition.cc:246']

I'm guessing this is expected since you have no way of performing a transactional write across multiple S3 objects?

Lastly, on the topic of "optimistic concurrency and compatibility with GCS/other storage layers", since AFAIK S3 does not support conditional PUTs the way that GCS does, is there a possibility of data loss when using S3?

Thanks in advance!

google / tensorstore

Transactional/ACID semantics #150