gchq / sleeper

A cloud-native, serverless, scalable, cheap key-value store
Apache License 2.0

File references DynamoDB table to be updated based on transactions #3586

Open patchwork01 opened 3 weeks ago

patchwork01 commented 3 weeks ago

Background

Split from:

Description

We'd like a DynamoDB table that can handle the file reference queries needed against the state store, but can be updated based on transactions.

We will follow up with a lambda that uses this code to update the table.

Analysis

We can create a DynamoDB table of file references, with a hash key of Sleeper table ID and partition ID, and a range key of transaction number.
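As a rough illustration of the key layout described above, here is a minimal sketch in plain Java. The attribute names, the `/` delimiter and the `FileReferenceKey` type are all assumptions made for illustration; nothing here is taken from the Sleeper codebase.

```java
import java.util.Map;

// Hypothetical key layout for the file references table.
// Attribute names and the "/" delimiter are assumptions, not Sleeper's actual schema.
record FileReferenceKey(String sleeperTableId, String partitionId, long transactionNumber) {
    static final String HASH_KEY = "TABLE_ID_AND_PARTITION";
    static final String RANGE_KEY = "TRANSACTION_NUMBER";

    // Hash key value: Sleeper table ID and partition ID combined into one string.
    String hashKeyValue() {
        return sleeperTableId + "/" + partitionId;
    }

    // The key attributes as they might appear on an item.
    Map<String, Object> toItemKey() {
        return Map.of(HASH_KEY, hashKeyValue(), RANGE_KEY, transactionNumber);
    }
}
```

Combining the table ID and partition ID in the hash key would let all references for one partition be read with a single query, ordered by transaction number.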

We can create a class that wraps this table and implements the operations needed for the state store.

To replicate transaction isolation, we could query the table as it stood at a certain time.
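One possible reading, which is an assumption since the issue does not say how "time" would be represented, is to treat the point in time as a transaction number and bound the range key by it. The sketch below models a single partition in memory with a `TreeMap` rather than calling DynamoDB; `PartitionFileReferences` and its methods are hypothetical names.

```java
import java.util.List;
import java.util.TreeMap;

// In-memory stand-in for one hash key (one Sleeper table and partition).
// Hypothetical: models the range key as a sorted map of transaction number to filename.
class PartitionFileReferences {
    private final TreeMap<Long, String> filenameByTransaction = new TreeMap<>();

    void put(long transactionNumber, String filename) {
        filenameByTransaction.put(transactionNumber, filename);
    }

    // Returns files written at or before the given transaction number,
    // so later transactions are invisible to the reader.
    List<String> filesAsOf(long transactionNumber) {
        return List.copyOf(filenameByTransaction.headMap(transactionNumber, true).values());
    }
}
```

Against a real table, the equivalent would presumably be a query with an upper bound on the range key. This sketch also ignores references that a later transaction removes, which a real implementation would need to handle.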

We need to ensure the transactions are applied to the table in the same order as they were applied to the transaction log. We still need to check how we would do this.
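One way this ordering could be enforced, offered only as a sketch, is to track the last applied transaction number and refuse anything that is not the next one. The class below is an in-memory stand-in with hypothetical names. It assumes transaction numbers start at 1 and increase by 1. In DynamoDB the same check would likely need a conditional write so that it holds under concurrent updates.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory stand-in for the table updater; all names are hypothetical.
class OrderedTransactionApplier {
    private long lastApplied = 0; // assumes transaction numbers start at 1
    private final List<String> appliedDescriptions = new ArrayList<>();

    // Applies the transaction only if it is the next one in the log.
    boolean applyIfNext(long transactionNumber, String description) {
        if (transactionNumber != lastApplied + 1) {
            return false; // out of order or already applied; nothing is changed
        }
        appliedDescriptions.add(description);
        lastApplied = transactionNumber;
        return true;
    }

    long lastApplied() {
        return lastApplied;
    }

    List<String> applied() {
        return List.copyOf(appliedDescriptions);
    }
}
```

A caller that receives `false` could re-read the log from `lastApplied() + 1` and retry, which also makes re-delivery of an already applied transaction harmless.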

We could handle wiring this into the state store in a separate issue, as we may need to adjust the interfaces. This will only be relevant for the TransactionLogFileReferenceStore, so that may be the right place to wire this in. Alternatively, we could start by testing against the transaction log state store implementation.

patchwork01 commented 2 weeks ago

The PR was linked to the wrong issue, so I'm reopening this.