This adds a durable saga that writes to SpiceDB and Kube, with the goal of ensuring that a write happens in both, or neither, but never just one.
There are two methods of writing implemented: a pessimistic lock that prevents other requests from attempting to create the same object at the same time, and an optimistic lock that detects conflicts and rolls back or forward as needed.
Pessimistic outline:

- A `create namespace foo` call comes in from `user:evan`.
- Compute a workflow hash, e.g. `xxhash(create, namespace, foo)`.
- Write two tuples to SpiceDB in one transaction: a lock tuple `workflow:xxhash(create,namespace,foo)#id@workflow_id:caca56e8-388b-46ca-bf2a-7fe325defe68` and the relationship tuple `namespace:foo#creator@user:evan`, with the precondition `operation: OPERATION_MUST_NOT_MATCH`, `filter: workflow:xxhash(create,namespace,foo)#id@workflow_id:*`, so that only one workflow can hold the lock at a time.
- If the SpiceDB write fails, fail the workflow (return an error to the user if not async).
- In a loop:
  - Attempt to write to Kube.
  - If the write succeeds, remove the workflow lock tuple and return success to the user.
  - If the Kube response is `IsAlreadyExists`, remove the workflow lock tuple and return success to the user (the lock tuple ensures no one else did this write, assuming all traffic goes through the proxy).
  - If the Kube response is any other error, remove both tuples and return the error to the user.
  - If there is some other error where the Kube response can't be retrieved, continue the loop.
Optimistic outline:

- A `create namespace foo` call comes in from `user:evan`.
- Write the record to SpiceDB.
  - If the SpiceDB write fails, fail the request. The client can retry / fix the error, and no data has been written to either store.
  - If the SpiceDB write succeeds but the proxy sees the step as failed (e.g. because the process failed), the write is rolled back and an error is returned to the user to try again.
- Write the record to Kube.
  - If the Kube write fails, check whether the object already exists in Kube:
    - If so, the work is done.
    - If not, revert the SpiceDB write.
There are pros and cons to each approach; for now both are supported, and we can configure them per request type or per instance of the proxy.
The durability of this function means that inputs, outputs, and progress state are stored in a SQLite database. The goal is to be robust to service failures (SpiceDB and the Kube API) and process failures (network dies, process crashes and restarts).
The tests make use of failpoints to inject faults at specific places, and then verify that either both writes effectively happened, or neither did.
This initial implementation just deals with `namespace` objects but should be fairly straightforward to make generic for other types. I'm assuming we'll spend time on that in #6.
Closes https://github.com/authzed/spicedb-kubeapi-proxy/issues/3