microsoft / durabletask-java

Java SDK for Durable Functions and the Durable Task Framework
MIT License

Purge instance support #10

Closed cgillum closed 2 years ago

cgillum commented 2 years ago

Requirement

Orchestration clients should be able to purge orchestration data, as described here. Purging single instances and multiple instances using filters should be supported.

One or more integration tests covering this scenario should also be added.

Implementation notes

Note that there's an equivalent work item being tracked to add this for .NET Isolated support. If available, it would be a good reference for the Java work item.

Suggested implementation strategy

The Durable Task sidecar works by making API calls into the Durable Task Framework (DTFx). Unfortunately, DTFx doesn't support a "purge" API as part of its core abstraction. Rather, this was added only to specific DTFx backends. To implement this feature correctly, we need to make purge a first-class feature of DTFx. Here are some suggestions for how to do this:

  1. Add a new public IOrchestrationServicePurgeClient interface to the DurableTask.Core project and define a PurgeInstanceStateAsync method:
using System.Threading.Tasks;

public interface IOrchestrationServicePurgeClient
{
    // Purges the stored state of a single orchestration instance.
    Task<PurgeHistoryResult> PurgeInstanceStateAsync(string instanceId);
}
  2. Update the AzureStorageOrchestrationService class in Azure/durabletask (the same repo) to implement this new interface and map it to the existing PurgeInstanceHistoryAsync method.

  3. Update the SqlOrchestrationService class in microsoft/durabletask-mssql to implement this new interface and map it to the existing PurgeOrchestrationHistoryAsync method.

  4. Update the NetheriteOrchestrationService class in microsoft/durabletask-netherite to implement this new interface and map it to the existing PurgeInstanceHistoryAsync method.

It's not strictly necessary for us to do this work for all backends at once. We can start with (1) and skip ahead to the next section so we can more quickly get to integration testing and validate the design. I suggest we open GitHub issues to separately track the different steps above.

  1. Update the InMemoryOrchestrationService class in the microsoft/durabletask-sidecar repo to implement this new interface. There is no existing method that implements similar functionality, so an implementation will need to be written from scratch. This is an in-memory storage provider so it's fine to just remove the appropriate objects in InMemoryInstanceStore.store. This is the storage provider used by integration tests so this implementation should probably be the first one to work on.

  2. Uncomment and finish implementing the DeleteInstance gRPC definition. For local development, these changes can be done directly in the sub-module under the microsoft/durabletask-sidecar project. This API will need to match the API defined in the new IOrchestrationServicePurgeClient method mentioned above.

  3. Using the updated gRPC definition, add a new method override to the sidecar's TaskHubGrpcServer class. This method will take the IOrchestrationServiceClient object (which should already be accessible) and try to cast it to IOrchestrationServicePurgeClient. If the cast succeeds, invoke the new method. Otherwise, throw a new NotSupportedException with a helpful error message.

  4. In the Java SDK, add two new methods to DurableTaskClient: one for the single-instance purge API, with a matching method signature, and another for the multi-instance purge API. Both implementations should call the new gRPC method that was defined in a previous step. The multi-instance purge API should internally use the new query API to select all the instances that need to be deleted (in pages) and issue individual deletes, ideally in parallel (but with a limit on the amount of parallelism to avoid thread starvation). There should also be a parameter for a timeout or for cancellation, since this operation could be very slow. The exact design should be based on what's most familiar to Java developers. In the future, we can consider optimizing the multi-instance purge API so that the sidecar implements multi-instance purge, but that's not critical at this point.
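To make the multi-instance purge described in the last step more concrete, here is a minimal Java sketch of one possible shape: page through the matching instance IDs, issue individual deletes on a bounded thread pool, and give up when an overall timeout expires. Everything named here is an assumption for illustration only: PurgeSketch, Page, queryPage, and purgeOne are hypothetical stand-ins for the query API and the single-instance gRPC delete, not actual DurableTaskClient members.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;
import java.util.function.Predicate;

public class PurgeSketch {

    /** Hypothetical page of query results; a null continuation token means no more pages. */
    public static final class Page {
        final List<String> instanceIds;
        final String continuationToken;

        public Page(List<String> instanceIds, String continuationToken) {
            this.instanceIds = instanceIds;
            this.continuationToken = continuationToken;
        }
    }

    /**
     * Purges every instance returned by queryPage, one page at a time.
     *
     * @param queryPage      hypothetical stand-in for the query API: continuation token in, page out
     * @param purgeOne       hypothetical stand-in for the single-instance delete; true if data was removed
     * @param maxParallelism upper bound on concurrent delete calls
     * @param timeout        overall time budget for the whole operation
     * @return the number of instances that were actually purged
     */
    public static int purgeInstances(
            Function<String, Page> queryPage,
            Predicate<String> purgeOne,
            int maxParallelism,
            Duration timeout) throws TimeoutException, InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        // A fixed-size pool caps parallelism so a large purge cannot starve other work.
        ExecutorService pool = Executors.newFixedThreadPool(maxParallelism);
        try {
            int purged = 0;
            String token = null;
            do {
                Page page = queryPage.apply(token);
                List<Future<Boolean>> pending = new ArrayList<>();
                for (String id : page.instanceIds) {
                    Callable<Boolean> task = () -> purgeOne.test(id);
                    pending.add(pool.submit(task));
                }
                // Wait for the current page, never longer than the remaining budget.
                for (Future<Boolean> result : pending) {
                    long remainingNanos = Math.max(deadline - System.nanoTime(), 0L);
                    try {
                        if (result.get(remainingNanos, TimeUnit.NANOSECONDS)) {
                            purged++;
                        }
                    } catch (ExecutionException e) {
                        throw new IllegalStateException("Purge request failed", e.getCause());
                    }
                }
                token = page.continuationToken;
            } while (token != null);
            return purged;
        } finally {
            // Interrupts any deletes still running after a timeout or failure.
            pool.shutdownNow();
        }
    }
}
```

This sketch waits for each page to finish before fetching the next one, which keeps memory use bounded at the cost of some idle threads near the end of a page. A checked TimeoutException is only one option; a cancellation token or a CompletableFuture-based API may feel more natural to some Java developers.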