microsoft / durabletask-mssql

Microsoft SQL storage provider for Durable Functions and the Durable Task Framework
MIT License

Always encrypted support #165

Open andsj073 opened 1 year ago

andsj073 commented 1 year ago

Hello

I am wondering if it is already possible to enable Always Encrypted / Column Encryption on the TaskHub database? And if not, if you are considering it?

I have tried to make it work but have so far been unsuccessful, so I guess it is not supported yet.

So what do I want to accomplish and why? I want to turn on SQL Server/Database Column Encryption on the Payloads.Text column of the TaskHub. This is because the Durable Function we are developing will handle highly sensitive data as payload, and we want to protect it with application-level encryption. That ensures that not even the database owners/admins can read the payload data, since they will not also have access to the master key in the Key Vault (i.e. it technically enforces strong segregation of duties).

I tried assigning the Function App of the Durable Function both system-assigned and user-assigned managed identities (which were also enabled as users in the database), with the right role assignment on the Key Vault key, and with these connection string parts: Authentication=Active Directory Managed Identity; Column Encryption Setting=enabled

I also added the NuGet package Microsoft.Data.SqlClient.AlwaysEncrypted.AzureKeyVaultProvider to the Durable Functions project before deployment.

To no avail.

Looking forward to hearing your recommendations and/or whether this will make it to the backlog for consideration. Thank you!

cgillum commented 1 year ago

@andsj073 it's definitely the goal that we support Always Encrypted for exactly the reason you described. I'm sorry to hear that you weren't able to get it working. We'll need to research this more closely and add some documentation providing the right steps, once we figure them out. :) In the meantime, do let us know if you are able to make any progress on getting this scenario working.

andsj073 commented 1 year ago

@cgillum I have almost got this to work now, but I think some piece is missing in the provider implementation.

In my Durable Function implementation I needed to register the AKV provider with the SqlClient library, like this:

using System;
using System.Collections.Generic;
using Azure.Identity;
using Microsoft.Azure.Functions.Extensions.DependencyInjection;
using Microsoft.Data.SqlClient;
using Microsoft.Data.SqlClient.AlwaysEncrypted.AzureKeyVaultProvider;

[assembly: FunctionsStartup(typeof(Company.Function.Startup))]

namespace Company.Function
{
    public class Startup : FunctionsStartup
    {
        public override void Configure(IFunctionsHostBuilder builder)
        {
            Console.WriteLine("FunctionsStartup.Configure");

            // Initialize a token credential for the user-assigned managed identity
            string userAssignedClientId = "REDACTED";
            var credential = new DefaultAzureCredential(new DefaultAzureCredentialOptions { ManagedIdentityClientId = userAssignedClientId });

            // Initialize AKV provider
            SqlColumnEncryptionAzureKeyVaultProvider akvProvider = new SqlColumnEncryptionAzureKeyVaultProvider(credential);

            // Register AKV provider globally (static, process-wide registration)
            SqlConnection.RegisterColumnEncryptionKeyStoreProviders(customProviders: new Dictionary<string, SqlColumnEncryptionKeyStoreProvider>(capacity: 1, comparer: StringComparer.OrdinalIgnoreCase)
                {
                    { SqlColumnEncryptionAzureKeyVaultProvider.ProviderName, akvProvider}
                });
            Console.WriteLine("AKV provider Registered");
        }
    }
}

I also added ;Column Encryption Setting=enabled to the connection string.

I used SSMS to encrypt the Payloads.Text column, storing the master key in Key Vault.

I then set up access for the Function App's service-assigned MI to the Key Vault.

It all works to the point that a NewEvents row is added with a corresponding Instances row and a Payloads row, with the data in the Text column encrypted.

However, after that the orchestration function is never executed successfully, and the NewEvents row remains in the pending state.

My guess is that somewhere in the polling path, where the Durable Task Framework polls the SQL TaskHub for new events, the underlying SqlClient does not have the AKV provider registered and fails each time.

cgillum commented 1 year ago

@andsj073 thanks for the update on this. It's interesting that you were able to get it to work as far as encrypting data, but I'm curious to understand why it doesn't seem to be working for decrypting the data.

Is there any documentation you can point to that explains how this is normally expected to be set up?

andsj073 commented 1 year ago

@cgillum I'm guessing here, but I think it could be a matter of scope/context. When the Function starts or is triggered, the AKV provider for SqlClient is registered (in the Startup.Configure method), and so creating a new instance and event succeeds.

But the polling for new events happens in another scope/context (inside the Durable Functions framework) which, I'm guessing, doesn't have the AKV provider registered, and so it fails.

That could explain why a row with encrypted data is created but then nothing more happens.

cgillum commented 1 year ago

Makes sense. You never really know what the behavior will be when you call a global/static method like SqlConnection.RegisterColumnEncryptionKeyStoreProviders.

Based on what you've discovered, it seems like we may need to create a new API for registering custom column encryption key store providers when using the MSSQL provider. For Azure Functions, we can probably expose this via host.json settings, making it easier to configure. I see you're specifying a specific user-assigned client ID, so that would be one such setting. I assume there may be a few others that some users would want.
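As a point of reference for the API idea above: Microsoft.Data.SqlClient 3.0 and later also offers a per-connection registration method, RegisterColumnEncryptionKeyStoreProvidersOnConnection, alongside the global static one. The following is only a minimal sketch of how code that owns its SqlConnection instances could use it. The EncryptedConnectionFactory class and its Create method are hypothetical names for illustration, not part of the MSSQL provider, and this does not by itself help with connections the provider opens internally.

```csharp
using System;
using System.Collections.Generic;
using Azure.Identity;
using Microsoft.Data.SqlClient;
using Microsoft.Data.SqlClient.AlwaysEncrypted.AzureKeyVaultProvider;

// Hypothetical helper (not part of durabletask-mssql): builds connections that
// carry their own AKV key store provider instead of relying on global state.
public static class EncryptedConnectionFactory
{
    public static SqlConnection Create(string connectionString, string userAssignedClientId)
    {
        // Credential for the user-assigned managed identity (assumption: same setup as in the thread).
        var credential = new DefaultAzureCredential(
            new DefaultAzureCredentialOptions { ManagedIdentityClientId = userAssignedClientId });

        var akvProvider = new SqlColumnEncryptionAzureKeyVaultProvider(credential);

        var connection = new SqlConnection(connectionString);

        // Instance-level registration: applies to this connection only.
        connection.RegisterColumnEncryptionKeyStoreProvidersOnConnection(
            new Dictionary<string, SqlColumnEncryptionKeyStoreProvider>(StringComparer.OrdinalIgnoreCase)
            {
                { SqlColumnEncryptionAzureKeyVaultProvider.ProviderName, akvProvider }
            });

        return connection;
    }
}
```

The connection string would still need Column Encryption Setting=enabled for the driver to encrypt and decrypt column values on that connection.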

andsj073 commented 1 year ago

@cgillum Just to add some confirmation of the hypothesis: I implemented a facade subclass of SqlColumnEncryptionAzureKeyVaultProvider to log calls to its methods. I could see that DecryptColumnEncryptionKey was called as a result of the Function HTTP trigger that starts the Durable Orchestration, but never later, when the Durable Functions framework runs its scheduled polling of the TaskHub.

[2023-05-04T06:26:14.751Z] Found C:\vscode\abc.csproj. Using for user secrets file configuration.
FunctionsStartup.Configure
AKV provider Registered

Functions:

        DurableFunctionsOrchestrationCSharp1_HttpStart: [GET,POST] http://localhost:7071/api/DurableFunctionsOrchestrationCSharp1_HttpStart

        DurableFunctionsOrchestrationCSharp1: orchestrationTrigger

        SayHello: activityTrigger

For detailed output, run func with --verbose flag.
[2023-05-04T06:26:21.987Z] Host lock lease acquired by instance ID '000000000000000000000000228EE9B8'.
[2023-05-04T06:27:07.706Z] Executing 'DurableFunctionsOrchestrationCSharp1_HttpStart' (Reason='This function was programmatically called via the host APIs.', Id=194f4cbc-42cf-44a2-a2c5-429a4e8db545)
DecryptColumnEncryptionKey
[2023-05-04T06:27:21.305Z] Started orchestration with ID = 'f8819280f27147b0b568775f2fc5e18e'.
[2023-05-04T06:27:21.328Z] Executed 'DurableFunctionsOrchestrationCSharp1_HttpStart' (Succeeded, Id=194f4cbc-42cf-44a2-a2c5-429a4e8db545, Duration=13642ms)
[2023-05-04T06:27:23.574Z] Executing 'DurableFunctionsOrchestrationCSharp1' (Reason='(null)', Id=fdb8a791-102b-4d7d-a235-8c114007377a)
[2023-05-04T06:27:23.607Z] Executed 'DurableFunctionsOrchestrationCSharp1' (Succeeded, Id=fdb8a791-102b-4d7d-a235-8c114007377a, Duration=35ms)

Complete code for reproduction:

using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Extensions.Logging;
using Microsoft.Azure.Functions.Extensions.DependencyInjection;
using Azure.Identity;
using Microsoft.Data.SqlClient;
using Microsoft.Data.SqlClient.AlwaysEncrypted.AzureKeyVaultProvider;
using Azure.Core;

[assembly: FunctionsStartup(typeof(Company.Function.Startup))]

namespace Company.Function
{

    public class Startup : FunctionsStartup
    {
        public override void Configure(IFunctionsHostBuilder builder)
        {

            Console.WriteLine("FunctionsStartup.Configure");
            string userAssignedClientId = Environment.GetEnvironmentVariable("UserAssignedClientId");
            var credential = new DefaultAzureCredential(new DefaultAzureCredentialOptions { ManagedIdentityClientId = userAssignedClientId });

            // Initialize AKV provider
            SqlColumnEncryptionAzureKeyVaultProvider akvProvider = new MySqlColumnEncryptionAzureKeyVaultProvider(credential);

            // Register AKV provider
            SqlConnection.RegisterColumnEncryptionKeyStoreProviders(customProviders: new Dictionary<string, SqlColumnEncryptionKeyStoreProvider>(capacity: 1, comparer: StringComparer.OrdinalIgnoreCase)
                {
                    { SqlColumnEncryptionAzureKeyVaultProvider.ProviderName, akvProvider}
                });
            Console.WriteLine("AKV provider Registered");

        }
    }

    public static class DurableFunctionsOrchestrationCSharp1
    {
        [FunctionName("DurableFunctionsOrchestrationCSharp1")]
        public static async Task<List<string>> RunOrchestrator(
            [OrchestrationTrigger] IDurableOrchestrationContext context)
        {
            var outputs = new List<string>();

            // Replace "hello" with the name of your Durable Activity Function.
            outputs.Add(await context.CallActivityAsync<string>(nameof(SayHello), "Tokyo"));
            outputs.Add(await context.CallActivityAsync<string>(nameof(SayHello), "Seattle"));
            outputs.Add(await context.CallActivityAsync<string>(nameof(SayHello), "London"));

            // returns ["Hello Tokyo!", "Hello Seattle!", "Hello London!"]
            return outputs;
        }

        [FunctionName(nameof(SayHello))]
        public static string SayHello([ActivityTrigger] string name, ILogger log)
        {
            log.LogInformation("Saying hello to {name}.", name);
            return $"Hello {name}!";
        }

        [FunctionName("DurableFunctionsOrchestrationCSharp1_HttpStart")]
        public static async Task<HttpResponseMessage> HttpStart(
            [HttpTrigger(AuthorizationLevel.Anonymous, "get", "post")] HttpRequestMessage req,
            [DurableClient] IDurableOrchestrationClient starter,
            ILogger log)
        {
            // Function input comes from the request content.
            string instanceId = await starter.StartNewAsync("DurableFunctionsOrchestrationCSharp1", null, "input");

            log.LogInformation("Started orchestration with ID = '{instanceId}'.", instanceId);

            return starter.CreateCheckStatusResponse(req, instanceId);
        }
    }

    public class MySqlColumnEncryptionAzureKeyVaultProvider : SqlColumnEncryptionAzureKeyVaultProvider
    {
        public MySqlColumnEncryptionAzureKeyVaultProvider(TokenCredential tokenCredential) : base(tokenCredential)
        {
        }

        public MySqlColumnEncryptionAzureKeyVaultProvider(TokenCredential tokenCredential, string trustedEndPoint) : base(tokenCredential, trustedEndPoint)
        {
        }

        public MySqlColumnEncryptionAzureKeyVaultProvider(TokenCredential tokenCredential, string[] trustedEndpoints) : base(tokenCredential, trustedEndpoints)
        {
        }

        public override TimeSpan? ColumnEncryptionKeyCacheTtl { get; set; }

        public override byte[] DecryptColumnEncryptionKey(string masterKeyPath, string encryptionAlgorithm, byte[] encryptedColumnEncryptionKey) 
        {
            Console.WriteLine("DecryptColumnEncryptionKey");
            return base.DecryptColumnEncryptionKey(masterKeyPath, encryptionAlgorithm, encryptedColumnEncryptionKey);
        }

        public override byte[] EncryptColumnEncryptionKey(string masterKeyPath, string encryptionAlgorithm, byte[] columnEncryptionKey) 
        {
            Console.WriteLine("EncryptColumnEncryptionKey");
            return base.EncryptColumnEncryptionKey(masterKeyPath, encryptionAlgorithm, columnEncryptionKey);
        }

        public override byte[] SignColumnMasterKeyMetadata(string masterKeyPath, bool allowEnclaveComputations) 
        {
            Console.WriteLine("SignColumnMasterKeyMetadata");
            return base.SignColumnMasterKeyMetadata(masterKeyPath, allowEnclaveComputations);
        }

        public override bool VerifyColumnMasterKeyMetadata(string masterKeyPath, bool allowEnclaveComputations, byte[] signature)
        {
            Console.WriteLine("VerifyColumnMasterKeyMetadata");
            return base.VerifyColumnMasterKeyMetadata(masterKeyPath, allowEnclaveComputations, signature);
        }

    }
}

host.json

{
  "version": "2.0",
  "logging": {
    "applicationInsights": {
      "samplingSettings": {
        "isEnabled": true,
        "excludedTypes": "Request"
      }
    }
  },
  "extensions": {
    "durableTask": {
      "storageProvider": {
        "type": "mssql",
        "connectionStringName": "SQLDB_Connection",
        "taskEventLockTimeout": "00:02:00",
        "createDatabaseIfNotExists": true
      }
    }
  }
}

local.settings.json

{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "UseDevelopmentStorage=true",
    "FUNCTIONS_WORKER_RUNTIME": "dotnet",
    "SQLDB_Connection": "Server=tcp:REDACTED.database.windows.net,1433;Initial Catalog=DurableDB;Encrypt=True;TrustServerCertificate=False;Connection Timeout=30;Authentication=Active Directory Default;Column Encryption Setting=enabled",
    "UserAssignedClientId": "REDACTED"
  }
}