estuary / flow

🌊 Continuously synchronize the systems where your data lives, to the systems where you _want_ it to live, with Estuary Flow. 🌊
https://estuary.dev

Architecture: Determine the best/most secure way to handle secure credentials in the Flow SaaS model #278

Closed snowzach closed 2 years ago

snowzach commented 2 years ago

We need to be able to provide Secure Configuration Storage for users' configurations.

We believe the best tool to leverage is HashiCorp Vault, which is more or less the standard in secure credential storage. This issue attempts to outline the workflow and process for getting Secure Credentials in and out of Vault, as well as how the Secure Configuration will be managed.

Definitions/Assumptions/Statements

Secure Credentials Creation Workflow

Secure Credentials View Workflow

Secure Credential Usage Workflow

Day One Features

For initial development we'll start with Static Credentials.

Day N Features

Vault supports a myriad of plugins for obtaining temporary credentials that we can add support for in the future. These present some challenges, though: connectors could plausibly run uninterrupted for months, so some credentials will expire while a connector is still running. We need to manage these challenges and ensure that credential expiration and rotation are handled appropriately by the connectors.
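To make the expiry concern concrete, here is a minimal sketch of how a long-running connector supervisor might decide when to renew a temporary credential. The `Lease` type, its fields, and the 50% renewal threshold are assumptions made for illustration; none of them come from Vault or from Flow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lease:
    """Hypothetical record of a temporary credential's lifetime."""

    issued_at: float    # seconds since the epoch
    ttl_seconds: float  # lifetime granted when the credential was issued

    def expired(self, now: float) -> bool:
        # The credential can no longer be used once its full TTL has elapsed.
        return now - self.issued_at >= self.ttl_seconds

    def needs_renewal(self, now: float, threshold: float = 0.5) -> bool:
        # Renew early (here at half of the TTL, an arbitrary choice) so the
        # connector never observes an expired credential.
        return now - self.issued_at >= threshold * self.ttl_seconds


lease = Lease(issued_at=0.0, ttl_seconds=3600.0)
print(lease.needs_renewal(now=1799.0))  # False
print(lease.needs_renewal(now=1800.0))  # True
print(lease.expired(now=1800.0))        # False
```

Renewing well ahead of expiry leaves room to restart or reconfigure the connector through its normal lifecycle rather than failing mid-run.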

jgraettinger commented 2 years ago

Other braindumps / notes:

For user ergonomics, the separation of secure vs. insecure connector configuration should be as automated as possible. For example, rather than having two places to manage secure vs. insecure config, we should understand which parts of a connector's config are secure and which are not, and separate the two at build time -- leaving a placeholder for the secure portions in the insecure version that tells us how to resolve them later.

Which raises the question "How do you know what parts of the config are secure"? Ask the connector! The spec can be marked up with an annotation that distinguishes fields which are secure. This is also useful for UI form presentation (starring out secret fields). Airbyte's already done this with an airbyte_secret annotation.
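As a rough sketch of "asking the connector", the function below walks a connector's JSON-schema spec and collects the paths of fields carrying Airbyte's `airbyte_secret: true` annotation. The function name `secret_paths` and the example spec are made up for illustration, and the walk only follows nested `properties` (not arrays or `oneOf` branches).

```python
from typing import Any, Iterator


def secret_paths(
    schema: dict[str, Any], prefix: tuple[str, ...] = ()
) -> Iterator[tuple[str, ...]]:
    """Yield the path of every property annotated as secret."""
    for name, sub in schema.get("properties", {}).items():
        if not isinstance(sub, dict):
            continue  # boolean schemas carry no annotations
        path = prefix + (name,)
        if sub.get("airbyte_secret") is True:
            yield path
        # Recurse so that secrets inside nested objects are found too.
        yield from secret_paths(sub, path)


# Hypothetical connector spec, for illustration only.
spec = {
    "type": "object",
    "properties": {
        "host": {"type": "string"},
        "password": {"type": "string", "airbyte_secret": True},
        "tunnel": {
            "type": "object",
            "properties": {
                "private_key": {"type": "string", "airbyte_secret": True},
            },
        },
    },
}

print(sorted(secret_paths(spec)))
# [('password',), ('tunnel', 'private_key')]
```

The same list of paths could drive both the build-time separation and the UI's masking of secret fields.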

There's a related question to this design, which is "how should I manage credentials in my GitOps catalog?" Solution sketches today include some existing tooling, like git-secret and others. Or, the user can use a UI workflow to create the connector. I don't think we've "solved" this issue either, but I do think we should consider it as a separate problem.

snowzach commented 2 years ago

@jgraettinger good thoughts! At this point in the process, the biggest question in my mind is still "How will the secure credentials be provided to the connector?" I've come up with 4 options, each with its own advantages and drawbacks. I wanted to write them all down with pros and cons before documenting them here or bringing them to the team. I am working on that now.

Basically the challenge (in my mind) is balancing:

  1. Simplicity.
  2. The Flow Runtime having to touch Secure Credentials. (In that ideally it doesn't need to)
  3. Credential renewal/cycling.
  4. Compatibility with the orchestration platform.
  5. Needing to make changes to the current adapters.

snowzach commented 2 years ago

Use Cases

These are the different things I believe will need to be stored in the Secure Credential Storage.

NOTE: We have pretty much already decided to use HashiCorp Vault at this point, so I will just refer to it as Vault from here on out.

snowzach commented 2 years ago

Credential Handling Options

For storing credentials, in pretty much all cases I believe the Management API has the job of saving Secure Credentials into Vault. It should also record the credentials' metadata (created/updated/name/security group/etc.) in its database.

The method for fetching and using credentials could be one of the following:

Flow Runtime Managed

In this scenario the Flow Runtime communicates directly with Vault and has full access to the credentials inside. When running a connector, it injects/merges the secure configuration into the relevant calls to the connector. If the credentials change, the Flow Runtime can restart the connector with the new credentials using the appropriate lifecycle. Flow also uses Vault credentials to launch connector runtimes.

Pros

Cons

Secure Credentials Injected at Connector Runtime

In this scenario the credentials are injected out-of-band into the connector runtimes via either environment variables or configuration files. Something like Vault Agent is used to fetch the credentials and provide them to the connector.

Pros

Cons

Secure Runtime Connector

Another idea is to create a new connector type, which I'll just call the Secure Runtime Connector (SRC). I was thinking this connector could be similar to the FlowSink connector but serve several purposes, the key ones being managing connector runtimes, securely proxying Flow connector communication, and fetching/providing Secure Credentials.

Pros

Cons

Example Architecture Options

Secure Runtime Connector drawio

snowzach commented 2 years ago

Questions

jgraettinger commented 2 years ago

We had an in-person conversation where we walked through various options. My understanding of the consensus:

We'll use HashiCorp's managed Vault service to power storing, retrieving, and auditing of credentials used within Flow. We'll seek to altogether avoid storing any credentials on our infrastructure by 1) separating credentials from configuration early in the catalog build process, 2) delegating credential storage to Vault, and 3) retrieving credentials and re-hydrating full connector configurations at the very last mile: immediately before invoking the connector and only within the connector's isolation boundary.

Secure Connector Proxy

We'll develop a secure connector proxy (working name still TBD): a static binary which is placed inside the isolation boundary of the connector. Today, this means it runs within the Docker container of the connector. In the future, it means inside a Firecracker VM dedicated to the connector. It's most similar in spirit to the secure connector runtime discussed above.

This proxy will directly execute the connector as a subordinate process. The proxy will interface with the delegate connector using the delegate's chosen protocol (Airbyte spec over stdin/out, or Flow's native capture or materialization protocol, or anything else we use in the future).

It will similarly interface with the Flow runtime using Flow's native capture / materialization protocol (only). We'll consider the means of integration with the runtime to be a change-able implementation detail. Today it would speak the native protocol over stdin/out, but in a Firecracker future it might be simpler to run gRPC over an exposed port.

The secure connector proxy will perform last-mile re-hydration of secure credentials which are separated from a connector configuration at build time.
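The process plumbing described in this section could look roughly like the sketch below: a wrapper starts the delegate as a child process, writes messages to its stdin, and reads replies from its stdout. Newline-delimited JSON, the `run_delegate` helper, and the echoing stand-in connector are all assumptions for illustration; a real proxy would stream messages in both directions and speak the delegate's actual protocol.

```python
import json
import subprocess
import sys


def run_delegate(argv: list[str], messages: list[dict]) -> list[dict]:
    """Run the delegate as a child process and exchange JSON lines with it."""
    proc = subprocess.run(
        argv,
        input="".join(json.dumps(m) + "\n" for m in messages),
        capture_output=True,
        text=True,
        check=True,  # raise if the delegate exits with a non-zero status
    )
    return [json.loads(line) for line in proc.stdout.splitlines() if line.strip()]


# Hypothetical stand-in for a connector: echoes each message with an "ack" flag.
ECHO_CONNECTOR = [
    sys.executable,
    "-c",
    "import sys, json\n"
    "for line in sys.stdin:\n"
    "    msg = json.loads(line)\n"
    "    msg['ack'] = True\n"
    "    print(json.dumps(msg))\n",
]

replies = run_delegate(
    ECHO_CONNECTOR, [{"type": "open", "config": {"host": "db.example.com"}}]
)
print(replies)
```

Because the configuration travels over the child's stdin rather than its command line, it does not show up in process listings inside the container.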

Build-Time Credential Separation

The control plane's build API accepts HTTPS POST'd catalog specifications which include connector configuration. That configuration includes a mix of less sensitive configuration as well as sensitive credentials. The build API will utilize schema annotations returned by the connector to identify locations of the configuration which represent secure credentials. It will extract those credentials, push them to Vault for storage, and then rewrite the configuration to replace each credential with a reference to the copy now stored in Vault.
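A minimal sketch of this separation step follows, assuming the secret locations are already known as key paths. A plain dict stands in for Vault, and the `$secret_ref` placeholder shape, the `separate_credentials` name, and the reference format are all hypothetical.

```python
import copy
import uuid
from typing import Any, Callable, Iterable

SECRET_REF_KEY = "$secret_ref"  # hypothetical placeholder marker


def separate_credentials(
    config: dict[str, Any],
    paths: Iterable[tuple[str, ...]],
    store: dict[str, Any],
    new_ref: Callable[[], str] = lambda: str(uuid.uuid4()),
) -> dict[str, Any]:
    """Move secret values into `store`; return a copy holding references."""
    redacted = copy.deepcopy(config)  # never mutate the caller's config
    for path in paths:
        parent: Any = redacted
        for key in path[:-1]:
            parent = parent.get(key) if isinstance(parent, dict) else None
            if parent is None:
                break
        if not isinstance(parent, dict) or path[-1] not in parent:
            continue  # an optional secret that this config does not set
        ref = new_ref()
        store[ref] = parent[path[-1]]  # stand-in for a write to Vault
        parent[path[-1]] = {SECRET_REF_KEY: ref}
    return redacted


vault: dict[str, Any] = {}
config = {"host": "db.example.com", "password": "hunter2"}
redacted = separate_credentials(
    config, [("password",)], vault, new_ref=lambda: "ref-1"
)
print(redacted)  # {'host': 'db.example.com', 'password': {'$secret_ref': 'ref-1'}}
print(vault)     # {'ref-1': 'hunter2'}
```

Only the redacted copy would need to be persisted with the built catalog; the original configuration can be discarded once the build completes.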

Runtime Credential Injection

Immediately prior to starting a connector, the Flow runtime will obtain a wrapped token from Vault. The wrapped token will be passed to the secure connector proxy during initialization of the stream (as a field of the CaptureRequest of a capture, or TransactionRequest.Open of a materialization). Already today, these messages include the connector configuration, but this configuration will not yet include actual credentials.

The proxy will directly interact with Vault using the wrapped token, will fetch credentials, will merge them back into the connector configuration, and will then pass this restored configuration to the exec'd connector delegate.
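The hand-off can be sketched as below. `FakeVault` is an in-memory stand-in and not the Vault API; it only models the property that matters here, namely that a wrapped token can be redeemed once. The `$secret_ref` placeholder shape and the `rehydrate` function are likewise hypothetical.

```python
import secrets
from typing import Any

SECRET_REF_KEY = "$secret_ref"  # hypothetical placeholder marker


class FakeVault:
    """In-memory stand-in for Vault, for illustration only."""

    def __init__(self, stored: dict[str, Any]) -> None:
        self._stored = dict(stored)
        self._wrapped: dict[str, dict[str, Any]] = {}

    def wrap(self, refs: list[str]) -> str:
        # Runtime side: obtain a token that grants one read of these secrets.
        token = secrets.token_hex(16)
        self._wrapped[token] = {ref: self._stored[ref] for ref in refs}
        return token

    def unwrap(self, token: str) -> dict[str, Any]:
        # Proxy side: popping the entry makes the token single-use.
        try:
            return self._wrapped.pop(token)
        except KeyError:
            raise PermissionError("wrapped token is unknown or already used") from None


def rehydrate(config: Any, secrets_by_ref: dict[str, Any]) -> Any:
    """Replace every placeholder in `config` with its secret value."""
    if isinstance(config, dict):
        if set(config) == {SECRET_REF_KEY}:
            return secrets_by_ref[config[SECRET_REF_KEY]]
        return {k: rehydrate(v, secrets_by_ref) for k, v in config.items()}
    if isinstance(config, list):
        return [rehydrate(v, secrets_by_ref) for v in config]
    return config


vault = FakeVault({"ref-1": "hunter2"})
token = vault.wrap(["ref-1"])  # done by the Flow runtime
redacted = {"host": "db.example.com", "password": {SECRET_REF_KEY: "ref-1"}}
full = rehydrate(redacted, vault.unwrap(token))  # done inside the proxy
print(full)  # {'host': 'db.example.com', 'password': 'hunter2'}
```

In this sketch the runtime only ever holds the token and the redacted configuration; the secret values exist only in the result of `unwrap`, which happens inside the proxy.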

(@snowzach you had a nice diagram of this ⬆️ workflow; mind attaching here?)

Limitation of Scope

At this time, we are not trying to design a general credential management feature which our users would directly interact with. Our use of Vault would be limited to a) writing credentials during a catalog build, and b) reading credentials for purposes of running a connector. There would not be a capability to fetch & expose a credential to a user, for example.

We are not trying to provide an opinion about how users should manage credentials within a Flow GitOps workflow -- there are a number of options for this, and we're still evaluating them ourselves.

We may introduce some form of user-facing credential management in the future, but expect it would be a separable concern from this secure credential delivery design. That's because a credential management feature would ultimately still need to produce and present fully-merged connector configurations to the build API, which would then separate and isolate credentials of that configuration using this design.

snowzach commented 2 years ago

Here's the current thought process around how the Secure Connector Proxy would work: [image]

jgraettinger commented 2 years ago

Closing because we've achieved the goal of having an architectural consensus. Still more to figure out, but next steps are to start prototyping.