serverlessworkflow / specification

Serverless Workflow Specification
http://serverlessworkflow.io
Apache License 2.0
717 stars 146 forks source link

Accessing secrets in the workflow definition #866

Open matthias-pichler-warrify opened 2 months ago

matthias-pichler-warrify commented 2 months ago

Current Spec

The current spec defines the following:

document: {}
use:
  secrets:
    - passwordSecret
  authentications:
    basicTest:
      basic:
        username: user
        password: passwordSecret
document: {}
use:
  secrets:
    - basic.json
  authentications:
    basicTest:
      basic: basic.json

Some points that in my eyes cause a bit of friction:

Secrets are only available in authentication policies

This was new to me and I don't see it documented anywhere. While I agree that this reduces risk of leakage I don't think it's the best call given that some APIs out there are just weird. (If memory serves right the log intake API of Datadog allows an API key in the query). Currently it's not possible to access secrets in the endpoint/uri property.

I propose to allow referencing secrets anywhere.

Secrets can contain arbitrary values

document: {}
use:
  secrets:
    - basic.json
  authentications:
    basicTest:
      basic: basic.json

Here the secret basic.json replaces an object that contains username and password. If we were to allow secrets everywhere this could cause two problems:

  1. A lot of hidden structure
document: {}
use:
  secrets:
    - endpointWithApiKey
do:
  - getPetById:
      call: http
      with:
        endpoint: endpointWithApiKey

In this I as the reader do not know if the secret is a string just containing the uri, an object of the form {"uri": "..."} or if it contains a authentication policy as well. I think this hides a lot of structure from the reader

  1. everything would have to be oneOf: [ActualType, string]

If secrets could be referenced everywhere, everything would need to be defined as a string as well. This would produce an ugly and incomprehensible sche,a and also cause a lot of pain for implementers (think OO types of objects or strings).

I propose to limiting secrets to string or primitive values

Referencing secrets

Currently secrets are simply referenced by name: this makes the use of secrets ambiguous. (they become hard to distinguish from a string literal). It also causes cascading effects that might be surprising for example if I start with

document: {}
use:
  secrets:
    - passwordSecret
  authentications:
    basicTest:
      basic:
        username: user
        password: passwordSecret

and then delete (or fat-finger) the secret

document: {}
use:
  secrets:
    - passwordSecrets # accidentally added an "s"
  authentications:
    basicTest:
      basic:
        username: user
        password: passwordSecret

It would still run and the runtime could not really help me with "Secret not found" because it has to assume that "passwordSecret" is now a string literal.

I therefore propose a more structured way to reference secrets

I see two possibilities here and we might end up using both.

  1. reference secrets with a refObject

In line with https://github.com/serverlessworkflow/specification/issues/904 I would also suggest using an object to reference secrets

document: {}
use:
  secrets:
    - passwordSecret
  authentications:
    basicTest:
      basic:
        username: "user"
        password: 
          use: passwordSecret

as with the linked issue the wording around use is up for discussion (e.g. use, ref, etc...)

  1. $secrets argument in runtime expressions

Version 0.9 of the spec had the $SECRETS namespace. I would suggest specifying a $secrets argument for runtime expressions. It might me necessary to refer to some secrets in runtime expressions. For example some APIs use non standard authentication or headers.

document: {}
do:
  - api:
      call: http
      with:
        method: GET
        endpoint: https://example.com
        headers:
          X-Auth: ${ $secrets.username + ":" + $secrets.password | @base64 | "Custom "+ . }
cdavernas commented 2 months ago

this makes the use of secrets ambiguous already. (they become hard to distinguish from a string literal)

You are right, it should be more explicit. Maybe by adding a secret or a ref property.

Is it intended to be able to reference secrets in runtime expressions? If so how?

The exact same, excellent question, came up during the DSL refactor. To progress and to come up with a (first) attempt, we decided to address that concern later, later being now it seems 😆, the concern being to avoid as much as possible the bleeding of secrets.

If we allow the user to access them using runtime expressions - which can be used everywhere - we basically enable her to bleed secrets. If we still choose to proceed that way, we can choose to not care about what the user do with secrets, as long as we warned her about potential leaks of sensitive information. Otherwise, we can choose to redact secret values in specific cases, like GitHub does, for example. I'm not a fan of that solution, because it's tedious to implement, and we'd have to define all the cases where secrets should be redacted, and the cases where they should not.

We could also, like for $context, define that $secrets are only made available during input/output filtering. This is IMHO the solution that makes most sense, as we keep atomicity without implicitly compromising secrets (ex: invoking a FaaS with the filtered input, avoiding having to pass both $context and $secrets to atomic "processors").

fjtirado commented 2 months ago

I support adding $secret to the spec. I would also add $env or $property to access any property.

cdavernas commented 1 month ago

In the context of the PR addressing #904, and in order to have a working basic state to fix, hone and improve, I propose that:

Replace the actual secret referencing way:

use:
  secrets:
    - basic-auth-secret.json
  authentications:
    basic: basic-auth-secret.json

... which is ugly and sacrifies meaningfullness for the sake of brievety, therefore complexifying the schema with yet another oneOf, by the following:

use:
  secrets:
    - basic-auth-secret.json
  authentications:
    basic: 
      use: basic-auth-secret.json

Which is more consistent regarding typing, avoids oneOf and is more fluent and ubiquitous.

matthias-pichler-warrify commented 1 month ago

I think that would be a great first step and can be included in the PR for #904

After that we can talk about the other points

cdavernas commented 1 month ago

@matthias-pichler-warrify After our small talk on the subject, I had time to think and play around with your suggestions.

Referencing secrets

As made obvious thanks to your observations, we need to add a new secrets parameter. I propose to restrict its availability to the input.from property, allowing explicit use of secrets in that runtime expression only in order to better control potential bleeding: if author wants to use a specific secret in the task, she explicitly has to "ask" for it using the input.from expression.

Aside from runtime expressions, I think that referencing by secrets by name should be restricted to authorization only, which is really both for user convenience and code brievety (based on experience with previous DSL versions)

use:
  secrets:
    sampleApi:
      from: sample.api.json
    sampleBasic:
      from: basic.auth.json
  authorizations:
    basic:
      use: sampleBasic
do:
  - getProfile:
      call: http
      with:
        endpoint: ${ "https://auth.services.sample.com?apiKey=\($secrets.sampleApi.key)" }
matthias-pichler-warrify commented 1 month ago

I propose to restrict its availability to the input.from property

Hmm yeah that might make sense. Ideally implementers would track that reference through the input to hide the value in any logs etc.

But that would also mean that your example is incorrect right? You'd need:

use:
  secrets:
    sampleApi:
      from: sample.api.json
    sampleBasic:
      from: basic.auth.json
  authorizations:
    basic:
      use: sampleBasic
do:
  - getProfile:
      input:
        from: '. + {"apiKey": $secrets.sampleApi.key }'
      call: http
      with:
        endpoint: ${ "https://auth.services.sample.com?apiKey=\(.apiKey)" }

referencing by secrets by name should be restricted to authorization only,

that also sounds good to me

cdavernas commented 1 month ago

But that would also mean that your example is incorrect right?

Indeed, sorry about that!