buildbarn / bb-storage

Storage daemon, capable of storing data for the Remote Execution protocol
Apache License 2.0
142 stars, 91 forks

Feature request: Support JWKS for specifying JWT public keys #165

Closed by thirtyseven 1 year ago

thirtyseven commented 1 year ago

Many services that issue JWTs publish their signing keys in JWKS format at a well-known URL. These can be manually converted to PEM files, but it would be convenient to pass in a JWKS directly, either inline or via a URL.

Example: https://gitlab.com/-/jwks

EdSchouten commented 1 year ago

Just to provide some context: I haven't spent any effort to implement support for JWKS right now, for the reason that it requires network transparency. It's not uncommon for people to launch Buildbarn in a separate cloud account/VPC that is isolated from the internet/corporate networks. In that case there is no option but to hardcode the keys in config.

Maybe the best solution is not to support fetching of JWKS integrally, but to have some kind of separate cron job that downloads the JWKS and converts it into a config map for the Buildbarn executables to load? Though I'm happy to be persuaded otherwise.

With regards to the format in which key material is specified in config: JWKS may indeed be a better fit than what we have right now.

thirtyseven commented 1 year ago

It would be nice if both options (URL fetching and inline) were provided; this is how our reverse proxy allows them to be specified: https://istio.io/v1.16/docs/reference/config/security/jwt/#JWTRule

EdSchouten commented 1 year ago

I would seriously vote against that. The downside of letting each of the bb* components accept this by URL is that if the JWKS service is down, you will end up in a state where your bb* applications won't be able to start. Or at least not authenticate incoming requests.

Running the fetching of the JWKS as some kind of central cron job that writes its results into, say, a Kubernetes configmap is far more robust.

mortenmj commented 1 year ago

This was largely solved by https://github.com/buildbarn/bb-storage/pull/179, while the ability to automatically refresh a file containing JWKS data is pending in https://github.com/buildbarn/bb-storage/pull/180. In order to use this, we run a deployment that periodically updates a configmap which is mounted by bb-storage. It might be useful for others reading this, so I'm sharing the code:

package main

import (
    "context"
    "encoding/json"
    "fmt"
    "log"
    "time"

    "github.com/lestrrat-go/jwx/v2/jwk"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    applyCorev1 "k8s.io/client-go/applyconfigurations/core/v1"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/rest"
)

const (
    googleCerts = "https://www.googleapis.com/oauth2/v3/certs"
    namespace   = "buildbarn"
)

func main() {
    ctx := context.Background()

    // Set up JWKS cache
    cache, err := createCache(ctx, googleCerts)
    if err != nil {
        log.Fatalf("failed to create cache: %v", err)
    }

    // Set up k8s clientset
    config, err := rest.InClusterConfig()
    if err != nil {
        log.Fatalf("failed to create in-cluster config: %v", err)
    }

    clientset, err := kubernetes.NewForConfig(config)
    if err != nil {
        log.Fatalf("failed to create clientset: %v", err)
    }

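    // Periodically update the ConfigMap forever. The loop's post statement
    // waits on the ticker, so the first update runs immediately rather than
    // after the first 300-second interval.
    errCount := 0
    t := time.NewTicker(300 * time.Second)
    for ; ; <-t.C {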
        if err := updateConfigMap(ctx, clientset, cache); err != nil {
            log.Printf("failed to update ConfigMap: %v", err)

            errCount++

            if errCount >= 3 {
                log.Fatal("Failed too many consecutive times. Shutting down.")
            }

            continue
        }

        errCount = 0
    }
}

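// createCache registers jwksURL with an auto-refreshing cache and performs an
// initial fetch, so startup fails fast if the endpoint is unreachable.
func createCache(ctx context.Context, jwksURL string) (*jwk.Cache, error) {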
    cache := jwk.NewCache(ctx, jwk.WithRefreshWindow(300*time.Second))

    if err := cache.Register(jwksURL); err != nil {
        return nil, fmt.Errorf("register cache: %w", err)
    }

    // Refresh the JWKS once before returning the cache.
    _, err := cache.Refresh(ctx, jwksURL)
    if err != nil {
        return nil, fmt.Errorf("refresh cache: %w", err)
    }

    return cache, nil
}

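// updateConfigMap reads the current key set from the cache and writes it to
// the "jwks" ConfigMap using server-side apply.
func updateConfigMap(ctx context.Context, clientset *kubernetes.Clientset, cache *jwk.Cache) error {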
    keyset, err := cache.Get(ctx, googleCerts)
    if err != nil {
        return fmt.Errorf("fetching JWKS: %w", err)
    }

    data, err := json.Marshal(keyset)
    if err != nil {
        return fmt.Errorf("marshal JSON: %w", err)
    }

    cm := applyCorev1.
        ConfigMap("jwks", namespace).
        WithBinaryData(map[string][]byte{"jwks.json": data})

    _, err = clientset.CoreV1().ConfigMaps(namespace).Apply(ctx, cm, metav1.ApplyOptions{
        Force:        true,
        FieldManager: "JWKS",
    })
    if err != nil {
        return fmt.Errorf("apply configmap: %w", err)
    }

    return nil
}