operator-framework / operator-sdk

SDK for building Kubernetes applications. Provides high level APIs, useful abstractions, and project scaffolding.
https://sdk.operatorframework.io
Apache License 2.0
7.21k stars 1.74k forks source link

`WATCH_NAMESPACE` environment variable doesn't work for the Helm Operator #6780

Closed gschlueter-jaconi closed 1 month ago

gschlueter-jaconi commented 3 months ago

Bug Report

What did you do?

Cloned the operator-sdk repo and tried out the example in testdata/helm/memcached-operator in a local cluster with the WATCH_NAMESPACE environment variable set to default:

  1. Check out the repo and go to the example dir:

    git clone git@github.com:operator-framework/operator-sdk.git
    cd operator-sdk/testdata/helm/memcached-operator
  2. Create the cluster:

    kind create cluster --image kindest/node:v1.28.0
  3. Start the operator with the env variable set:

    WATCH_NAMESPACE=default make install run
  4. Create a memcached instance (in the default namespace):

    kubectl apply -k config/samples
  5. Observe the operator log:

    /usr/local/bin/kustomize build config/crd | kubectl apply -f -
    customresourcedefinition.apiextensions.k8s.io/memcacheds.cache.example.com created
    /usr/local/bin/helm-operator run
    {"level":"info","ts":"2024-07-04T12:42:14+02:00","logger":"cmd","msg":"Version","Go Version":"go1.21.11","GOOS":"darwin","GOARCH":"amd64","helm-operator":"v1.35.0","commit":"e95abdbd5ccb7ca0fd586e0c6f578e491b0a025b"}
    {"level":"info","ts":"2024-07-04T12:42:14+02:00","logger":"cmd","msg":"Watching namespaces","namespaces":["default"]}
    {"level":"info","ts":"2024-07-04T12:42:14+02:00","logger":"helm.controller","msg":"Watching resource","apiVersion":"cache.example.com/v1alpha1","kind":"Memcached","reconcilePeriod":"1m0s"}
    {"level":"info","ts":"2024-07-04T12:42:14+02:00","msg":"starting server","kind":"health probe","addr":"[::]:8081"}
    {"level":"info","ts":"2024-07-04T12:42:14+02:00","logger":"controller-runtime.metrics","msg":"Starting metrics server"}
    {"level":"info","ts":"2024-07-04T12:42:14+02:00","logger":"controller-runtime.metrics","msg":"Serving metrics server","bindAddress":":8080","secure":false}
    {"level":"info","ts":"2024-07-04T12:42:14+02:00","msg":"Starting EventSource","controller":"memcached-controller","source":"kind source: *unstructured.Unstructured"}
    {"level":"info","ts":"2024-07-04T12:42:14+02:00","msg":"Starting Controller","controller":"memcached-controller"}
    {"level":"info","ts":"2024-07-04T12:42:14+02:00","msg":"Starting workers","controller":"memcached-controller","worker count":16}

What did you expect to see?

A Memcached StatefulSet should be created. It works as expected when I omit the WATCH_NAMESPACE variable.

What did you see instead? Under which circumstances?

Nothing happens and the log shows no entry after the startup.

Environment

Operator type:

/language helm

Kubernetes cluster type:

vanilla (Kind)

$ operator-sdk version

operator-sdk version: "v1.35.0", commit: "e95abdbd5ccb7ca0fd586e0c6f578e491b0a025b", kubernetes version: "v1.28.0", go version: "go1.21.11", GOOS: "darwin", GOARCH: "amd64"

$ go version (if language is Go)

$ kubectl version

Client Version: v1.29.2 Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3 Server Version: v1.28.0

Possible Solution

Additional context

kovayur commented 2 months ago

I also encountered the same issue. After some debugging, I found the root cause: the default label selector is applied to the CR watch because no label selector defined in per-namespace default configuration. When watching all namespaces, per-namespace config has labels.Everything() label selector that takes precedence.

As a workaround, I added the label helm.sdk.operatorframework.io/chart: <chart name> to the CR and it worked. One caveat: I'm not a maintainer and I'm not sure if this won't cause unwanted side effects, so use at your own risk :)

As for the fix, the most correct option I can think of is to apply the configuration in options.Cache.DefaultNamespaces to selectorsByObject in cmd.go:

selectorsByObject[crObj] = cache.ByObject{Label: sel, Namespaces: ...}
kovayur commented 2 months ago

After taking another look, it seems to me that it's a bug in controller-runtime. From ByObject documentation.

// The defaulting follows the following precedence order:
// 1. ByObject
// 2. DefaultNamespaces[namespace]
// 3. Default*

In the helm operator, the label selector is always set in ByObject so it must always take precedence.

kovayur commented 2 months ago

Found a related issue: https://github.com/kubernetes-sigs/controller-runtime/issues/2804