FoundationDB / fdb-kubernetes-operator

A kubernetes operator for FoundationDB
Apache License 2.0
Current deployment main is broken #1081

Closed johscheuer closed 2 years ago

johscheuer commented 2 years ago

What happened?

When installing the operator with kubectl apply -f https://raw.githubusercontent.com/foundationdb/fdb-kubernetes-operator/master/config/samples/deployment.yaml the operator deployment will be in a crash loop:

kubectl logs -f fdb-kubernetes-operator-controller-manager-b7cd98c84-f2grt
flag provided but not defined: -watch-namespace
Usage of /manager:
  -cleanup-old-cli-logs
        Defines if the operator should delete old fdbcli log files. (default true)
  -cli-timeout int
        The timeout to use for CLI commands. (default 10)
  -compress
        Defines whether the rotated log files should be compressed using gzip or not.
  -enable-leader-election
        Enable leader election for controller manager. Enabling this will ensure there is only one active controller manager. (default true)
  -kubeconfig string
        Paths to a kubeconfig. Only required if out-of-cluster.
  -label-selector string
        Defines a label-selector that will be used to select resources.
  -leader-election-id string
        LeaderElectionID determines the name of the resource that leader election will use for holding the leader lock. (default "fdb-kubernetes-operator")
  -log-file string
        The path to a file to write logs to.
  -log-file-max-age int
        Defines the maximum age to retain old operator log file in number of days. (default 28)
  -log-file-max-size int
        Defines the maximum size in megabytes of the operator log file before it gets rotated. (default 250)
  -log-file-min-age duration
        Defines the minimum age of fdbcli log files before removing when "--cleanup-old-cli-logs" is set. (default 5m0s)
  -max-concurrent-reconciles int
        Defines the maximum number of concurrent reconciles for all controllers. (default 1)
  -max-old-log-files int
        Defines the maximum number of old operator log files to retain. (default 3)
  -metrics-addr string
        The address the metric endpoint binds to. (default ":8080")
  -use-future-defaults
        Apply defaults from the next major version of the operator. This is only intended for use in development.
  -version
        Prints the version of the operator and exits.
  -zap-devel
        Development Mode defaults(encoder=consoleEncoder,logLevel=Debug,stackTraceLevel=Warn). Production Mode defaults(encoder=jsonEncoder,logLevel=Info,stackTraceLevel=Error)
  -zap-encoder value
        Zap log encoding (one of 'json' or 'console')
  -zap-log-level value
        Zap Level to configure the verbosity of logging. Can be one of 'debug', 'info', 'error', or any integer value > 0 which corresponds to custom debug levels of increasing verbosity
  -zap-stacktrace-level value
        Zap Level at and above which stacktraces are captured (one of 'info', 'error', 'panic').

The flag was introduced in https://github.com/FoundationDB/fdb-kubernetes-operator/pull/1046 and all deployments were adjusted, but the issue is that those deployments refer to a tagged version of the operator image, and that image does not know the new flag yet.
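To see this kind of mismatch before applying anything, it can help to look at which image and which arguments a manifest actually carries. The following is only a rough sketch: the function name image_and_flag is made up for illustration, and it does a plain text search on a local copy of the manifest rather than parsing the YAML.

```shell
#!/usr/bin/env sh
# Hypothetical helper (not part of the operator repo): print every line of a
# local manifest that sets an image or mentions the given flag name, with
# line numbers, so the image tag and the args can be compared by eye.
image_and_flag() {
  manifest="$1"   # path to a downloaded manifest, e.g. deployment.yaml
  flag="$2"       # flag name without leading dashes, e.g. watch-namespace
  grep -n -e 'image:' -e "$flag" "$manifest"
}

# Example usage (needs network access, so it is left commented out):
# curl -sL -o deployment.yaml \
#   https://raw.githubusercontent.com/foundationdb/fdb-kubernetes-operator/master/config/samples/deployment.yaml
# image_and_flag deployment.yaml watch-namespace
```

If the image line shows a release tag that predates the change adding the flag, the crash loop above is the expected result.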

I think it would be more consistent to use latest in our deployments from the main branch and only use a specific version in our release tags. Otherwise we potentially install a newer CRD with features that are not supported in that operator deployment.

What did you expect to happen?

That the following steps bring up the operator in a running state:

kubectl apply -f https://raw.githubusercontent.com/FoundationDB/fdb-kubernetes-operator/master/config/crd/bases/apps.foundationdb.org_foundationdbclusters.yaml
kubectl apply -f https://raw.githubusercontent.com/FoundationDB/fdb-kubernetes-operator/master/config/crd/bases/apps.foundationdb.org_foundationdbbackups.yaml
kubectl apply -f https://raw.githubusercontent.com/FoundationDB/fdb-kubernetes-operator/master/config/crd/bases/apps.foundationdb.org_foundationdbrestores.yaml
kubectl apply -f https://raw.githubusercontent.com/foundationdb/fdb-kubernetes-operator/master/config/samples/deployment.yaml

How can we reproduce it (as minimally and precisely as possible)?

Run:

kubectl apply -f https://raw.githubusercontent.com/FoundationDB/fdb-kubernetes-operator/master/config/crd/bases/apps.foundationdb.org_foundationdbclusters.yaml
kubectl apply -f https://raw.githubusercontent.com/FoundationDB/fdb-kubernetes-operator/master/config/crd/bases/apps.foundationdb.org_foundationdbbackups.yaml
kubectl apply -f https://raw.githubusercontent.com/FoundationDB/fdb-kubernetes-operator/master/config/crd/bases/apps.foundationdb.org_foundationdbrestores.yaml
kubectl apply -f https://raw.githubusercontent.com/foundationdb/fdb-kubernetes-operator/master/config/samples/deployment.yaml

brownleej commented 2 years ago

I think if we're going to update our deployments to use a new flag, we will need to update the version of the image we're using. Alternatively, we could defer the updates to the sample deployments into a follow-up task from the original PR that adds the flag. I don't think we want our own deployments to always run on the latest tag, so I'm hesitant to recommend that to others through our samples.

johscheuer commented 2 years ago

I think deferring that change makes sense (along with opening an issue to track it). My point here is that we already install the latest version of the CRD, and if I run those commands I would expect to get the latest state from main; at least that's what the path implies.

kubectl apply -f https://raw.githubusercontent.com/FoundationDB/fdb-kubernetes-operator/main/config/crd/bases/apps.foundationdb.org_foundationdbclusters.yaml
kubectl apply -f https://raw.githubusercontent.com/FoundationDB/fdb-kubernetes-operator/main/config/crd/bases/apps.foundationdb.org_foundationdbbackups.yaml
kubectl apply -f https://raw.githubusercontent.com/FoundationDB/fdb-kubernetes-operator/main/config/crd/bases/apps.foundationdb.org_foundationdbrestores.yaml
kubectl apply -f https://raw.githubusercontent.com/foundationdb/fdb-kubernetes-operator/main/config/samples/deployment.yaml

If a user really wants to install a specific version, and not just for playing around, they should install those files from a tag. Otherwise the CRD and the operator's capabilities might not match, and the CRD might have fields that are unknown to the operator.
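One way to keep the CRDs and the deployment on the same revision is to build all four URLs from a single ref. This is a minimal sketch for a POSIX shell: the function name manifest_urls and the REF variable are hypothetical, the organization name is written as FoundationDB throughout, and the tag to use is left for the reader to supply.

```shell
#!/usr/bin/env sh
# Hypothetical helper: print the four manifest URLs from this issue for one
# git ref (a branch such as "main", or a release tag), so that the CRDs and
# the operator deployment always come from the same revision.
manifest_urls() {
  ref="$1"
  base="https://raw.githubusercontent.com/FoundationDB/fdb-kubernetes-operator/${ref}"
  for crd in foundationdbclusters foundationdbbackups foundationdbrestores; do
    printf '%s\n' "${base}/config/crd/bases/apps.foundationdb.org_${crd}.yaml"
  done
  printf '%s\n' "${base}/config/samples/deployment.yaml"
}

# Example usage (REF is a placeholder; needs kubectl and a cluster):
# manifest_urls "$REF" | xargs -n 1 kubectl apply -f
```

Because the ref appears in exactly one place, switching between main and a tag cannot leave the CRDs on one revision and the deployment on another.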