sustainable-computing-io / kepler-operator

Kepler Operator
Apache License 2.0

.spec.volumes: field not declared in schema #381

Open jtaleric opened 8 months ago

jtaleric commented 8 months ago

Following the docs to deploy the operator, I ran:

$  make deploy OPERATOR_IMG=quay.io/sustainable_computing_io/kepler-operator:v1alpha1
./hack/tools.sh kustomize
   ✅ kustomize matching v3.8.7 already installed
{Version:kustomize/v3.8.7 GitCommit:ad092cc7a91c07fdf63a2e4b7f13fa588a39af4f BuildDate:2020-11-11T23:14:14Z GoOs:linux GoArch:amd64}
./hack/tools.sh controller-gen
   ✅ controller-gen matching Version: v0.12.1 already installed
Version: v0.12.1
/home/jtaleric/code/kepler-operator/tmp/bin/controller-gen object:headerFile="hack/boilerplate.go.txt" paths="./pkg/..."
/home/jtaleric/code/kepler-operator/tmp/bin/controller-gen rbac:roleName=manager-role crd webhook paths="./pkg/..." output:crd:artifacts:config=config/crd/bases
/home/jtaleric/code/kepler-operator/tmp/bin/kustomize build config/crd | \
    kubectl apply --server-side --force-conflicts -f -
customresourcedefinition.apiextensions.k8s.io/keplerinternals.kepler.system.sustainable.computing.io serverside-applied
customresourcedefinition.apiextensions.k8s.io/keplers.kepler.system.sustainable.computing.io serverside-applied
/home/jtaleric/code/kepler-operator/tmp/bin/kustomize build config/default | \
    sed  -e "s|<OPERATOR_IMG>|quay.io/sustainable_computing_io/kepler-operator:v1alpha1|g" \
         -e "s|<KEPLER_IMG>|quay.io/sustainable_computing_io/kepler:release-0.7.2|g" \
    | tee tmp/deploy.yaml | \
    kubectl apply --server-side --force-conflicts -f -
namespace/kepler-operator-system serverside-applied
customresourcedefinition.apiextensions.k8s.io/keplerinternals.kepler.system.sustainable.computing.io serverside-applied
customresourcedefinition.apiextensions.k8s.io/keplers.kepler.system.sustainable.computing.io serverside-applied
serviceaccount/kepler-operator-controller-manager serverside-applied
role.rbac.authorization.k8s.io/kepler-operator-leader-election-role serverside-applied
clusterrole.rbac.authorization.k8s.io/kepler-operator-manager-role serverside-applied
clusterrole.rbac.authorization.k8s.io/kepler-operator-metrics-reader serverside-applied
clusterrole.rbac.authorization.k8s.io/kepler-operator-proxy-role serverside-applied
rolebinding.rbac.authorization.k8s.io/kepler-operator-leader-election-rolebinding serverside-applied
clusterrolebinding.rbac.authorization.k8s.io/kepler-operator-manager-rolebinding serverside-applied
clusterrolebinding.rbac.authorization.k8s.io/kepler-operator-proxy-rolebinding serverside-applied
service/kepler-operator-controller-manager-metrics-service serverside-applied
service/kepler-operator-webhook-service serverside-applied
servicemonitor.monitoring.coreos.com/kepler-operator-controller-manager-metrics-monitor serverside-applied
mutatingwebhookconfiguration.admissionregistration.k8s.io/kepler-operator-mutating-webhook-configuration serverside-applied
validatingwebhookconfiguration.admissionregistration.k8s.io/kepler-operator-validating-webhook-configuration serverside-applied
Error from server: failed to create typed patch object (kepler-operator-system/kepler-operator-controller; apps/v1, Kind=Deployment): .spec.volumes: field not declared in schema
make: *** [Makefile:246: deploy] Error 1

jtaleric commented 8 months ago

The Deployment "kepler-operator-controller" is invalid: spec.template.spec.containers[0].volumeMounts[0].name: Not found: "cert"
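
Both errors are consistent with the same cause: in an apps/v1 Deployment, volumes belong under spec.template.spec, so a volumes key directly under spec is not part of the schema, and the "cert" volume mount then has no matching volume. A minimal sketch for checking where the rendered manifest places the key, assuming the tmp/deploy.yaml written by the tee step above and two-space YAML indentation; the helper name list_volumes_keys is hypothetical:

```shell
# Hypothetical helper: list every "volumes:" key in a rendered manifest,
# prefixed with its line number, so the indentation shows its nesting level.
list_volumes_keys() {
  grep -n '^ *volumes:' "$1"
}

# Usage (path taken from the make output above):
#   list_volumes_keys tmp/deploy.yaml
```

With two-space indentation, a key nested under spec.template.spec is indented six spaces, while one placed directly under spec is indented two.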

vprashar2929 commented 8 months ago

@jtaleric Can you try running make fresh instead of make deploy?

jtaleric commented 8 months ago

@vprashar2929 So, make fresh seems to deploy a kind cluster? I am trying to use make deploy install on an existing cluster.

Chrisys93 commented 6 months ago

I want to do the same thing on a bare-metal (VM-based) cluster deployed with kubespray, kubeadm.

I already have the Prometheus operator deployed as well.

elafontaine commented 5 months ago

Has anyone found a workaround yet?

vprashar2929 commented 5 months ago

@elafontaine We will address this after merging this. In the meantime, as a workaround, I suggest using operator-sdk to install the operator on k8s:

make tools # making sure that all the necessary tools are installed locally including operator-sdk

./tmp/bin/operator-sdk olm install --verbose --timeout 5m # installs olm on k8s cluster

./tmp/bin/operator-sdk run bundle quay.io/sustainable_computing_io/kepler-operator-bundle:0.13.0 --install-mode AllNamespaces  --skip-tls -n openshift-operators # deploys operator using latest released bundle

elafontaine commented 5 months ago

The last command did not execute correctly for me...

FATA[0021] Failed to run bundle: create catalog: error creating catalog source: namespaces "openshift-operators" not found
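
The error above says the namespace passed with -n does not exist, which is expected on a cluster that is not OpenShift. A minimal sketch of creating the target namespace before running the bundle, assuming kubectl is pointed at the right cluster; the helper name ensure_namespace is hypothetical:

```shell
# Hypothetical helper: create a namespace only if it does not exist yet.
ensure_namespace() {
  kubectl get namespace "$1" >/dev/null 2>&1 || kubectl create namespace "$1"
}

# Usage, before the operator-sdk run bundle command above:
#   ensure_namespace openshift-operators
```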

I was trying to avoid dealing with the OpenShift-related tooling, but to keep using the operators as they're very useful :).

Anyway, after removing the last part of the command (the namespace option) I was able to deploy, but the kepler-operator-controller never started up correctly.

./tmp/bin/operator-sdk run bundle quay.io/sustainable_computing_io/kepler-operator-bundle:0.13.0 --install-mode AllNamespaces
2024-07-01T18:50:27Z    ERROR   controller-runtime.source.EventHandler  if kind is a CRD, it should be installed before calling Start{"kind": "SecurityContextConstraints.security.openshift.io", "error": "failed to get restmapping: no matches for kind \"SecurityContextConstraints\" in group \"security.openshift.io\""}
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1.1
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.1/pkg/internal/source/kind.go:63
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func2
    /go/pkg/mod/k8s.io/apimachinery@v0.29.1/pkg/util/wait/loop.go:87
k8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext
    /go/pkg/mod/k8s.io/apimachinery@v0.29.1/pkg/util/wait/loop.go:88
k8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel
    /go/pkg/mod/k8s.io/apimachinery@v0.29.1/pkg/util/wait/poll.go:33
sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1
    /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.1/pkg/internal/source/kind.go:56

vprashar2929 commented 5 months ago

Ah, my bad, you also need the SCC CRD to be added. Can you also run these commands?

kubectl apply --force -f ./hack/crds
kubectl create namespace openshift-config-managed
kubectl create -f hack/monitoring/rbac
./tmp/bin/operator-sdk olm install --verbose --timeout 5m
./tmp/bin/operator-sdk run bundle quay.io/sustainable_computing_io/kepler-operator-bundle:0.13.0 --install-mode AllNamespaces  --skip-tls -n operators --timeout 5m

elafontaine commented 4 months ago

Sorry, I didn't see your message and tried to start from make cluster-up instead. That did not work either, as "make run" isn't passing.

2024-07-03T10:37:30-04:00   INFO    Wait completed, proceeding to shutdown the manager
2024-07-03T10:37:30-04:00   ERROR   setup   problem running manager {"error": "open /var/folders/p8/5wb68xbn4w9d4zzqn41gc9yw0000gp/T/k8s-webhook-server/serving-certs/tls.crt: no such file or directory"}
main.main
    /Users/elafonta/kepler/kepler-operator/cmd/manager/main.go:195
runtime.main
    /Users/elafonta/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.21.7.darwin-arm64/src/runtime/proc.go:267
exit status 1
make: *** [run] Error 1

I think the error may be related as it's a problem with the TLS volume again...

vprashar2929 commented 4 months ago

That's correct: make run will not work by default, because it looks for webhook certs that are not present on the cluster. In this case, I suggest using make run ENABLE_WEBHOOKS=false
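
The path in the error matches controller-runtime's default webhook certificate location: a k8s-webhook-server/serving-certs directory under the system temp directory, holding tls.crt and tls.key. A minimal sketch that picks the make invocation depending on whether those files exist, assuming TMPDIR (or /tmp) is the temp directory the manager uses; the helper name webhook_certs_present is hypothetical:

```shell
# Hypothetical helper: succeed only if both serving-cert files exist in the
# directory given as the first argument.
webhook_certs_present() {
  [ -f "$1/tls.crt" ] && [ -f "$1/tls.key" ]
}

# Usage: fall back to running without webhooks when the certs are missing.
#   cert_dir="${TMPDIR:-/tmp}/k8s-webhook-server/serving-certs"
#   if webhook_certs_present "$cert_dir"; then
#     make run
#   else
#     make run ENABLE_WEBHOOKS=false
#   fi
```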

elafontaine commented 3 months ago

OK, I've kept trying to deploy the operator. It does require cert-manager, and only make run ENABLE_WEBHOOKS=false seems to work. The problem is that it keeps a local process running, and I would much prefer to run a pre-bundled operator image. I'm going to try to figure out a way to keep working on my side, but I would very much prefer to be able to run "make cluster-up" and get a kind cluster with the operator already deployed.

Can you help me understand why certificate signing is a requirement for the Kepler operator?

vprashar2929 commented 2 months ago

#418 now addresses this issue. You can now run the Kepler operator on a vanilla k8s cluster: https://github.com/sustainable-computing-io/kepler-operator?tab=readme-ov-file#run-kepler-operator-on-vanilla-kubernetes

@elafontaine @jtaleric can you try deploying with it? Again, apologies for the late response 😅

vprashar2929 commented 2 months ago

> prefer to be able to run "make cluster-up" and get a kind cluster with the operator already deployed.

@elafontaine Yes, you can use make fresh, which creates a kind cluster with all the dependencies and also builds and deploys the operator on the cluster.

> Can you help me understand why certificate signing is a requirement for the Kepler operator?

The operator makes use of webhooks, which need to communicate securely over HTTPS. That is why valid TLS certs are needed to establish the connection.
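
One way to see this on a cluster is to check whether the webhook configuration carries a CA bundle, which the API server uses to verify the webhook's serving certificate. A minimal sketch, assuming the validating webhook configuration name shown in the make deploy output earlier in the thread; the helper name webhook_ca_bundle is hypothetical:

```shell
# Hypothetical helper: print the CA bundle(s) of a validating webhook
# configuration. Empty output suggests no CA has been injected yet.
webhook_ca_bundle() {
  kubectl get validatingwebhookconfiguration "$1" \
    -o jsonpath='{.webhooks[*].clientConfig.caBundle}'
}

# Usage:
#   webhook_ca_bundle kepler-operator-validating-webhook-configuration
```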