Closed: arpan57 closed this issue 1 month ago.
Given that you're on a Mac, I'd try a default minikube install. And when you're installing the operator, try using the dist file:
kubectl apply -f https://raw.githubusercontent.com/hyperspike/valkey-operator/main/dist/install.yaml
I deleted all the minikube profiles and started from scratch to avoid any leftovers.
I ran kubectl apply -f https://raw.githubusercontent.com/hyperspike/valkey-operator/main/dist/install.yaml
I can see the valkey-operator-system namespace and its pod running successfully.
The valkey-sample-n pods are stuck in a crash loop (CrashLoopBackOff).
k describe pod valkey-sample-0
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 13m default-scheduler 0/3 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling.
Normal Scheduled 13m default-scheduler Successfully assigned default/valkey-sample-0 to north-m02
Normal Pulling 13m kubelet Pulling image "docker.io/bitnami/valkey-cluster:7.2.5-debian-12-r4"
Normal Pulled 13m kubelet Successfully pulled image "docker.io/bitnami/valkey-cluster:7.2.5-debian-12-r4" in 35.808s (35.808s including waiting). Image size: 172964760 bytes.
Warning Unhealthy 12m (x5 over 12m) kubelet Liveness probe failed: Could not connect to Valkey at localhost:6379: Connection refused
Normal Killing 12m kubelet Container valkey failed liveness probe, will be restarted
Normal Created 12m (x2 over 13m) kubelet Created container valkey
Normal Started 12m (x2 over 13m) kubelet Started container valkey
Warning Unhealthy 12m kubelet Readiness probe failed:
Normal Pulled 12m kubelet Container image "docker.io/bitnami/valkey-cluster:7.2.5-debian-12-r4" already present on machine
Warning Unhealthy 8m37s (x57 over 12m) kubelet Readiness probe failed: Could not connect to Valkey at localhost:6379: Connection refused
Warning BackOff 3m27s (x17 over 7m32s) kubelet Back-off restarting failed container valkey in pod valkey-sample-0_default(a9b33087-3f1f-4dda-88c8-005bc236d001)
This is most likely due to the minikube storage provisioner not supporting non-root access: https://github.com/kubernetes/minikube/issues/1990
You can try applying the storage hack in scripts/:
kubectl apply -f scripts/minikube-pvc-hack.yaml
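Before (or after) applying the hack, it can help to confirm that the PVCs really are the problem. These are standard kubectl commands; the PVC name placeholder is whatever the operator generated for your cluster:

```
# A PVC stuck in Pending with no bound volume matches the FailedScheduling event above.
kubectl get pvc -A
# Check that a default StorageClass exists and what provisioner backs it.
kubectl get storageclass
# The Events section here usually names the provisioning failure.
kubectl describe pvc <pvc-name>
```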
make minikube should now work much better on macOS.
Container images should now download properly (they're public now), and the storage hack is part of the startup script.
make minikube
kubectl apply -f https://github.com/hyperspike/valkey-operator/dist/install.yaml
I have to say - this time it was much smoother.
By https://github.com/hyperspike/valkey-operator/dist/install.yaml, did you mean:
https://github.com/hyperspike/valkey-operator/blob/main/dist/install.yaml
? (The resource at the mentioned link couldn't be found.) Instead I executed:
kubectl apply -f /path/to/valkey-operator/dist/install.yaml
which seemed to work.
After that, I ran the samples:
kubectl apply -k config/samples/
However, the containers don't start completely.
> k get pods
NAME READY STATUS RESTARTS AGE
prometheus-operator-7b87d59796-f95zc 1/1 Running 0 18m
prometheus-prometheus-0 2/2 Running 0 17m
valkey-sample-0 0/1 Running 4 (50s ago) 5m24s
valkey-sample-1 0/1 Running 5 (5s ago) 5m24s
valkey-sample-2 0/1 Running 4 (35s ago) 5m24s
The logs from an example pod:
❯ k logs valkey-sample-0 -f
valkey-cluster 12:37:01.53 INFO ==>
valkey-cluster 12:37:01.53 INFO ==> Welcome to the Bitnami valkey-cluster container
valkey-cluster 12:37:01.53 INFO ==> Subscribe to project updates by watching https://github.com/bitnami/containers
valkey-cluster 12:37:01.53 INFO ==> Submit issues and feature requests at https://github.com/bitnami/containers/issues
valkey-cluster 12:37:01.53 INFO ==> Upgrade to Tanzu Application Catalog for production environments to access custom-configured and pre-packaged software components. Gain enhanced features, including Software Bill of Materials (SBOM), CVE scan result reports, and VEX documents. To learn more, visit https://bitnami.com/enterprise
valkey-cluster 12:37:01.53 INFO ==>
valkey-cluster 12:37:01.54 INFO ==> ** Starting Valkey setup **
valkey-cluster 12:37:01.55 INFO ==> Initializing Valkey
valkey-cluster 12:37:01.55 INFO ==> Setting Valkey config file
The Events section:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 6m50s default-scheduler 0/1 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.
Normal Scheduled 6m48s default-scheduler Successfully assigned default/valkey-sample-1 to north
Warning FailedMount 6m47s kubelet MountVolume.SetUp failed for volume "scripts" : failed to sync configmap cache: timed out waiting for the condition
Warning FailedMount 6m47s kubelet MountVolume.SetUp failed for volume "valkey-conf" : failed to sync configmap cache: timed out waiting for the condition
Normal Pulling 6m46s kubelet Pulling image "docker.io/bitnami/valkey-cluster:7.2.6-debian-12-r0"
Normal Pulled 6m8s kubelet Successfully pulled image "docker.io/bitnami/valkey-cluster:7.2.6-debian-12-r0" in 37.931s (37.931s including waiting). Image size: 172961168 bytes.
Normal Created 6m8s kubelet Created container valkey
Normal Started 6m8s kubelet Started container valkey
Warning Unhealthy 5m41s (x5 over 6m1s) kubelet Liveness probe failed: Could not connect to Valkey at localhost:6379: Connection refused
Normal Killing 5m41s kubelet Container valkey failed liveness probe, will be restarted
Normal Pulled 5m11s kubelet Container image "docker.io/bitnami/valkey-cluster:7.2.6-debian-12-r0" already present on machine
Warning Unhealthy 106s (x57 over 6m1s) kubelet Readiness probe failed: Could not connect to Valkey at localhost:6379: Connection refused
Interesting, what's the output of:
k logs valkey-sample-1 -f
It's:
❯ k logs valkey-sample-1 -f
valkey-cluster 21:33:08.52 INFO ==>
valkey-cluster 21:33:08.52 INFO ==> Welcome to the Bitnami valkey-cluster container
valkey-cluster 21:33:08.52 INFO ==> Subscribe to project updates by watching https://github.com/bitnami/containers
valkey-cluster 21:33:08.52 INFO ==> Submit issues and feature requests at https://github.com/bitnami/containers/issues
valkey-cluster 21:33:08.52 INFO ==> Upgrade to Tanzu Application Catalog for production environments to access custom-configured and pre-packaged software components. Gain enhanced features, including Software Bill of Materials (SBOM), CVE scan result reports, and VEX documents. To learn more, visit https://bitnami.com/enterprise
valkey-cluster 21:33:08.52 INFO ==>
valkey-cluster 21:33:08.52 INFO ==> ** Starting Valkey setup **
valkey-cluster 21:33:08.54 INFO ==> Initializing Valkey
valkey-cluster 21:33:08.57 INFO ==> Setting Valkey config file
Hmmm, the PV hack may not work on Macs. It might be a good time to try to build out the root-ful mode.
It's not clear to me how I would build in root-ful mode.
I did a little research and went with an initContainer to set the PVC file permissions. The changes have been released in v0.0.8 and can be used like so:
apiVersion: hyperspike.io/v1
kind: Valkey
metadata:
  name: keyval
spec:
  volumePermissions: true
To leverage this in an existing deployment, you will need to delete all Valkey deployments and upgrade the controller:
kubectl apply -f https://raw.githubusercontent.com/hyperspike/valkey-operator/main/dist/install.yaml
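For reference, the initContainer approach generally looks something like the sketch below. This is an illustration of the technique, not necessarily the operator's exact generated spec; the image, volume name, and paths are assumptions (1001 is the usual Bitnami non-root UID):

```yaml
# Illustrative only: an initContainer that fixes PVC ownership before the
# main Valkey container starts. The operator's generated spec may differ.
initContainers:
  - name: volume-permissions
    image: busybox:1.36                 # assumed helper image
    command: ["sh", "-c", "chown -R 1001:1001 /bitnami"]
    securityContext:
      runAsUser: 0                      # must run as root to chown the volume
    volumeMounts:
      - name: valkey-data               # assumed volume name
        mountPath: /bitnami
```

The same effect can sometimes be had with a pod-level securityContext fsGroup, but that depends on the volume plugin honoring it, which the minikube hostPath provisioner historically has not.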
I pulled the latest from git.
Deleted the north minikube profile: minikube delete -p north
Noticed that valkey.yaml in the repo root looks similar to what you mentioned:
❯ cat valkey.yaml
apiVersion: hyperspike.io/v1
kind: Valkey
metadata:
  labels:
    app.kubernetes.io/name: valkey-operator
    app.kubernetes.io/managed-by: kustomize
  name: keyval
spec:
  volumePermissions: true
Also ran kubectl apply -f https://raw.githubusercontent.com/hyperspike/valkey-operator/main/dist/install.yaml
I am still facing the same issue on this machine. The logs are the same.
From events:
Normal Pulling 68m kubelet Pulling image "docker.io/bitnami/valkey-cluster:7.2.6-debian-12-r0"
Normal Pulled 67m kubelet Successfully pulled image "docker.io/bitnami/valkey-cluster:7.2.6-debian-12-r0" in 11.405s (59.853s including waiting). Image size: 172961168 bytes.
Normal Created 67m kubelet Created container valkey
Normal Started 67m kubelet Started container valkey
Normal Killing 66m kubelet Container valkey failed liveness probe, will be restarted
Warning Unhealthy 7m21s (x17 over 67m) kubelet Liveness probe failed: Could not connect to Valkey at localhost:6379: Connection refused
Warning Unhealthy 2m20s (x79 over 67m) kubelet Readiness probe failed: Could not connect to Valkey at localhost:6379: Connection refused
From the controller logs, if it helps:
2024-08-07T17:09:23Z ERROR failed to create valkey client {"controller": "valkey", "controllerGroup": "hyperspike.io", "controllerKind": "Valkey", "Valkey": {"name":"keyval","namespace":"default"}, "namespace": "default", "name": "keyval", "reconcileID": "bda25c86-fadd-4b6b-ad94-e649576edfc5", "valkey": "keyval", "namespace": "default", "error": "dial tcp: lookup keyval-0.keyval-headless.default.svc: i/o timeout"}
hyperspike.io/valkey-operator/internal/controller.(*ValkeyReconciler).balanceNodes
internal/controller/valkey_controller.go:702
hyperspike.io/valkey-operator/internal/controller.(*ValkeyReconciler).Reconcile
internal/controller/valkey_controller.go:160
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.4/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.4/pkg/internal/controller/controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.4/pkg/internal/controller/controller.go:261
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
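One way to check whether the headless-service DNS name in that error resolves at all from inside the cluster (illustrative; the busybox image is just a convenient choice for a throwaway pod):

```
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 -- \
  nslookup keyval-0.keyval-headless.default.svc.cluster.local
```

If this times out too, the problem is cluster DNS (or the pod never becoming ready, so it is never added to the headless service's endpoints) rather than the operator itself.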
Hmmmm, I wonder if the liveness and readiness probes are simply expiring before the daemon comes up. Can you try bumping the failure threshold from 5 to 25?
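For anyone following along, a bumped probe configuration in a container spec would look roughly like this. Field names follow the standard Kubernetes container spec; the probe type shown is an assumption (the operator may use an exec probe that calls the Valkey CLI), and the values are the ones suggested above:

```yaml
# Illustrative sketch, not the operator's exact generated probes.
livenessProbe:
  tcpSocket:              # assumed probe type
    port: 6379
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 25
readinessProbe:
  tcpSocket:
    port: 6379
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 25
```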
Tried the following for both the liveness and readiness probes:
initialDelaySeconds: 30
failureThreshold: 25
No change in the results.
@arpan57 are we good to close?
I am going to try it on a k8s cluster instead of minikube. TBH it has worked on one of my personal MacBooks, but not the other. I would shelve the issue for now. Thank you for all the follow-up.
Thank you for the initiative first of all.
Here are the steps I have followed.
git clone https://github.com/hyperspike/valkey-operator.git
cd valkey-operator
make docker-build
make install
(customresourcedefinition.apiextensions.k8s.io/valkeys.hyperspike.io created)
Stopped my existing minikube session.
$ make minikube
It created a new kubectx, north.
$ kubectx north
It did create the valkey-operator-system namespace and three pods. However, all three pods do not run; they are stuck with FailedScheduling events.