RedisLabs / redis-enterprise-k8s-docs


Can't start cluster in GKE Autopilot cluster #268

Open ganochenkodg opened 10 months ago

ganochenkodg commented 10 months ago

I ran into some issues when trying to start a Redis Enterprise cluster in a Google GKE Autopilot cluster.

At first, after applying the bundle (version 7.2.4.2), I got these errors:

{"level":"info","ts":"2023-09-11T07:50:02.062Z","logger":"controller_redisenterprisecluster","msg":"Failed to create stateful set: admission webhook \"warden-validating.common-webhooks.networking.gke.io\" denied the request: GKE Warden rejected the request because it violates one or more constraints.\nViolations details: {\"[denied by autogke-default-linux-capabilities]\":[\"linux capability 'SYS_RESOURCE' on container 'redis-enterprise-node' not allowed; Autopilot only allows the capabilities: 'AUDIT_WRITE,CHOWN,DAC_OVERRIDE,FOWNER,FSETID,KILL,MKNOD,NET_BIND_SERVICE,NET_RAW,SETFCAP,SETGID,SETPCAP,SETUID,SYS_CHROOT,SYS_PTRACE'.\"]}\nRequested by user: 'system:serviceaccount:rec-ns:redis-enterprise-operator', groups: 'system:serviceaccounts,system:serviceaccounts:rec-ns,system:authenticated'."}
{"level":"info","ts":"2023-09-11T07:50:02.062Z","logger":"controller_redisenterprisecluster","msg":"cluster pending creation: error while deploying REC: admission webhook \"warden-validating.common-webhooks.networking.gke.io\" denied the request: GKE Warden rejected the request because it violates one or more constraints.\nViolations details: {\"[denied by autogke-default-linux-capabilities]\":[\"linux capability 'SYS_RESOURCE' on container 'redis-enterprise-node' not allowed; Autopilot only allows the capabilities: 'AUDIT_WRITE,CHOWN,DAC_OVERRIDE,FOWNER,FSETID,KILL,MKNOD,NET_BIND_SERVICE,NET_RAW,SETFCAP,SETGID,SETPCAP,SETUID,SYS_CHROOT,SYS_PTRACE'.\"]}\nRequested by user: 'system:serviceaccount:rec-ns:redis-enterprise-operator', groups: 'system:serviceaccounts,system:serviceaccounts:rec-ns,system:authenticated'.","Request.Namespace":"rec-ns","Request.Name":"gke-rec"}
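The rejection is easier to read once the JSON log line is decoded. Below is a minimal Python sketch that pulls out the denied capability, the container, and the list Autopilot allows. The function name `parse_violation` and the regular expression are my own illustration, not part of the operator or of GKE, and the pattern assumes the message wording stays exactly as in the log above.

```python
import json
import re

# The first operator log line from above, split into pieces for readability.
LOG_LINE = (
    r'{"level":"info","ts":"2023-09-11T07:50:02.062Z","logger":"controller_redisenterprisecluster",'
    r'"msg":"Failed to create stateful set: admission webhook '
    r'\"warden-validating.common-webhooks.networking.gke.io\" denied the request: '
    r'GKE Warden rejected the request because it violates one or more constraints.'
    r'\nViolations details: {\"[denied by autogke-default-linux-capabilities]\":'
    r"[\"linux capability 'SYS_RESOURCE' on container 'redis-enterprise-node' not allowed; "
    r"Autopilot only allows the capabilities: 'AUDIT_WRITE,CHOWN,DAC_OVERRIDE,FOWNER,FSETID,KILL,MKNOD,"
    r"NET_BIND_SERVICE,NET_RAW,SETFCAP,SETGID,SETPCAP,SETUID,SYS_CHROOT,SYS_PTRACE'.\"]}"
    r"\nRequested by user: 'system:serviceaccount:rec-ns:redis-enterprise-operator', "
    r"groups: 'system:serviceaccounts,system:serviceaccounts:rec-ns,system:authenticated'."
    r'"}'
)

# Hypothetical pattern, written against the exact wording of the message above.
VIOLATION_RE = re.compile(
    r"linux capability '(?P<cap>[A-Z_]+)' on container '(?P<container>[^']+)' not allowed; "
    r"Autopilot only allows the capabilities: '(?P<allowed>[A-Z_,]+)'"
)


def parse_violation(log_line):
    """Return the denied capability, container, and allowed list, or None if no match."""
    msg = json.loads(log_line)["msg"]  # decode the JSON escapes (\" and \n) first
    match = VIOLATION_RE.search(msg)
    if match is None:
        return None
    return {
        "capability": match.group("cap"),
        "container": match.group("container"),
        "allowed": match.group("allowed").split(","),
    }


if __name__ == "__main__":
    print(parse_violation(LOG_LINE))
```

For the line above this reports `SYS_RESOURCE` on the `redis-enterprise-node` container, and `SYS_RESOURCE` is not in the allowed list.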

Then I tried to apply the simplest manifest:

apiVersion: "app.redislabs.com/v1"
kind: "RedisEnterpriseCluster"
metadata:
  name: gke-rec
spec:
  nodes: 3

And I see the following errors:

 {"level":"info","ts":"2023-09-11T07:53:39.779Z","logger":"controller_redisenterprisecluster","msg":"Failed to create REC service: Service \"gke-rec\" is invalid: spec.clusterIPs: Invalid value: []string{\"10.52.6.82\"}: failed to allocate IP 10.52.6.82: provided IP is already allocated"}
{"level":"info","ts":"2023-09-11T07:53:39.779Z","logger":"controller_redisenterprisecluster","msg":"cluster pending creation: error while deploying REC: Service \"gke-rec\" is invalid: spec.clusterIPs: Invalid value: []string{\"10.52.6.82\"}: failed to allocate IP 10.52.6.82: provided IP is already allocated","Request.Namespace":"rec-ns","Request.Name":"gke-rec"}

If I delete the service manually (as the changelog suggests), I get the same errors, just with a new IP address:

{"level":"info","ts":"2023-09-11T07:55:29.432Z","logger":"controller_redisenterprisecluster","msg":"Failed to create REC service: Service \"gke-rec\" is invalid: spec.clusterIPs: Invalid value: []string{\"10.52.13.195\"}: failed to allocate IP 10.52.13.195: provided IP is already allocated"}
{"level":"info","ts":"2023-09-11T07:55:29.432Z","logger":"controller_redisenterprisecluster","msg":"cluster pending creation: error while deploying REC: Service \"gke-rec\" is invalid: spec.clusterIPs: Invalid value: []string{\"10.52.13.195\"}: failed to allocate IP 10.52.13.195: provided IP is already allocated","Request.Namespace":"rec-ns","Request.Name":"gke-rec"}

What can I do to fix this behavior, or when can I expect it to work?