minio / operator

Simple Kubernetes Operator for MinIO clusters
https://min.io/docs/minio/kubernetes/upstream/index.html
GNU Affero General Public License v3.0

Tenant MinIO instance changing svc ports #968

Closed jiribaloun closed 2 years ago

jiribaloun commented 2 years ago

Currently, I'm running a small K8s cluster (deployed using kubekey) in a tenant on OpenStack with LBaaS integrated. When I expose the tenant MinIO endpoint (using kind: LoadBalancer), the service is exposed, but the internal service port (NodePort) changes every minute. That triggers an OpenStack LBaaS reconfiguration and, consequently, a service outage.

[root@czbrn-skym005 ~]# kubectl get svc -n sky-minio -o wide --watch
NAME                       TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)          AGE     SELECTOR
minio                      LoadBalancer   10.233.13.44    10.170.163.151   443:32114/TCP    2m53s   v1.min.io/tenant=sky-minio
sky-minio-console          LoadBalancer   10.233.49.155   <pending>        9443:31743/TCP   2m52s   v1.min.io/tenant=sky-minio
sky-minio-hl               ClusterIP      None            <none>           9000/TCP         2m52s   v1.min.io/tenant=sky-minio
sky-minio-log-hl-svc       ClusterIP      None            <none>           5432/TCP         57s     v1.min.io/log-pg=sky-minio-log
sky-minio-log-search-api   ClusterIP      10.233.24.243   <none>           8080/TCP         57s     v1.min.io/logsearchapi=sky-minio-log-search-api
minio                      LoadBalancer   10.233.13.44    10.170.163.151   443:30207/TCP    2m56s   v1.min.io/tenant=sky-minio
sky-minio-console          LoadBalancer   10.233.49.155   <pending>        9443:32765/TCP   2m55s   v1.min.io/tenant=sky-minio
minio                      LoadBalancer   10.233.13.44    10.170.163.151   443:31705/TCP    3m56s   v1.min.io/tenant=sky-minio
sky-minio-console          LoadBalancer   10.233.49.155   <pending>        9443:30923/TCP   3m55s   v1.min.io/tenant=sky-minio
minio                      LoadBalancer   10.233.13.44    10.170.163.151   443:30725/TCP    4m56s   v1.min.io/tenant=sky-minio
sky-minio-console          LoadBalancer   10.233.49.155   <pending>        9443:31363/TCP   4m55s   v1.min.io/tenant=sky-minio
minio                      LoadBalancer   10.233.13.44    10.170.163.151   443:30734/TCP    5m57s   v1.min.io/tenant=sky-minio
sky-minio-console          LoadBalancer   10.233.49.155   <pending>        9443:31318/TCP   5m56s   v1.min.io/tenant=sky-minio

Expected Behavior

I'd expect that once the tenant cluster is deployed and the internal service port is specified, it remains configured until the tenant is destroyed.

Current Behavior

The internal SVC port in the tenant cluster changes every minute, which causes a load balancer reconfiguration.

Possible Solution

Steps to Reproduce (for bugs)

  1. Using kubekey for K8s deployment: K8s v1.19.9, Docker 20.10.8, CentOS 7.9.2009

  2. kubectl krew version:

     OPTION            VALUE
     GitTag            v0.4.2
     GitCommit         6fcdb79
     IndexURI          https://github.com/kubernetes-sigs/krew-index.git
     BasePath          /root/.krew
     IndexPath         /root/.krew/index/default
     InstallPath       /root/.krew/store
     BinPath           /root/.krew/bin
     DetectedPlatform  linux/amd64

  3. Latest version of minio-operator (image: minio/operator:v4.4.3) and the latest Console (minio/console:v0.13.2)

  4. Latest version of tenant image (minio/minio:RELEASE.2022-01-08T03-11-54Z)

Context

Regression

Your Environment

dvaldivia commented 2 years ago

We don't support changing the ports of the default services; for that, I'd recommend you set up additional services with the same selector as the default services, as sketched below.
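A minimal sketch of such an additional service, assuming the tenant pods carry the v1.min.io/tenant=sky-minio label (as the selectors in the output above suggest) and MinIO listens on port 9000 (the port the sky-minio-hl service targets); the service name and port numbers here are illustrative:

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: minio-external            # hypothetical name, not managed by the operator
  namespace: sky-minio
spec:
  type: LoadBalancer
  selector:
    v1.min.io/tenant: sky-minio   # same selector as the default "minio" service
  ports:
    - name: https-minio
      port: 443                   # port exposed by the load balancer
      targetPort: 9000            # MinIO container port, per sky-minio-hl above
EOF

Because the operator does not own this service, its reconcile loop should leave the assigned NodePort alone.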

What could be happening is that we set the service to type LoadBalancer every minute, and on your setup that causes the port to be reconfigured. I'll try to replicate this on my kind setup.

dvaldivia commented 2 years ago

Is anything adding annotations or labels to the console service, @jiribaloun? If that is the case, the operator would be reverting the change every minute.
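One way to check (a sketch; the service and namespace names are taken from the report above, adjust to your deployment):

# Dump the console service's annotations and labels to spot anything an
# external controller (e.g. the cloud LB integration) might be adding.
kubectl -n sky-minio get svc sky-minio-console \
  -o jsonpath='{.metadata.annotations}{"\n"}{.metadata.labels}{"\n"}'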

jiribaloun commented 2 years ago

Hi Daniel,

Nothing in particular. It’s the default installation of the operator.


thijs-s commented 2 years ago

We have noticed the same behaviour after updating the MinIO operator from v4.0.x to v4.4.3. The operator changes the NodePort of the console service within the tenant namespace, but it happens more irregularly than every minute.

dvaldivia commented 2 years ago

@thijs-s are you using upstream Kubernetes? What version?

dvaldivia commented 2 years ago

I do observe the behavior; I'll look for a fix:

➜ k -n ns-1 get svc --watch
NAME                           TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
minio                          LoadBalancer   10.96.232.116   <pending>     443:32469/TCP    3d18h
new-tenant-console             LoadBalancer   10.96.196.184   <pending>     9443:32683/TCP   3d18h
new-tenant-hl                  ClusterIP      None            <none>        9000/TCP         3d18h
new-tenant-log-hl-svc          ClusterIP      None            <none>        5432/TCP         3d18h
new-tenant-log-search-api      ClusterIP      10.96.218.169   <none>        8080/TCP         3d18h
new-tenant-prometheus-hl-svc   ClusterIP      None            <none>        9090/TCP         3d18h
minio                          LoadBalancer   10.96.232.116   <pending>     443:31785/TCP    3d18h
new-tenant-console             LoadBalancer   10.96.196.184   <pending>     9443:30940/TCP   3d18h
minio                          LoadBalancer   10.96.232.116   <pending>     443:30659/TCP    3d18h
new-tenant-console             LoadBalancer   10.96.196.184   <pending>     9443:30604/TCP   3d18h
minio                          LoadBalancer   10.96.232.116   <pending>     443:32586/TCP    3d18h
new-tenant-console             LoadBalancer   10.96.196.184   <pending>     9443:32304/TCP   3d18h
minio                          LoadBalancer   10.96.232.116   <pending>     443:32659/TCP    3d18h
new-tenant-console             LoadBalancer   10.96.196.184   <pending>     9443:32561/TCP   3d18h
minio                          LoadBalancer   10.96.232.116   <pending>     443:30723/TCP    3d18h
new-tenant-console             LoadBalancer   10.96.196.184   <pending>     9443:31418/TCP   3d18h
minio                          LoadBalancer   10.96.232.116   <pending>     443:31026/TCP    3d18h
new-tenant-console             LoadBalancer   10.96.196.184   <pending>     9443:32538/TCP   3d18h
minio                          LoadBalancer   10.96.232.116   <pending>     443:31465/TCP    3d18h
new-tenant-console             LoadBalancer   10.96.196.184   <pending>     9443:31966/TCP   3d18h
minio                          LoadBalancer   10.96.232.116   <pending>     443:31668/TCP    3d18h
new-tenant-console             LoadBalancer   10.96.196.184   <pending>     9443:30823/TCP   3d18h
minio                          LoadBalancer   10.96.232.116   <pending>     443:31660/TCP    3d18h
new-tenant-console             LoadBalancer   10.96.196.184   <pending>     9443:32725/TCP   3d18h
minio                          LoadBalancer   10.96.232.116   <pending>     443:30004/TCP    3d18h
new-tenant-console             LoadBalancer   10.96.196.184   <pending>     9443:32127/TCP   3d18h

jiribaloun commented 2 years ago

Hello there,

Has the fix for bug #968 been merged upstream? I've deployed the minio operator, ver. 4.4.4, on the same environment as above. Then I got these errors:

E0131 12:30:52.449407 1 leaderelection.go:325] error retrieving resource lock minio-operator/minio-operator-lock: leases.coordination.k8s.io "minio-operator-lock" is forbidden: User "system:serviceaccount:minio-operator:minio-operator" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "minio-operator"
E0131 12:30:59.846251 1 leaderelection.go:325] error retrieving resource lock minio-operator/minio-operator-lock: leases.coordination.k8s.io "minio-operator-lock" is forbidden: User "system:serviceaccount:minio-operator:minio-operator" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "minio-operator"
E0131 12:31:10.810290 1 leaderelection.go:325] error retrieving resource lock minio-operator/minio-operator-lock: leases.coordination.k8s.io "minio-operator-lock" is forbidden: User "system:serviceaccount:minio-operator:minio-operator" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "minio-operator"
E0131 12:31:21.652532 1 leaderelection.go:325] error retrieving resource lock minio-operator/minio-operator-lock: leases.coordination.k8s.io "minio-operator-lock" is forbidden: User "system:serviceaccount:minio-operator:minio-operator" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "minio-operator"
E0131 12:31:27.069229 1 leaderelection.go:325] error retrieving resource lock minio-operator/minio-operator-lock: leases.coordination.k8s.io "minio-operator-lock" is forbidden: User "system:serviceaccount:minio-operator:minio-operator" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "minio-operator"
E0131 12:31:32.660209 1 leaderelection.go:329] error initially creating leader election record: leases.coordination.k8s.io is forbidden: User "system:serviceaccount:minio-operator:minio-operator" cannot create resource "leases" in API group "coordination.k8s.io" in the namespace "minio-operator"
E0131 12:31:40.938377 1 leaderelection.go:329] error initially creating leader election record: leases.coordination.k8s.io is forbidden: User "system:serviceaccount:minio-operator:minio-operator" cannot create resource "leases" in API group "coordination.k8s.io" in the namespace "minio-operator"
E0131 12:31:47.265964 1 leaderelection.go:329] error initially creating leader election record: leases.coordination.k8s.io is forbidden: User "system:serviceaccount:minio-operator:minio-operator" cannot create resource "leases" in API group "coordination.k8s.io" in the namespace "minio-operator"
E0131 12:31:56.374432 1 leaderelection.go:329] error initially creating leader election record: leases.coordination.k8s.io is forbidden: User "system:serviceaccount:minio-operator:minio-operator" cannot create resource "leases" in API group "coordination.k8s.io" in the namespace "minio-operator"
E0131 12:32:05.044948 1 leaderelection.go:329] error initially creating leader election record: leases.coordination.k8s.io is forbidden: User "system:serviceaccount:minio-operator:minio-operator" cannot create resource "leases" in API group "coordination.k8s.io" in the namespace "minio-operator"
E0131 12:32:12.409416 1 leaderelection.go:329] error initially creating leader election record: leases.coordination.k8s.io is forbidden: User "system:serviceaccount:minio-operator:minio-operator" cannot create resource "leases" in API group "coordination.k8s.io" in the namespace "minio-operator"
E0131 12:32:20.520166 1 leaderelection.go:329] error initially creating leader election record: leases.coordination.k8s.io is forbidden: User "system:serviceaccount:minio-operator:minio-operator" cannot create resource "leases" in API group "coordination.k8s.io" in the namespace "minio-operator"
E0131 12:32:29.210569 1 leaderelection.go:329] error initially creating leader election record: leases.coordination.k8s.io is forbidden: User "system:serviceaccount:minio-operator:minio-operator" cannot create resource "leases" in API group "coordination.k8s.io" in the namespace "minio-operator"
I0131 12:32:39.073340 1 leaderelection.go:253] successfully acquired lease minio-operator/minio-operator-lock
I0131 12:32:39.073661 1 main-controller.go:458] minio-operator-74c7cd57bc-bhwnr: I've become the leader
I0131 12:32:39.073897 1 main-controller.go:373] Waiting for API to start
E0131 12:32:39.076797 1 main-controller.go:472] failed to patch operator leader pod: pods "minio-operator-74c7cd57bc-bhwnr" is forbidden: User "system:serviceaccount:minio-operator:minio-operator" cannot patch resource "pods" in API group "" in the namespace "minio-operator"

The Tenant deployed successfully once the ClusterRole was modified.
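Something along the following lines addresses those errors (a sketch; the object names are illustrative rather than the operator's shipped manifests, and the rules are inferred from the "forbidden" messages in the log):

kubectl apply -f - <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: minio-operator-extra          # illustrative name
rules:
  # Leader election uses Lease objects in coordination.k8s.io
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["get", "create", "update"]
  # The operator also patches its own leader pod
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: minio-operator-extra
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: minio-operator-extra
subjects:
  - kind: ServiceAccount
    name: minio-operator
    namespace: minio-operator
EOF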

However, when the tenant service is changed to LoadBalancer, these errors pop up:

I0131 13:07:34.629408 1 status.go:180] Hit conflict issue, getting latest version of tenant
I0131 13:07:44.904638 1 status.go:180] Hit conflict issue, getting latest version of tenant
I0131 13:08:00.355087 1 status.go:180] Hit conflict issue, getting latest version of tenant
I0131 13:08:15.696067 1 status.go:180] Hit conflict issue, getting latest version of tenant
I0131 13:08:30.760712 1 minio-services.go:66] Services don't match: Service type doesn't match
I0131 13:08:31.209631 1 status.go:180] Hit conflict issue, getting latest version of tenant
I0131 13:08:35.874300 1 minio-services.go:66] Services don't match: Service type doesn't match
I0131 13:08:41.234319 1 minio-services.go:66] Services don't match: Service type doesn't match
I0131 13:08:41.614063 1 status.go:180] Hit conflict issue, getting latest version of tenant
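For context, the "Service type doesn't match" lines suggest the operator's reconcile loop compares the live service against the spec it derives from the Tenant and reverts any drift, so a direct edit like the following (presumably how the service was switched to LoadBalancer here) would be undone on each pass:

# Editing the operator-managed service in place; the operator detects the
# drift ("Services don't match") and reverts it on the next reconcile.
kubectl -n sky-minio patch svc minio -p '{"spec":{"type":"LoadBalancer"}}'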

Also, the port now changes even more frequently:

$ kubectl get svc -n sky-minio -o wide --watch
NAME                TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)         AGE   SELECTOR
minio               LoadBalancer   10.233.31.70    <pending>        443:30859/TCP   35m   v1.min.io/tenant=sky-minio
sky-minio-console   ClusterIP      10.233.47.111   <none>           9443/TCP        35m   v1.min.io/tenant=sky-minio
sky-minio-hl        ClusterIP      None            <none>           9000/TCP        35m   v1.min.io/tenant=sky-minio
minio               LoadBalancer   10.233.31.70    <pending>        443:30616/TCP   35m   v1.min.io/tenant=sky-minio
minio               LoadBalancer   10.233.31.70    <pending>        443:32381/TCP   35m   v1.min.io/tenant=sky-minio
minio               LoadBalancer   10.233.31.70    <pending>        443:31950/TCP   35m   v1.min.io/tenant=sky-minio
minio               LoadBalancer   10.233.31.70    <pending>        443:32119/TCP   35m   v1.min.io/tenant=sky-minio
minio               LoadBalancer   10.233.31.70    <pending>        443:32582/TCP   35m   v1.min.io/tenant=sky-minio
minio               LoadBalancer   10.233.31.70    <pending>        443:32648/TCP   35m   v1.min.io/tenant=sky-minio
minio               LoadBalancer   10.233.31.70    <pending>        443:30858/TCP   35m   v1.min.io/tenant=sky-minio
minio               LoadBalancer   10.233.31.70    <pending>        443:30746/TCP   36m   v1.min.io/tenant=sky-minio
minio               LoadBalancer   10.233.31.70    <pending>        443:32396/TCP   36m   v1.min.io/tenant=sky-minio
minio               LoadBalancer   10.233.31.70    <pending>        443:32265/TCP   36m   v1.min.io/tenant=sky-minio
minio               LoadBalancer   10.233.31.70    <pending>        443:31822/TCP   36m   v1.min.io/tenant=sky-minio
minio               LoadBalancer   10.233.31.70    <pending>        443:31557/TCP   36m   v1.min.io/tenant=sky-minio
minio               LoadBalancer   10.233.31.70    <pending>        443:30045/TCP   36m   v1.min.io/tenant=sky-minio
minio               LoadBalancer   10.233.31.70    <pending>        443:32449/TCP   36m   v1.min.io/tenant=sky-minio
minio               LoadBalancer   10.233.31.70    <pending>        443:31330/TCP   36m   v1.min.io/tenant=sky-minio
minio               LoadBalancer   10.233.31.70    <pending>        443:32399/TCP   36m   v1.min.io/tenant=sky-minio
minio               LoadBalancer   10.233.31.70    10.170.163.146   443:32399/TCP   36m   v1.min.io/tenant=sky-minio
minio               LoadBalancer   10.233.31.70    10.170.163.146   443:30620/TCP   36m   v1.min.io/tenant=sky-minio
minio               LoadBalancer   10.233.31.70    10.170.163.146   443:31658/TCP   37m   v1.min.io/tenant=sky-minio
minio               LoadBalancer   10.233.31.70    10.170.163.146   443:32344/TCP   37m   v1.min.io/tenant=sky-minio
minio               LoadBalancer   10.233.31.70    10.170.163.146   443:30206/TCP   37m   v1.min.io/tenant=sky-minio
minio               LoadBalancer   10.233.31.70    10.170.163.146   443:31534/TCP   37m   v1.min.io/tenant=sky-minio
minio               LoadBalancer   10.233.31.70    10.170.163.146   443:30897/TCP   37m   v1.min.io/tenant=sky-minio
minio               LoadBalancer   10.233.31.70    10.170.163.146   443:32142/TCP   37m   v1.min.io/tenant=sky-minio
minio               LoadBalancer   10.233.31.70    10.170.163.146   443:32659/TCP   37m   v1.min.io/tenant=sky-minio

Am I missing something to fix that issue?