percona / percona-xtradb-cluster-operator

Percona Operator for MySQL based on Percona XtraDB Cluster
https://www.percona.com/doc/kubernetes-operator-for-pxc/index.html
Apache License 2.0
512 stars 184 forks

Intermittent issue kubernetes admission percona-xtradbcluster-webhook #1561

Open juliano-secondo opened 7 months ago

juliano-secondo commented 7 months ago

I'm experiencing an intermittent issue where at times I encounter the error described below, while at other times, everything functions normally.

https://percona-xtradb-cluster-operator.percona.svc:443/validate-percona-xtradbcluster?timeout=10s: tls: failed to verify certificate: x509: certificate signed by unknown

If I remove the Kubernetes admission webhook percona-xtradbcluster-webhook, this error never happens again.

spron-in commented 7 months ago

Hello @juliano-secondo - this is interesting, but intermittent issues are hard to debug. Can you tell me more about your environment?

I assume you use a cluster-wide deployment. Is there anything else that would help us reproduce the problem?

juliano-secondo commented 7 months ago

I have several clusters running it, and every cluster where I didn't delete the admission webhook has this intermittent issue. I'm using fluxcd to deploy the pxc db cluster from kustomize code. It runs every 10 minutes, and from time to time I get the error I posted here. I haven't found a way to reproduce it on demand, and I'm open to any ideas for reproducing it.

spron-in commented 7 months ago

@juliano-secondo please share more. Versions, custom resource yamls, how many clusters you have, etc.

juliano-secondo commented 7 months ago

I have 6 clusters, all of them using pxc-operator:1.12.1. I also created a new cluster just to debug this issue and tried the latest v1.13.3, and I got the same error.

Custom resource definition in use: https://raw.githubusercontent.com/percona/percona-xtradb-cluster-operator/v1.12.0/deploy/crd.yaml. For the test on the latest version I used https://raw.githubusercontent.com/percona/percona-xtradb-cluster-operator/v1.13.0/deploy/crd.yaml.

spron-in commented 7 months ago

Thanks! Can you please also share the cr.yaml manifest? The link you shared is to the custom resource definition, not the database cluster.

juliano-secondo commented 7 months ago

this one you mean?

affinity: {}
fullnameOverride: ""
image: ""
imagePullPolicy: IfNotPresent
imagePullSecrets: null
nameOverride: ""
nodeSelector: {}
operatorImageRepository: ${default_image_registry}/percona/percona-xtradb-cluster-operator
replicaCount: 2
resources:
  limits:
    cpu: 200m
    memory: 500Mi
  requests:
    cpu: 100m
    memory: 20Mi
tolerations: null
watchAllNamespaces: true

ebuildy commented 2 months ago

this one you mean?

affinity: {}
fullnameOverride: ""
image: ""
imagePullPolicy: IfNotPresent
imagePullSecrets: null
nameOverride: ""
nodeSelector: {}
operatorImageRepository: ${default_image_registry}/percona/percona-xtradb-cluster-operator
replicaCount: 2
resources:
  limits:
    cpu: 200m
    memory: 500Mi
  requests:
    cpu: 100m
    memory: 20Mi
tolerations: null
watchAllNamespaces: true

Remove limits.cpu: 200m, this is too small. I strongly advise against setting a limit for a Go program ^^
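
As a rough sketch of that suggestion, the resources block from the values posted above could look like this with the CPU limit dropped. It assumes the memory limit and the requests stay as posted; the thread does not confirm whether this change resolves the webhook certificate error.

```yaml
# resources block from the values posted earlier in the thread,
# with limits.cpu removed as suggested.
# Assumption: the memory limit and both requests are kept unchanged.
resources:
  limits:
    memory: 500Mi
  requests:
    cpu: 100m
    memory: 20Mi
```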