zabbix / zabbix-docker

Official Zabbix Dockerfiles
https://www.zabbix.com
GNU Affero General Public License v3.0

Zabbix-server HA mode in Kubernetes #931

Closed by legolego621 2 years ago

legolego621 commented 2 years ago
SUMMARY

HA mode of zabbix-server does not activate on a Kubernetes cluster.

OS / ENVIRONMENT / Used docker-compose files

k8s: v1.22.6
OS: Debian GNU/Linux 11 (bullseye)
tag: 6.0.1-ubuntu

CONFIGURATION
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: zabbix-server
  name: zabbix-server
  namespace: zabbix
spec:
  replicas: 2
  selector:
    matchLabels:
      app: zabbix-server
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: zabbix-server
    spec:
      containers:
        - image: zabbix/zabbix-server-mysql:6.0.1-ubuntu
          imagePullPolicy: IfNotPresent
          name: zabbix-server
          ports:
            - containerPort: 10051
              protocol: TCP
              name: zabbix-trapper
          resources:    
            requests:
              cpu: 50m
              memory: 100Mi
            limits:
              cpu: 2
              memory: 500Mi                  
          startupProbe:
            tcpSocket:
              port: 10051
            failureThreshold: 60
            periodSeconds: 10
          readinessProbe:
            failureThreshold: 5
            tcpSocket:
              port: 10051
            initialDelaySeconds: 10
            periodSeconds: 30
            successThreshold: 1
            timeoutSeconds: 2
          livenessProbe:
            tcpSocket:
              port: 10051
            initialDelaySeconds: 15
            periodSeconds: 20
          env:
            - name: PHP_TZ
              value: "Europe/Moscow"
            - name: MYSQL_USER
              valueFrom:
               secretKeyRef:
                name: mysql-secret
                key: mysql-zbx-user
            - name: MYSQL_PASSWORD
              valueFrom:
               secretKeyRef:
                name: mysql-secret
                key: mysql-zbx-pass
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
               secretKeyRef:
                name: mysql-secret
                key: mysql-root-pass
            - name: DB_SERVER_HOST
              value: "mysql-server"
            - name: DB_SERVER_PORT
              value: "3306"
            - name: MYSQL_DATABASE
              value: "zabbix"
            - name: ZBX_CACHESIZE
              value: "1024M"
            - name: ZBX_TRENDCACHESIZE
              value: "1024M"
            - name: ZBX_HISTORYCACHESIZE
              value: "2048M"
            - name: ZBX_HISTORYINDEXCACHESIZE
              value: "1024M"
            - name: ZBX_STARTTRAPPERS
              value: "5"
            - name: ZBX_STARTPREPROCESSORS
              value: "10"
            - name: ZBX_STARTDBSYNCERS
              value: "10"
            - name: ZBX_JAVAGATEWAY_ENABLE
              value: "true"
            - name: ZBX_STARTJAVAPOLLERS
              value: "5"
            - name: ZBX_ENABLE_SNMP_TRAPS
              value: "true"
            - name: ZBX_STARTPROXYPOLLERS
              value: "5"
            - name: ZBX_PROXYCONFIGFREQUENCY
              value: "60"

          volumeMounts:
            - mountPath: /var/tmp/
              name: zabbix-server-var-tmp
            - mountPath: /usr/lib/zabbix/alertscripts
              name: zabbix-server-alertscripts
            - mountPath: /usr/lib/zabbix/externalscripts
              name: zabbix-server-externalscripts
            - mountPath: /var/lib/zabbix/mibs
              name: zabbix-server-mibs
STEPS TO REPRODUCE

kubectl apply -f deployment-zabbix-server.yml

EXPECTED RESULTS

I'm expecting to get two zabbix-server nodes that will automatically select active HA and standby mode.

ACTUAL RESULTS

Pod logs from Kubernetes.

Log of the first node (the one that started first):

     8:20220308:075734.992 Starting Zabbix Server. Zabbix 6.0.1 (revision a80cb13).
     8:20220308:075734.992 ****** Enabled features ******
     8:20220308:075734.992 SNMP monitoring:           YES
     8:20220308:075734.992 IPMI monitoring:           YES
     8:20220308:075734.992 Web monitoring:            YES
     8:20220308:075734.992 VMware monitoring:         YES
     8:20220308:075734.992 SMTP authentication:       YES
     8:20220308:075734.992 ODBC:                      YES
     8:20220308:075734.992 SSH support:               YES
     8:20220308:075734.992 IPv6 support:              YES
     8:20220308:075734.992 TLS support:               YES
     8:20220308:075734.992 ******************************
     8:20220308:075734.992 using configuration file: /etc/zabbix/zabbix_server.conf
     8:20220308:075735.021 current database version (mandatory/optional): 06000000/06000000
     8:20220308:075735.021 required mandatory version: 06000000
   243:20220308:075735.037 starting HA manager
   243:20220308:075735.056 HA manager started in active mode
   ..............
   ..............

Log of the second node (the one that started second):

     9:20220308:075830.010 Starting Zabbix Server. Zabbix 6.0.1 (revision a80cb13).
     9:20220308:075830.010 ****** Enabled features ******
     9:20220308:075830.010 SNMP monitoring:           YES
     9:20220308:075830.010 IPMI monitoring:           YES
     9:20220308:075830.010 Web monitoring:            YES
     9:20220308:075830.010 VMware monitoring:         YES
     9:20220308:075830.010 SMTP authentication:       YES
     9:20220308:075830.010 ODBC:                      YES
     9:20220308:075830.010 SSH support:               YES
     9:20220308:075830.010 IPv6 support:              YES
     9:20220308:075830.010 TLS support:               YES
     9:20220308:075830.010 ******************************
     9:20220308:075830.010 using configuration file: /etc/zabbix/zabbix_server.conf
     9:20220308:075830.051 current database version (mandatory/optional): 06000000/06000000
     9:20220308:075830.051 required mandatory version: 06000000
   244:20220308:075830.074 starting HA manager
   244:20220308:075830.091 HA manager started in active mode
   .......
   ........
     244:20220308:075924.078 HA manager has been paused
     9:20220308:075924.078 HA manager error: the server HA registry record has changed ownership
     9:20220308:075924.079 "" node switched to "error" mode
     9:20220308:075924.079 unsupported status -2 received from HA manager
     9:20220308:075924.079 IPC service: Epoll DEL(1) on fd 37 failed. Old events were 2; read change was 2 (del); write change was 0 (none); close change was 0 (none): Invalid argument
double free or corruption (fasttop)
   265:20220308:075924.080 executing housekeeper
legolego621 commented 2 years ago

I resolved the problem with starting HA mode by adding these parameters to my deployment config:

            # For HA mode
            - name: ZBX_AUTOHANODENAME
              value: "fqdn"
            - name: ZBX_AUTONODEADDRESS
              value: "fqdn"
            - name: ZBX_SERVICEMANAGERSYNCFREQUENCY
              value: "60"
            - name: ZBX_PROBLEMHOUSEKEEPINGFREQUENCY
              value: "60"

Now I have another problem. My probe configuration looks like this:

          startupProbe:
            tcpSocket:
              port: 10051
            failureThreshold: 60
            periodSeconds: 10
          readinessProbe:
            failureThreshold: 5
            tcpSocket:
              port: 10051
            initialDelaySeconds: 10
            periodSeconds: 30
            successThreshold: 1
            timeoutSeconds: 2
          livenessProbe:
            tcpSocket:
              port: 10051
            initialDelaySeconds: 15
            periodSeconds: 20

The problem is with the standby zabbix-server pod. The standby pod does not listen on TCP port 10051, so the probes fail and the standby pod never reaches the Running/Ready state. Is this normal? Is there another way to configure the readinessProbe and the other probes, so that the check can track the state of both the active and the passive node of the cluster?

Logs of the pod with the standby zabbix-server:

     9:20220308:084011.200 Starting Zabbix Server. Zabbix 6.0.1 (revision a80cb13).
     9:20220308:084011.200 ****** Enabled features ******
     9:20220308:084011.200 SNMP monitoring:           YES
     9:20220308:084011.200 IPMI monitoring:           YES
     9:20220308:084011.200 Web monitoring:            YES
     9:20220308:084011.200 VMware monitoring:         YES
     9:20220308:084011.200 SMTP authentication:       YES
     9:20220308:084011.200 ODBC:                      YES
     9:20220308:084011.200 SSH support:               YES
     9:20220308:084011.200 IPv6 support:              YES
     9:20220308:084011.200 TLS support:               YES
     9:20220308:084011.200 ******************************
     9:20220308:084011.200 using configuration file: /etc/zabbix/zabbix_server.conf
     9:20220308:084011.237 current database version (mandatory/optional): 06000000/06000000
     9:20220308:084011.237 required mandatory version: 06000000
   261:20220308:084011.262 starting HA manager
   261:20220308:084011.300 HA manager started in standby mode
     9:20220308:084011.301 "zabbix-server-58d767cbbd-g8647" node started in "standby" mode

kubectl describe pod

Events:
  Type     Reason     Age                 From               Message
  ----     ------     ----                ----               -------
  Normal   Scheduled  15m                 default-scheduler  Successfully assigned zabbix/zabbix-server-58d767cbbd-g8647 to kub-worker1.infradoms.ru
  Normal   Pulled     15m                 kubelet            Container image "zabbix/zabbix-server-mysql:6.0.1-ubuntu" already present on machine
  Normal   Created    15m                 kubelet            Created container zabbix-server
  Normal   Started    15m                 kubelet            Started container zabbix-server
  Warning  Unhealthy  34s (x91 over 15m)  kubelet            Startup probe failed: dial tcp 10.233.65.9:10051: connect: connection refused
dotneft commented 2 years ago

Feel free to remove those probe conditions, because standby nodes do not listen on any port.

legolego621 commented 2 years ago

Hi! Thanks for the answer. If I remove the probes, I won't be able to monitor the status of the Zabbix server pods. Is giving up the probes the only solution?

dotneft commented 2 years ago

Could you try using a command instead of a port check:

zabbix_server -R ha_status
legolego621 commented 2 years ago

This command works only on the active (master) node. On the standby node it returns:

zabbix_server -R ha_status
Runtime commands can be executed only in active mode
dotneft commented 2 years ago

What exit code do you get on the standby node and on the active node?

legolego621 commented 2 years ago

I managed to get the containers running on both the master and the backup node by using zabbix_server -R ha_status in the probes:

          startupProbe:
            exec:
              command:
                - /bin/sh
                - -c
                - zabbix_server -R ha_status
            failureThreshold: 60
            periodSeconds: 10
          readinessProbe:
            exec:
              command:
                - /bin/sh
                - -c
                - zabbix_server -R ha_status
            failureThreshold: 5
            initialDelaySeconds: 10
            periodSeconds: 30
            successThreshold: 1
            timeoutSeconds: 2
          livenessProbe:
            exec:
              command:
                - /bin/sh
                - -c
                - zabbix_server -R ha_status  
            initialDelaySeconds: 15
            periodSeconds: 20

But now there is a problem in the web frontend. I get this error:

Connection to Zabbix server "zabbix-server" refused. Possible reasons:
1. Incorrect server IP/DNS in the "zabbix.conf.php";
2. Security environment (for example, SELinux) is blocking the connection;
3. Zabbix server daemon not running;
4. Firewall is blocking TCP connection.
Connection refused

I think the problem is that two instances of the server are running and one of them is not listening on port 10051.

Tell me, am I trying to reinvent the wheel? Are there ready-made solutions for scaling the Zabbix server? The deployment file in your repository does not provide one; more precisely, it does not work when the Zabbix server is scaled to more than one node.

dotneft commented 2 years ago

Just comment out the ZBX_SERVER_HOST and ZBX_SERVER_PORT variables for the web containers.

dotneft commented 2 years ago

In that case the web frontend will take the correct (running) Zabbix server instance from the DB.
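
As a rough illustration of the suggestion above, the env section of the web container could look like the sketch below. It only reuses variable names that appear in this thread; the commented-out values are assumptions shown for illustration, and the fragment is meant to sit under the zabbix-web-nginx-mysql container.

```yaml
# Sketch only: env of the zabbix-web-nginx-mysql container with the
# server host/port left unset, so the frontend reads the active node from the DB.
env:
  - name: DB_SERVER_HOST
    value: "mysql-server"
  - name: MYSQL_DATABASE
    value: "zabbix"
  # Intentionally commented out (values below are illustrative assumptions):
  # - name: ZBX_SERVER_HOST
  #   value: "zabbix-server"
  # - name: ZBX_SERVER_PORT
  #   value: "10051"
```

With this layout the address the frontend connects to is whatever the active node registered in the database, so that address has to be resolvable from the web pods.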

legolego621 commented 2 years ago

That is very nice. My cluster runs: the frontend shows "Zabbix server is running" and the high availability cluster is enabled. But I still get errors in the web interface. Is that normal?

Connection to Zabbix server "zabbix-server-ff5664fbc-qrl5w" failed. Possible reasons:
1. Incorrect server IP/DNS in the "zabbix.conf.php";
2. Incorrect DNS server configuration.
php_network_getaddresses: getaddrinfo failed: Name or service not known

My pod names are:

zabbix-server-ff5664fbc-g476z            1/1     Running   0             25m
zabbix-server-ff5664fbc-qrl5w            1/1     Running   0             25m
dotneft commented 2 years ago

Check connectivity from the web-interface container to TCP port 10051 of the zabbix-server-ff5664fbc-qrl5w container. Is it available? Is it the active node?

legolego621 commented 2 years ago

That name does not resolve with ping/telnet from the zabbix-web-nginx-mysql container, but the web interface shows "Zabbix server is running | Yes | zabbix-server-ff5664fbc-g476z:10051" and my cluster works.

dotneft commented 2 years ago

Hmm... so in this case the active node is zabbix-server-ff5664fbc-g476z?

legolego621 commented 2 years ago

Yes, this node is active:

root@kub-master1:/home/rootvang# kubectl exec -n zabbix zabbix-server-ff5664fbc-g476z -- zabbix_server -R ha_status
Failover delay: 60 seconds
Cluster status:
   #  ID                        Name                      Address                        Status      Last Access
   .........
  11. cl0jj3yvx0001780dp0246ayk zabbix-server-ff5664fbc-g476z zabbix-server-ff5664fbc-g476z:10051 active      1s
  12. cl0jke2pk0001771ftnhf1ulw zabbix-server-ff5664fbc-8g4kj zabbix-server-ff5664fbc-8g4kj:10051 standby     5s
dotneft commented 2 years ago

In this case Zabbix web should be trying to connect to zabbix-server-ff5664fbc-g476z. Where do you see the message 'Connection to Zabbix server "zabbix-server-ff5664fbc-qrl5w" failed. Possible reasons:'?

legolego621 commented 2 years ago

[three screenshots]

njguibert commented 2 years ago

I am not using HA, but I have the same problem in the UI.

[screenshot]

legolego621 commented 2 years ago

But are the Zabbix server and the web frontend working? Do you use Kubernetes?

njguibert commented 2 years ago

Yes, on microk8s. But I don't know why zabbix-web can't communicate with the zabbix-agent. I use the kubernetes.yaml from this repo.

dotneft commented 2 years ago

zabbix web does not communicate with web container. Zabbix web must communicate with Zabbix server on 10051 port only.

njguibert commented 2 years ago

Well, some progress: I added the ZBX_SERVER_HOST environment variable to zabbix-web, and the warning in the dashboard went away. [screenshot]

But I still can't get availability. [screenshot]

Thanks!

dotneft commented 2 years ago

Allow communication with Zabbix server by subnet if you use HA.

legolego621 commented 2 years ago

Guys, do you have a ready-made deployment that covers scaling the Zabbix server? I'm probably doing something wrong.

dotneft commented 2 years ago

Are you able to ping / reach the mentioned DNS name from the web container?

legolego621 commented 2 years ago

No, the DNS name does not resolve, but the HA cluster is running.

legolego621 commented 2 years ago

I have no other problems apart from this alert in the web GUI.

dotneft commented 2 years ago

What system do you use? Vanilla docker and docker compose? If yes, could you show us "docker ps" and "docker network ls"?

legolego621 commented 2 years ago

I use a k8s cluster. My deployment is described in the first message. The only change is the following, for HA mode:

        - name: ZBX_AUTOHANODENAME
          value: "fqdn"
        - name: ZBX_AUTONODEADDRESS
          value: "fqdn"
        - name: ZBX_SERVICEMANAGERSYNCFREQUENCY
          value: "60"
        - name: ZBX_PROBLEMHOUSEKEEPINGFREQUENCY
          value: "60"
dotneft commented 2 years ago

That message does not describe how the Zabbix frontend is run. Could you share it?

legolego621 commented 2 years ago

I use this config for the web frontend:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: zabbix-web-nginx-mysql
  name: zabbix-web-nginx-mysql
  namespace: zabbix
spec:
  replicas: 2
  selector:
    matchLabels:
      app: zabbix-web-nginx-mysql
  strategy:
    rollingUpdate:
      maxSurge: 1               
      maxUnavailable: 1        
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: zabbix-web-nginx-mysql
    spec:
      containers:
        - image: zabbix/zabbix-web-nginx-mysql:6.0.1-ubuntu
          imagePullPolicy: IfNotPresent
          name: zabbix-web-nginx-mysql
          ports:
            - containerPort: 8080
              protocol: TCP
            - containerPort: 8443
              protocol: TCP
          resources:    
            requests:
              cpu: 50m
              memory: 100Mi
            limits:
              cpu: 1
              memory: 500Mi                   
          readinessProbe:
            failureThreshold: 5
            tcpSocket:
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 30
            successThreshold: 1
            timeoutSeconds: 2
          livenessProbe:
            tcpSocket:
              port: 8080
            failureThreshold: 3
            initialDelaySeconds: 15
            periodSeconds: 20
          env:
            - name: MYSQL_USER
              valueFrom:
               secretKeyRef:
                name: mysql-secret
                key: mysql-zbx-user
            - name: MYSQL_PASSWORD
              valueFrom:
               secretKeyRef:
                name: mysql-secret
                key: mysql-zbx-pass
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
               secretKeyRef:
                name: mysql-secret
                key: mysql-root-pass
            - name: MYSQL_DATABASE
              value: "zabbix"
            - name: DB_SERVER_HOST
              value: "mysql-server"
            - name: ZBX_SERVER_NAME
              value: "MONITOR"
dotneft commented 2 years ago

I think you need to configure a "Route" / "Service" to reach the Zabbix server from the frontend side: https://kubernetes.io/docs/concepts/services-networking/service/

legolego621 commented 2 years ago

If I route through a Service, I get the following error:

Connection to Zabbix server "zabbix-server" refused. Possible reasons:
1. Incorrect server IP/DNS in the "zabbix.conf.php";
2. Security environment (for example, SELinux) is blocking the connection;
3. Zabbix server daemon not running;
4. Firewall is blocking TCP connection.
Connection refused

This happens because one of the Zabbix server instances (the backup node) is not listening on port 10051. That is why I abandoned this approach.

legolego621 commented 2 years ago

Have you tested scaling the Zabbix server in a lab? Maybe you have a ready-made deployment?

dotneft commented 2 years ago

Currently we do not have a final solution. I tested haproxy on a local installation with vanilla Docker, and it works fine :-)

legolego621 commented 2 years ago

That is nice :) I will try to find a solution to the problem. If I find one, I will post it here.

dotneft commented 2 years ago

So the main goals are:

  1. You have a correct DNS name, so that the Zabbix DB knows the current active node;
  2. You need to allow connections from the Zabbix frontend container to the current active Zabbix server node.
dotneft commented 2 years ago

But to serve connections from outside the K8S installation, you need some LB (load balancer) that knows which node is active at the moment. For that you can use haproxy.
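
Inside the cluster, a readiness probe might play a similar role, because Kubernetes only sends Service traffic to pods that are Ready. The fragment below is a sketch under the assumption, reported in this thread, that a standby node does not listen on 10051; the timing values are illustrative and this has not been verified against the Zabbix images.

```yaml
# Sketch only: container-level probes for the zabbix-server container.
# A failing readinessProbe removes the pod from Service endpoints but does
# not restart it, so the standby pod keeps running without receiving traffic.
readinessProbe:
  tcpSocket:
    port: 10051
  periodSeconds: 10      # illustrative value
  failureThreshold: 3    # illustrative value
# No tcpSocket startupProbe or livenessProbe on 10051 here: those would
# restart the standby pod, which is the failure shown earlier in the thread.
```

The trade-off is that the standby pod is permanently reported as not ready, which can affect rollouts and any tooling that waits for all replicas to become Ready.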

legolego621 commented 2 years ago

Thank you very much for your time.

As far as I understand, Kubernetes does not resolve the bare hostname of a pod. Specifying hostname or fqdn in the zabbix-server deployment file does not help:

        - name: ZBX_AUTOHANODENAME
          value: "fqdn or hostname"
        - name: ZBX_AUTONODEADDRESS
          value: "fqdn or hostname"

The Zabbix web frontend automatically selects the active node from the database if ZBX_SERVER_HOST is not set (this is the only way to run HA in a Kubernetes cluster with more than one node). If ZBX_SERVER_HOST is set in the web container and points at the Service, requests can land on a node where port 10051 is closed, because that node is in standby mode. The only easy option I see is to add the ability to set ZBX_AUTOHANODENAME and ZBX_AUTONODEADDRESS to an IP address.

Am I thinking in the right direction? If so, is there any chance of an update from your side?

dotneft commented 2 years ago

Are you able to reach IP of active Zabbix server from web containers?

legolego621 commented 2 years ago

Yes, I checked this. When I set ZBX_SERVER_HOST to the IP address of the active Zabbix server node, it works. It does not work with the hostname of the active node.

dotneft commented 2 years ago

I think you need to check why DNS does not work in this case: https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/

legolego621 commented 2 years ago

DNS in my cluster works correctly. I think the problem is that the address stored in the database is the pod's hostname, and Kubernetes cannot resolve a bare pod hostname; see https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pods. The resolvable name of a pod has the form pod-ip-address.service-name.my-namespace.svc.cluster-domain.example

shing6326 commented 2 years ago

Try the following approach: https://stackoverflow.com/questions/69577282/pod-name-resolution-for-statefulset-doesnt-work

  1. create a headless svc and use statefulset for the zabbix-server
  2. set ZBX_AUTOHANODENAME ZBX_AUTONODEADDRESS to fqdn
  3. remove ZBX_SERVER_HOST and ZBX_SERVER_PORT in zabbix-web
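
The three steps above can be sketched roughly as follows. This is a minimal illustration, not a tested manifest: the Service name zabbix-server-headless is hypothetical, publishNotReadyAddresses is an assumption about the desired behaviour for standby pods, and the database variables and probes are omitted.

```yaml
# Headless Service: clusterIP None makes cluster DNS publish per-pod records
# of the form <pod-name>.<service-name>.<namespace>.svc.<cluster-domain>.
apiVersion: v1
kind: Service
metadata:
  name: zabbix-server-headless   # hypothetical name
  namespace: zabbix
spec:
  clusterIP: None
  publishNotReadyAddresses: true # assumption: keep DNS records for pods that are not Ready
  selector:
    app: zabbix-server
  ports:
    - name: zabbix-trapper
      port: 10051
      targetPort: 10051
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: zabbix-server
  namespace: zabbix
spec:
  serviceName: zabbix-server-headless  # must match the headless Service name
  replicas: 2
  selector:
    matchLabels:
      app: zabbix-server
  template:
    metadata:
      labels:
        app: zabbix-server
    spec:
      containers:
        - name: zabbix-server
          image: zabbix/zabbix-server-mysql:6.0.1-ubuntu
          ports:
            - containerPort: 10051
              name: zabbix-trapper
          env:
            # Register the node under its FQDN so the frontend can resolve it.
            - name: ZBX_AUTOHANODENAME
              value: "fqdn"
            - name: ZBX_AUTONODEADDRESS
              value: "fqdn"
            # DB_* and MYSQL_* variables omitted for brevity
```

With a StatefulSet the pods get stable names (zabbix-server-0, zabbix-server-1), so the FQDN each node writes to the database stays the same across restarts.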


legolego621 commented 2 years ago

Thank you so much! That resolved my problem, it works! Here is my deployment file for anyone with the same problem.

---
apiVersion: v1
kind: Namespace
metadata:
  name: zabbix
  labels:
    name: zabbix

---
apiVersion: v1
kind: List
metadata:
  name: mysql-secret
  namespace: zabbix
items:
  - apiVersion: v1
    kind: Secret
    type: Opaque
    metadata:
      name: mysql-secret
      namespace: zabbix
    data:
      mysql-root-pass: "**"
      mysql-zbx-user: "**"
      mysql-zbx-pass: "**"

---
apiVersion: v1
kind: Service
metadata:
  name: zabbix-server
  namespace: zabbix
spec:
  ports:
    - port: 10051
      protocol: TCP
      targetPort: zabbix-trapper
      nodePort: 32051
  selector:
    app: zabbix-server
  sessionAffinity: None
  type: NodePort

---
apiVersion: v1
kind: Service
metadata:
  name: zabbix-web-nginx-mysql
  namespace: zabbix
spec:
  ports:
    - port: 8080
      protocol: TCP
      nodePort: 32080
  selector:
    app: zabbix-web-nginx-mysql
  sessionAffinity: None
  type: NodePort

---
apiVersion: v1
kind: Service
metadata:
  name: mysql-server
  labels:
    app: mysql-server
  namespace: zabbix
spec:
  ports:
  - port: 3306
    protocol: TCP
    targetPort: 3306
    nodePort: 32006
  selector:
    app: mysql-server
  sessionAffinity: None
  type: NodePort

---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: zabbix-server
  name: zabbix-server
  namespace: zabbix
spec:
  serviceName: "zabbix-server"
  replicas: 2
  selector:
    matchLabels:
      app: zabbix-server
  template:
    metadata:
      labels:
        app: zabbix-server
    spec:
      containers:
        - image: zabbix/zabbix-server-mysql:6.0.1-ubuntu
          imagePullPolicy: IfNotPresent
          name: zabbix-server
          ports:
            - containerPort: 10051
              protocol: TCP
              name: zabbix-trapper
          resources:    
            requests:
              cpu: 50m
              memory: 100Mi
            limits:
              cpu: 2
              memory: 500Mi                  
          startupProbe:
            exec:
              command:
                - /bin/sh
                - -c
                - zabbix_server -R ha_status
            failureThreshold: 60
            periodSeconds: 10
          readinessProbe:
            exec:
              command:
                - /bin/sh
                - -c
                - zabbix_server -R ha_status
            failureThreshold: 5
            initialDelaySeconds: 10
            periodSeconds: 30
            successThreshold: 1
            timeoutSeconds: 2
          livenessProbe:
            exec:
              command:
                - /bin/sh
                - -c
                - zabbix_server -R ha_status  
            initialDelaySeconds: 15
            periodSeconds: 20
          env:
            - name: PHP_TZ
              value: "Europe/Moscow"
            - name: MYSQL_USER
              valueFrom:
               secretKeyRef:
                name: mysql-secret
                key: mysql-zbx-user
            - name: MYSQL_PASSWORD
              valueFrom:
               secretKeyRef:
                name: mysql-secret
                key: mysql-zbx-pass
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
               secretKeyRef:
                name: mysql-secret
                key: mysql-root-pass
            - name: DB_SERVER_HOST
              value: "mysql-server"
            - name: DB_SERVER_PORT
              value: "3306"
            - name: MYSQL_DATABASE
              value: "zabbix"
            - name: ZBX_CACHESIZE
              value: "1024M"
            - name: ZBX_TRENDCACHESIZE
              value: "1024M"
            - name: ZBX_HISTORYCACHESIZE
              value: "2048M"
            - name: ZBX_HISTORYINDEXCACHESIZE
              value: "1024M"
            - name: ZBX_STARTTRAPPERS
              value: "5"
            - name: ZBX_STARTPREPROCESSORS
              value: "10"
            - name: ZBX_STARTDBSYNCERS
              value: "10"
            - name: ZBX_JAVAGATEWAY_ENABLE
              value: "true"
            - name: ZBX_STARTJAVAPOLLERS
              value: "5"
            - name: ZBX_ENABLE_SNMP_TRAPS
              value: "true"
            - name: ZBX_STARTPROXYPOLLERS
              value: "5"
            - name: ZBX_PROXYCONFIGFREQUENCY
              value: "60"

            # For HA mode
            - name: ZBX_AUTOHANODENAME
              value: "fqdn"
            - name: ZBX_AUTONODEADDRESS
              value: "fqdn"
            - name: ZBX_SERVICEMANAGERSYNCFREQUENCY
              value: "10"
            - name: ZBX_PROBLEMHOUSEKEEPINGFREQUENCY
              value: "60"
          volumeMounts:
            - mountPath: /var/tmp/
              name: zabbix-server-var-tmp
            - mountPath: /usr/lib/zabbix/alertscripts
              name: zabbix-server-alertscripts
            - mountPath: /usr/lib/zabbix/externalscripts
              name: zabbix-server-externalscripts
            - mountPath: /var/lib/zabbix/mibs
              name: zabbix-server-mibs
      volumes:
        - name: zabbix-server-var-tmp
          persistentVolumeClaim:
            claimName: zabbix-server-var-tmp-pvc
        - name: zabbix-server-alertscripts
          persistentVolumeClaim:
            claimName: zabbix-server-alertscripts-pvc
        - name: zabbix-server-externalscripts
          persistentVolumeClaim:
            claimName: zabbix-server-externalscripts-pvc
        - name: zabbix-server-mibs
          persistentVolumeClaim:
            claimName: zabbix-server-mibs-pvc

---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: zabbix-web-nginx-mysql
  name: zabbix-web-nginx-mysql
  namespace: zabbix
spec:
  replicas: 2
  selector:
    matchLabels:
      app: zabbix-web-nginx-mysql
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: zabbix-web-nginx-mysql
    spec:
      containers:
        - image: zabbix/zabbix-web-nginx-mysql:6.0.1-ubuntu
          imagePullPolicy: IfNotPresent
          name: zabbix-web-nginx-mysql
          ports:
            - containerPort: 8080
              protocol: TCP
          resources:    
            requests:
              cpu: 50m   # 1 = one core (1000m), 1/20 of a core = 50m, 1/10 = 100m
              memory: 100Mi
            limits:
              cpu: 1
              memory: 500Mi                   
          readinessProbe:
            failureThreshold: 5
            tcpSocket:
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 30
            successThreshold: 1
            timeoutSeconds: 2
          livenessProbe:
            tcpSocket:
              port: 8080
            failureThreshold: 3
            initialDelaySeconds: 15
            periodSeconds: 20
          env:
            - name: MYSQL_USER
              valueFrom:
               secretKeyRef:
                name: mysql-secret
                key: mysql-zbx-user
            - name: MYSQL_PASSWORD
              valueFrom:
               secretKeyRef:
                name: mysql-secret
                key: mysql-zbx-pass
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
               secretKeyRef:
                name: mysql-secret
                key: mysql-root-pass
            - name: MYSQL_DATABASE
              value: "zabbix"
            - name: DB_SERVER_HOST
              value: "mysql-server"
            - name: ZBX_SERVER_NAME
              value: "MONITOR server"

---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: mysql-server
  name: mysql-server
  namespace: zabbix
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mysql-server
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: mysql-server
    spec:
      containers:
        - image: mysql:8.0
          imagePullPolicy: IfNotPresent
          name: mysql-server
          ports:
            - containerPort: 3306
              protocol: TCP
              name: mysql
          resources:    
            requests:
              cpu: 50m
              memory: 100Mi
            limits:
              cpu: 1
              memory: 700Mi                   
          readinessProbe:
            failureThreshold: 3
            tcpSocket:
              port: mysql
            initialDelaySeconds: 10
            periodSeconds: 60
            successThreshold: 1
            timeoutSeconds: 2  
          livenessProbe:
            tcpSocket:
              port: mysql
            initialDelaySeconds: 15
            periodSeconds: 20
          args:
            - "--character-set-server=utf8"
            - "--collation-server=utf8_bin"
          env:
            - name: MYSQL_USER
              valueFrom:
               secretKeyRef:
                name: mysql-secret
                key: mysql-zbx-user
            - name: MYSQL_PASSWORD
              valueFrom:
               secretKeyRef:
                name: mysql-secret
                key: mysql-zbx-pass
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
               secretKeyRef:
                name: mysql-secret
                key: mysql-root-pass
            - name: MYSQL_DATABASE
              value: "zabbix"
          volumeMounts:
            - mountPath: /var/lib/mysql
              name: mysql-server-db
      volumes:
        - name: mysql-server-db
          persistentVolumeClaim:
            claimName: mysql-server-db-pvc
dotneft commented 2 years ago

Excellent! Thank you for the details. That is what I recommended initially :)

ranjana990 commented 2 years ago

Is this working for you with persistentVolumeClaim? For me it did not work until I changed it to volumeClaimTemplates in the zabbix-server StatefulSet. Also, even after adding the Service, DNS resolution is not working for me.
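
For readers hitting the same two issues, here is a hedged sketch of the volumeClaimTemplates variant. With persistentVolumeClaim in the pod template every replica references the same claim, whereas volumeClaimTemplates creates one claim per replica. The access mode and storage size are assumptions, and only one of the four volumes is shown. On the DNS side, the Kubernetes documentation states that per-pod records are published when the Service named in serviceName is headless (clusterIP: None), which may be worth checking.

```yaml
# Sketch only: fragment of the zabbix-server StatefulSet spec.
spec:
  template:
    spec:
      containers:
        - name: zabbix-server
          volumeMounts:
            - name: zabbix-server-var-tmp
              mountPath: /var/tmp/
  # One PersistentVolumeClaim per replica is created from this template,
  # named <template-name>-<pod-name>, e.g. zabbix-server-var-tmp-zabbix-server-0.
  volumeClaimTemplates:
    - metadata:
        name: zabbix-server-var-tmp
      spec:
        accessModes: ["ReadWriteOnce"]  # assumption
        resources:
          requests:
            storage: 1Gi                # hypothetical size
```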

aurelien12344567 commented 7 months ago

@legolego621, how do you connect the proxy when you use a NodePort Service to expose the Zabbix server? I was using the host IP. And why do you have these PVC volumes: