zabbix-community / helm-zabbix

Helm chart for Zabbix
https://artifacthub.io/packages/helm/zabbix-community/zabbix
Apache License 2.0

[zabbix-community/zabbix] Move zabbix agent to its own template #20

Closed MaxDiOrio closed 1 year ago

MaxDiOrio commented 1 year ago

It appears that the zabbix-agent installation currently lives in the zabbix-server template. If zabbix-server is disabled in the values, the agent does not seem to get installed.

There are times that I can see people wanting to install only the agent, including myself.

aeciopires commented 1 year ago

Hello @MaxDiOrio!

Thanks for reporting this problem. It makes sense. I will fix this.

sa-ChristianAnton commented 1 year ago

Hi @MaxDiOrio,

as of now, the Zabbix Agent is deployed together with either the Zabbix server or the proxy. For the proxy, from my point of view, that does not make any sense at all. For the server there is exactly one rationale, and I am not sure whether I like it: the "Zabbix server" host that a fresh Zabbix installation creates actually turns green and monitors Linux metrics such as CPU and RAM.

That said, I agree with @MaxDiOrio that it would be a better design to deploy the agent in its own deployment, with proper options to add extra configuration, maybe even fixing #22 along the way. This would also make it possible to deploy ONLY a Zabbix agent with this chart, which I like a lot, because I believe there is no official chart for that use case.

The downside would be that this "Zabbix server" host inside Zabbix would no longer turn green automatically, and the host would have to be reconfigured (as I usually do every time), at least by changing the DNS name of the host interface to the name of a service.

As I have something in mind for the future that would allow this kind of configuration to be done at deployment time (by interacting with the Zabbix API during the deployment phase), I would suggest keeping the "sidecar agent" for a while as an additional container, with almost no configuration options at all, maybe even fewer than it has now, and moving the entire "zabbixagent" part of the values.yaml to configure an entirely new deployment of the Zabbix Agent.

BGmot commented 1 year ago

What metrics do you monitor with the "sidecar agent"? Leaving the "sidecar" part aside, to this day I still question the value of deploying an agent in a Docker container. I have never heard a convincing reason; the only valid one is probably the rare case when you technically can't install the Zabbix agent on the server/VM (outdated or unsupported OS).

PS: feel free to ignore this question as it is not really related to this Issue.

sa-ChristianAnton commented 1 year ago

I agree with you @BGmot that the sidecar agent option is technically not too useful, but it makes the result of a Zabbix installation "look nice". I would therefore like to keep this option for now, until we have another way to configure Zabbix's initial setup so that it does not expect a Zabbix agent running on 127.0.0.1.

There are indeed use cases where an agent in a container makes sense. In fact, in a Kubernetes world, if you DO want to run the Zabbix agent on Kubernetes nodes (which is an entirely different discussion), you don't want to run it directly on the nodes but as a DaemonSet instead. We have also had cases where we run hundreds of Zabbix Proxies and Agents on very tiny Kubernetes instances (K3S on Raspberry Pi-like systems) for distributed monitoring and benefit from the easier management of containerized deployments at large scale (Continuous Deployment).

Another good use case for a Zabbix Agent in a container is the "additional monitoring capabilities" that Agent2 offers: MQTT, certificate monitoring, etc., which make the Agent act as a kind of Zabbix proxy. These use cases I like to implement using the above-mentioned sidecar method, btw. ...

BGmot commented 1 year ago

Thanks @sa-ChristianAnton! Indeed, with all the plugin functionality added in Agent2, it does make sense to run the Zabbix agent in a container.

aeciopires commented 1 year ago

Hi @sa-ChristianAnton!

Could you review this PR https://github.com/zabbix-community/helm-zabbix/pull/25?

I tried to get as close as possible to the idea you described here, but I still kept the agent as a sidecar in the Zabbix Server and Zabbix Proxy pods. We can remove it if you agree and leave the Zabbix Agent just as a DaemonSet.

I created a script and a Docker image and added it as an initContainer to the zabbixWebService pod. It accesses the web interface via the API and changes the Zabbix Server address during deployment. It works! :-)
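For readers unfamiliar with the Zabbix API, the kind of request such a script would send can be sketched roughly as below. This is not the code from the PR: it only builds JSON-RPC 2.0 request bodies and sends nothing over the network. The helper names (`rpc_request`, `build_interface_update`), the interface ID, the DNS name, and the token are all hypothetical. The sketch assumes the `hostinterface.update` method, where `useip` set to 0 makes Zabbix connect by DNS name instead of by IP.

```python
import json


def rpc_request(method, params, request_id=1, auth=None):
    """Build a JSON-RPC 2.0 request body for the Zabbix API."""
    body = {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    if auth is not None:
        # Assumption: the session token travels in the "auth" field, as in
        # Zabbix 6.2 (the image tag used in this thread). Newer releases
        # prefer an Authorization header instead.
        body["auth"] = auth
    return body


def build_interface_update(interface_id, dns_name, port=10050, auth=None):
    """Build a request that switches a host interface from IP to DNS."""
    params = {
        "interfaceid": str(interface_id),
        "useip": 0,          # 0 = connect using the DNS name
        "dns": dns_name,
        "port": str(port),   # 10050 is the agent service port used in this chart
    }
    return rpc_request("hostinterface.update", params, auth=auth)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    request = build_interface_update(
        "1", "zabbix-agent.monitoring.svc.cluster.local", auth="<token>"
    )
    print(json.dumps(request, indent=2))
```

A real script would first have to log in and look up the interface ID of the "Zabbix server" host before sending this body to the API endpoint.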


BGmot commented 1 year ago

Hi @aeciopires! If we do not enable zabbixWebService, then updateZabbixServerHostInterface.py will never run? Why don't we put this initContainer in the Zabbix Server deployment?

aeciopires commented 1 year ago

Hi @BGmot!

Yes, here we have a cyclic dependency problem... I still don't know how to solve it, but I welcome suggestions...

If I put this initContainer in the Zabbix Web pod, the Zabbix Web interface never appears, because the pod cannot start until the initContainer completes successfully, and the initContainer is itself waiting for Zabbix Web.

If I put it in the Zabbix Server pod, it won't work because the Zabbix Web pod is waiting for the Zabbix Server pod to be active, which would be waiting for the initContainer to be active, which depends on Zabbix Web. Do you understand the problem?

I think leaving it in Zabbix Web Service is less bad for now.

An alternative I didn't explore was to leave it as a job, independent of the deployment/statefulset.
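The cyclic dependency described in this comment can be made concrete with a small sketch. It models "X waits for Y" as a directed graph and uses a depth-first search to find a cycle. The node names and the two example graphs are an illustrative reading of the comment, not something taken from the chart.

```python
def find_cycle(waits_for):
    """Return one dependency cycle as a list of nodes, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    state = {node: WHITE for node in waits_for}
    path = []

    def visit(node):
        state[node] = GREY
        path.append(node)
        for dep in waits_for.get(node, []):
            if state.get(dep, WHITE) == GREY:
                # dep is already on the current path: we closed a loop.
                return path[path.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[node] = BLACK
        return None

    for node in list(waits_for):
        if state[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# initContainer placed in the Zabbix Server pod, as described above.
server_pod_variant = {
    "zabbix-web": ["zabbix-server"],
    "zabbix-server": ["update-initcontainer"],
    "update-initcontainer": ["zabbix-web"],
}

# The unexplored alternative: an independent job that only waits for Zabbix Web.
job_variant = {
    "zabbix-web": ["zabbix-server"],
    "zabbix-server": [],
    "update-job": ["zabbix-web"],
}

if __name__ == "__main__":
    print(find_cycle(server_pod_variant))
    print(find_cycle(job_variant))
```

Under these assumptions the first graph contains a loop through all three components, while the job variant has none, because nothing waits for the job.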

BGmot commented 1 year ago

"it won't work because the Zabbix Web pod is waiting for the Zabbix Server pod to be active"

Hmm... I might be wrong, but why should Zabbix Web wait for Zabbix Server? All it needs is a DB up and running.

MaxDiOrio commented 1 year ago

Why do you need an init container? It's using DNS for the agent pod. I think the answer here is the zabbix server needs to expose the agent host name, which defaults to service-name.namespace-name.svc.cluster.local, which can be passed in.
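As a small illustration of the naming scheme mentioned here, the in-cluster DNS name of a Service can be assembled from its name and namespace. This is only a sketch: the function name `service_dns_name` and the example names are hypothetical, and it assumes the default cluster domain `cluster.local`.

```python
import re

# Loose RFC 1123 label check: lowercase alphanumerics and '-', at most 63 characters.
_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]{0,61}[a-z0-9])?")


def service_dns_name(service, namespace, cluster_domain="cluster.local"):
    """Return the in-cluster DNS name of a Kubernetes Service."""
    for label in (service, namespace):
        if not _LABEL.fullmatch(label):
            raise ValueError(f"not a valid DNS label: {label!r}")
    return f"{service}.{namespace}.svc.{cluster_domain}"


if __name__ == "__main__":
    # "zabbix-agent" and "monitoring" are hypothetical names.
    print(service_dns_name("zabbix-agent", "monitoring"))
    # prints: zabbix-agent.monitoring.svc.cluster.local
```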

Why can't the agent just be in its own chart? There are far more times I'd want to deploy the agent only and not have to deal with creating a Helm values file that disables all the features but the agent. Include the agent chart in the stack chart.

aeciopires commented 1 year ago

Hello @BGmot!

You're right. I explained it wrong... In this helm chart, Zabbix Web waits for the database to be completely ready (with the correct tables). But this is done by one of the Zabbix Server pod's initContainers. So, consequently, when Zabbix Server goes up, Zabbix Web goes up. But the problem of cyclic dependency goes beyond this, as I explained earlier.

Hello @MaxDiOrio!

1- There are several ways to solve a problem... I chose to implement initContainer because I agreed with @sa-ChristianAnton's idea

2- I don't like the idea of creating yet another specific helm chart to install zabbix-agent.

This will generate more work for those who continuously maintain the helm chart, and more work for those who want to install the Zabbix Agent together with the other components and keep everything integrated.

I don't think it's a big problem to set certain parameters to false to disable the installation of the other components and leave only the Zabbix Agent enabled. You don't need to copy the entire values.yaml file and change many lines... You can achieve this with the following values using the new version of the chart (4.0.0, in progress).

# Custom values for zabbix.

zabbixImageTag: 6.2.6-alpine

postgresAccess:
  useUnifiedSecret: false
  unifiedSecretAutoCreate: false

zabbixServer:
  enabled: false

postgresql:
  enabled: false

zabbixProxy:
  enabled: false

zabbixAgent:
  enabled: true
  ZBX_HOSTNAME: zabbix-agent
  ZBX_SERVER_HOST: 0.0.0.0/0
  ZBX_SERVER_PORT: 10051
  ZBX_PASSIVE_ALLOW: true # This variable is boolean (true or false) and enables or disables feature of passive checks. By default, value is true
  ZBX_PASSIVESERVERS: "0.0.0.0/0,10.244.1.0/24" # The variable is comma separated list of allowed Zabbix server or proxy hosts for connections to Zabbix agent container.
  ZBX_ACTIVE_ALLOW: false # This variable is boolean (true or false) and enables or disables feature of active checks
  ZBX_DEBUGLEVEL: 3 # The variable is used to specify debug level, from 0 to 5
  ZBX_TIMEOUT: 4 # The variable is used to specify timeout for processing checks. By default, value is 4.
  ZBX_VMWARECACHESIZE: 128M
  service:
    type: ClusterIP
    port: 10050
  extraEnv:
    - name: "ZBX_EXAMPLE_MY_ENV_7"
      value: "true"
    - name: "ZBX_EXAMPLE_MY_ENV_8"
      value: "false"
    - name: "ZBX_EXAMPLE_MY_ENV_9"
      value: "100"

zabbixWeb:
  enabled: false

zabbixWebService:
  enabled: false

ingress:
  enabled: false

karpenter:
  enabled: false
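If writing these toggles by hand feels tedious, the `enabled` flags in the values above could also be generated. The following is only a sketch: the helper `only_enabled` is hypothetical, the component list is copied from the values shown here, and the agent's `ZBX_*` settings and the `postgresAccess` block would still have to be added separately. The output is JSON, which is a subset of YAML, but it has not been tested against the chart.

```python
import json

# Top-level toggles that appear in the values shown above.
COMPONENTS = [
    "zabbixServer", "postgresql", "zabbixProxy", "zabbixAgent",
    "zabbixWeb", "zabbixWebService", "ingress", "karpenter",
]


def only_enabled(keep, components=COMPONENTS):
    """Return override values that enable only the components in `keep`."""
    unknown = set(keep) - set(components)
    if unknown:
        raise ValueError(f"unknown components: {sorted(unknown)}")
    return {name: {"enabled": name in keep} for name in components}


if __name__ == "__main__":
    print(json.dumps(only_enabled({"zabbixAgent"}), indent=2))
```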

Also, there are other helm charts on the Internet that you can use to install Zabbix Agent only or you can create it just for your purpose and offer it to the community.

Please do not misunderstand me. I'm not being rude, I'm just trying to explain my point of view as someone who reserves some of his free time to solve some demands of different people and environments around the world. I will always go for approaches that involve less work to test or maintain, even if the code gets complex, when necessary.

BGmot commented 1 year ago

"In this helm chart, Zabbix Web is waiting for the database to be completely ready (with the correct tables). But this is done by one of the Zabbix Server pod's initContainers. So, consequently, when Zabbix Server goes up, Zabbix Web goes up. But the problem of cyclic dependency goes beyond this, as I explained earlier."

Maybe I explained it wrong... What I mean is you add one more init container updateZabbixServerHostInterface.py to Zabbix Server deployment, then what happens is:

What am I missing?

aeciopires commented 1 year ago

Hi @BGmot!

Thanks for explaining your idea better. Now it's clear and I'm going to test this approach.

aeciopires commented 1 year ago

Hi @BGmot!

I changed the implementation. Thanks!

BGmot commented 1 year ago

You are fast! Thanks! -)

aeciopires commented 1 year ago

Hi @MaxDiOrio

I removed the init container in my pull request https://github.com/zabbix-community/helm-zabbix/pull/25 after talking with @sa-ChristianAnton, but kept the option of installing the Zabbix Agent as a sidecar.