davidcr01 closed 6 months ago
Several tests were carried out to investigate the problem that occurs when starting the Wazuh manager image in Kubernetes:
When checking the running pod, we found that the filebeat.yml file contained the default configuration that ships with the Filebeat installation. We analyzed the error that caused the restart and found an incompatibility between the add_kubernetes_metadata processor and Amazon Linux 2023, which was triggering the restart. We then continued investigating why the filebeat.yml file that we add in the Dockerfile was not being applied. In addition, the tests were hampered by an existing error in modulesd that caused the pod to consume all the cluster resources, so we could not continue and had to redeploy from scratch every time this happened (https://github.com/wazuh/wazuh/issues/22141).
For the tests, we took the v4.8.0-beta1 tag, whose Dockerfile uses the Ubuntu Jammy base image, applied all the changes up to beta2, and built new images. Deploying these images in Kubernetes gave satisfactory results. We then tried changing the base image of v4.8.0-beta2 to amazonlinux:2. When deploying, we had the same problems with the filebeat.yml file, but the pod did not restart, since AL2 does not have this problem with the processor. After this, we tried mounting the filebeat.yml file directly in the /etc/filebeat directory, but this produced errors very similar to those obtained when mounting the ossec.conf file directly in its corresponding directory.
With ossec.conf, we implemented a solution in Docker in which we mount the file in a temporary directory and, at container start, copy it to the corresponding directory, so we do not have permission problems. I took the docker branch of the v4.8.0 tag and created a function that does the same for the Filebeat directory, so that the mounted file is copied into the /etc/filebeat directory when the container starts. This function worked correctly and, by mounting a ConfigMap in Kubernetes that pointed to the temporary directory, we stopped getting add_kubernetes_metadata error messages, since the filebeat.yml file was now being overwritten correctly. However, we still had problems starting Filebeat and found that the wazuh-template.yml file was also missing, which prevented the Wazuh template from being pushed. Looking into the missing file, we found that the files inside the directory were exactly the ones left by the base installation of Filebeat, which gave me the clue that a rollback was happening or something was bringing the base files back.
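The copy-on-start approach described above can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the actual entrypoint code of the Wazuh Docker images: the function name copy_staged_config and the idea of passing the staging and target directories as arguments are hypothetical.

```python
import shutil
from pathlib import Path


def copy_staged_config(staging_dir: str, target_dir: str) -> list:
    """Copy every file found in a staging directory into the target directory.

    Hypothetical helper: configuration files are mounted (for example from a
    ConfigMap) into staging_dir, and copied at container start so that the
    service works on regular, writable copies in target_dir.
    """
    staging = Path(staging_dir)
    target = Path(target_dir)
    copied = []
    if not staging.is_dir():
        return copied  # nothing mounted: keep whatever the image already has
    target.mkdir(parents=True, exist_ok=True)
    for item in sorted(staging.iterdir()):
        # is_file() follows symlinks, so symlinked files are copied as well,
        # while directories (and links to directories) are skipped.
        if item.is_file():
            shutil.copyfile(item, target / item.name)
            copied.append(item.name)
    return copied
```

Calling copy_staged_config with the mounted temporary directory and /etc/filebeat at container start would mirror the behaviour described in the comment; the real implementation may differ in ownership and permission handling.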
The permanent_data execution step was removed from the Dockerfile to test whether it could be interfering. When deploying the images built with this change, I found that the files in the /etc/filebeat directory were missing. I then tried removing the /etc/filebeat directory from the permanent_data.env file, with the same result. Next, I rebuilt the images without the step that deletes the /var/ossec/data_tmp directory, which is where the directories included in the permanent_data process are stored. When starting the pod, I found that the saved files dated from after the installation but before our modified configuration files were added. I checked the Dockerfile and, indeed, the permanent_data process was running before our configuration files were added. The execution order was changed, new images were deployed, and we had a positive result.
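The effect of the step ordering can be illustrated with a small simulation. This is a simplified model in Python, not the real Dockerfile or permanent_data scripts: all function names are hypothetical, the file contents are placeholders, and the container start is assumed to restore the saved snapshot over an empty directory.

```python
import shutil
import tempfile
from pathlib import Path


def install_filebeat(root: Path) -> None:
    # Base installation: only the default configuration exists.
    conf = root / "etc" / "filebeat"
    conf.mkdir(parents=True, exist_ok=True)
    (conf / "filebeat.yml").write_text("default\n")


def add_custom_config(root: Path) -> None:
    # Build step that adds the modified configuration files.
    conf = root / "etc" / "filebeat"
    (conf / "filebeat.yml").write_text("custom\n")
    (conf / "wazuh-template.yml").write_text("template\n")


def snapshot_permanent_data(root: Path) -> None:
    # Build step that saves the directory contents for later restoration.
    shutil.copytree(root / "etc" / "filebeat", root / "data_tmp" / "etc" / "filebeat")


def start_container(root: Path) -> dict:
    # Assumed start-up behaviour: the directory is repopulated from the snapshot.
    conf = root / "etc" / "filebeat"
    shutil.rmtree(conf)
    shutil.copytree(root / "data_tmp" / "etc" / "filebeat", conf)
    return {p.name: p.read_text().strip() for p in sorted(conf.iterdir())}


def simulate(build_steps) -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for step in build_steps:
            step(root)
        return start_container(root)


# Snapshot taken before the custom files are added: defaults come back.
broken = simulate([install_filebeat, snapshot_permanent_data, add_custom_config])
# Snapshot taken after the custom files are added: they survive the restore.
fixed = simulate([install_filebeat, add_custom_config, snapshot_permanent_data])
print(broken)  # {'filebeat.yml': 'default'}
print(fixed)   # {'filebeat.yml': 'custom', 'wazuh-template.yml': 'template'}
```

In this model, the first ordering reproduces the symptom from the comment (default filebeat.yml, missing template file), and swapping the last two steps resolves it.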
Description
Related: https://github.com/wazuh/wazuh-docker/issues/1210
During the K8s testing of the mentioned issue, we are experiencing issues with add_kubernetes_metadata in our Kubernetes (k8s) environment, specifically impacting our Wazuh manager pods. Upon investigation, it appears that the add_kubernetes_metadata processor is triggering errors during deployment, leading to frequent pod restarts. The error message we are encountering is as follows:
It seems that the add_kubernetes_metadata processor is being activated within Filebeat. We are unsure whether it was intentionally enabled by someone on the team or is a default setting. However, its activation is causing significant disruption to our Kubernetes deployment, particularly to the Wazuh manager pods.
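One quick way to check whether the processor is enabled in a given filebeat.yml is to scan the file for an uncommented entry. The following is a minimal Python sketch under stated assumptions: the helper name enabled_processors is hypothetical, the sample text is an illustrative fragment rather than the actual file from the pod, and a plain line scan is used instead of a YAML parser, so it can miss unusual layouts.

```python
def enabled_processors(filebeat_yml_text: str, names=("add_kubernetes_metadata",)) -> list:
    """Return the processor names from `names` that appear uncommented in the text."""
    found = []
    for raw in filebeat_yml_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        entry = line.lstrip("- ")            # drop the YAML list marker, if any
        for name in names:
            if entry.startswith(name) and name not in found:
                found.append(name)
    return found


# Illustrative fragment only, not the real configuration from the deployment.
sample = """
processors:
  - add_host_metadata: ~
  # - add_docker_metadata: ~
  - add_kubernetes_metadata: ~
"""

print(enabled_processors(sample))  # ['add_kubernetes_metadata']
```

Running such a check against the filebeat.yml found inside the running pod would show whether the file in use is the customized one or the installation default.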