Closed jmv74211 closed 2 years ago
Working on this issue.
PUT /agents/reconnect endpoint to analyze results.

It has been observed that the call to the PUT /agents/reconnect endpoint works as expected. The TCP connection established between the wazuh-agent and the wazuh-manager is restarted without having to restart the wazuh-agent service. If the wazuh-agent is connected to a load balancer, this allows the wazuh-agent to connect to another worker of the wazuh-cluster.
It has also been verified that restarting the connection with the same wazuh-manager does not cause any kind of failure, since after that restart the wazuh-manager continues to report wazuh-agent alerts.
- The PUT /agents/reconnect endpoint works to force one wazuh-agent, or several (specified as a list in the request parameters), to restart its connection.
- The wazuh-agent service is not restarted.
- If the wazuh-agent is connected to a load balancer and has no persistent connection configuration, the wazuh-agent can connect to another node in the cluster.
- If the wazuh-agent is connected to a wazuh-manager and the connection is restarted, the wazuh-agent continues to report without problems and the wazuh-manager generates the wazuh-agent alerts.

First, I will deploy an environment with the following features:
Installation | Version | Package date |
---|---|---|
Packages | v4.3.0-rc5 | March 29 |
Cluster nodes
NAME TYPE VERSION ADDRESS
master master 4.3.0 172.16.1.40
worker-2 worker 4.3.0 172.16.1.42
worker-1 worker 4.3.0 172.16.1.41
worker-3 worker 4.3.0 172.16.1.43
Agent connections
ID NAME IP STATUS VERSION NODE NAME
000 wazuh-master 127.0.0.1 active Wazuh v4.3.0 master
001 wazuh-agent-1 10.0.2.15 active Wazuh v4.3.0 worker-1
002 wazuh-agent-5 10.0.2.15 active Wazuh v4.3.0 worker-1
003 wazuh-agent-3 10.0.2.15 active Wazuh v4.3.0 worker-3
004 wazuh-agent-2 10.0.2.15 active Wazuh v4.3.0 worker-3
005 wazuh-agent-4 10.0.2.15 active Wazuh v4.3.0 worker-1
Note: For some reason, no agent has connected to worker-2. We will check this when we force reconnections through the API endpoint.
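To make the distribution noted above easier to check after each reconnection, the agent listing can be summarized per node. This is a minimal sketch, assuming the listing has been captured as plain text in the same column layout shown above (for example from a tool such as cluster_control, which is an assumption, since the thread does not say how the listing was produced). The function name `agents_per_node` is hypothetical.

```shell
# Hypothetical helper: count agents per cluster node from a listing on stdin.
# Assumes the layout shown above: numeric agent ID first, node name last.
agents_per_node() {
    # Skip the header (first field is not numeric), tally the last field.
    awk '$1 ~ /^[0-9]+$/ { count[$NF]++ }
         END { for (node in count) print node, count[node] }' |
        LC_ALL=C sort
}
```

Fed with the listing above, it would report one entry for master, three agents on worker-1, and two on worker-3; a node that does not appear in the output (here worker-2) has no agents connected.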
NGINX configuration
load_module /usr/lib/nginx/modules/ngx_stream_module.so;
events {}
stream {
    upstream master {
        server 172.16.1.40:1515;
    }
    upstream mycluster {
        hash $remote_addr consistent;
        server 172.16.1.41:1514;
        server 172.16.1.42:1514;
        server 172.16.1.43:1514;
    }
    server {
        listen 1515;
        proxy_pass master;
    }
    server {
        listen 1514;
        proxy_pass mycluster;
    }
}
PUT /agents/reconnect endpoint to analyze results.

Obtaining an authentication token (using default credentials):
TOKEN=$(curl -u wazuh:wazuh -k -X GET "https://172.16.1.40:55000/security/user/authenticate?raw=true")
The current connection of wazuh-agent-1 is with worker-1.
001 wazuh-agent-1 10.0.2.15 active Wazuh v4.3.0 worker-1
After making the reconnection request:
curl -X PUT -k "https://172.16.1.40:55000/agents/reconnect?agents_list=001" -H "Authorization: Bearer $TOKEN"
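Since the endpoint accepts several agents as a list in the request parameters, the request can be wrapped in a small helper. The following is only a sketch: the function names `reconnect_url` and `reconnect_agents` are hypothetical, the port 55000 and the `agents_list` parameter are taken from the commands in this thread, and passing the IDs as a comma-separated list is an assumption.

```shell
# Hypothetical helpers, not part of Wazuh.

# Build the reconnect URL.
# $1: API host, $2: comma-separated agent IDs (e.g. 001,002)
reconnect_url() {
    printf 'https://%s:55000/agents/reconnect?agents_list=%s\n' "$1" "$2"
}

# Send the PUT request.
# $1: API host, $2: comma-separated agent IDs, $3: JWT token
# -k skips certificate verification, as in the test environment above.
reconnect_agents() {
    curl -s -k -X PUT "$(reconnect_url "$1" "$2")" \
        -H "Authorization: Bearer $3"
}
```

With the token obtained earlier, a call such as `reconnect_agents 172.16.1.40 001,002 "$TOKEN"` would request the reconnection of both agents in one request.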
The log of wazuh-agent-1 shows that it has been reconnected:
2022/04/12 10:36:46 wazuh-agentd: INFO: Wazuh Agent will be reconnected because a reconnect message was received
2022/04/12 10:36:46 wazuh-agentd: INFO: Closing connection to server (172.16.1.50:1514/tcp).
2022/04/12 10:36:46 wazuh-agentd: INFO: Trying to connect to server (172.16.1.50:1514/tcp).
2022/04/12 10:36:46 wazuh-agentd: INFO: (4102): Connected to the server (172.16.1.50:1514/tcp).
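The same check can be scripted against the agent log. This is a minimal sketch that relies only on the message texts shown above; the function names are hypothetical, and the log location (by default /var/ossec/logs/ossec.log on the agent) is passed as an argument rather than hard-coded.

```shell
# Hypothetical helpers for inspecting the agent log.

# Print how many reconnect requests the agent has logged.
# $1: path to the agent log file
count_reconnects() {
    grep -c 'reconnect message was received' "$1"
}

# Print the address of the last server the agent reported connecting to.
# $1: path to the agent log file
last_connected_server() {
    sed -n 's/.*Connected to the server (\(.*\))\.$/\1/p' "$1" | tail -n 1
}
```

Comparing `count_reconnects` before and after the API call gives a quick confirmation that the reconnect message reached the agent.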
When checking which worker wazuh-agent-1 has connected to, we see that it has reconnected to the same worker:
001 wazuh-agent-1 10.0.2.15 active Wazuh v4.3.0 worker-1
This is because the NGINX configuration applies the hash $remote_addr consistent; directive, which makes connections persistent: the same client address is always routed to the same worker.
After commenting out this directive, restarting the NGINX service, and calling the endpoint again to force the reconnection of wazuh-agent-1, we see that it has connected to a new worker, in this case worker-3:
001 wazuh-agent-1 10.0.2.15 active Wazuh v4.3.0 worker-3
If we force another reconnection, we see that in this case it has changed to worker-2:
001 wazuh-agent-1 10.0.2.15 active Wazuh v4.3.0 worker-2
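The step of commenting out the persistence directive, described above, can be sketched as follows. The function name `disable_hash_persistence` is hypothetical, and the configuration path is passed as an argument because the thread does not state where the NGINX configuration lives (/etc/nginx/nginx.conf is a common default, but that is an assumption).

```shell
# Hypothetical helper: comment out the persistence directive in an NGINX
# configuration file, keeping a .bak copy of the original.
# $1: path to the NGINX configuration file
disable_hash_persistence() {
    # Matches the directive only while it is still uncommented, so running
    # the function twice does not add a second "#".
    sed -i.bak \
        's/^\([[:space:]]*\)\(hash \$remote_addr consistent;\)/\1# \2/' "$1"
}
```

After editing, the configuration would typically be validated with `nginx -t` and the service restarted (for example with `systemctl restart nginx`) before forcing the reconnection again.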
I have configured a wazuh-agent to always report to the same wazuh-manager.
I then applied the following syscheck configuration to monitor a directory on the wazuh-agent host:
<directories realtime="yes">/var/log/test</directories>
After that, I forced the connection to restart.
curl -X PUT -k "https://172.16.1.40:55000/agents/reconnect?agents_list=001" -H "Authorization: Bearer $TOKEN"
I have generated a new file to force a new alert:
echo "test" >> /var/log/test/a.txt
The alert was generated correctly in the wazuh-manager:
** Alert 1649770780.2104758: - ossec,syscheck,syscheck_entry_added,syscheck_file,pci_dss_11.5,gpg13_4.11,gdpr_II_5.1.f,hipaa_164.312.c.1,hipaa_164.312.c.2,nist_800_53_SI.7,tsc_PI1.4,tsc_PI1.5,tsc_CC6.1,tsc_CC6.8,tsc_CC7.2,tsc_CC7.3,
2022 Apr 12 13:39:40 (wazuh-agent-1) any->syscheck
Rule: 554 (level 5) -> 'File added to the system.'
File '/var/log/test/a.txt' added
Mode: realtime
Attributes:
- Size: 5
- Permissions: rw-r--r--
- Date: Tue Apr 12 13:39:41 2022
- Inode: 1049535
- User: root (0)
- Group: root (0)
- MD5: d8e8fca2dc0f896fd7cb4cb0031ba249
- SHA1: 4e1243bd22c66e76c2ba9eddc1f91394e57f9f83
- SHA256: f2ca1bb6c7e907d06dafe4687e579fce76b37e4e93b7605022da52e6ccc26fd2
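Checking for this alert can also be scripted on the manager. This is a minimal sketch, assuming the alerts are read from the plain-text alerts log (by default /var/ossec/logs/alerts/alerts.log) and matching only the "File '...' added" line shown above; the function name `alert_exists_for_file` is hypothetical.

```shell
# Hypothetical helper: succeed if the alerts log contains a "file added"
# syscheck alert for the given path.
# $1: path to the alerts log, $2: monitored file path
alert_exists_for_file() {
    # -F treats the pattern as a fixed string, so dots and slashes in the
    # path are matched literally.
    grep -q -F "File '$2' added" "$1"
}
```

For the test above, `alert_exists_for_file /var/ossec/logs/alerts/alerts.log /var/log/test/a.txt` would return success once the alert has been written.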
We want to perform a manual test on the development related to API request to balance Agents (restart TCP session).
The objective is to test the correct functioning and behavior of Wazuh after using the PUT /agents/reconnect API endpoint to force the reconnection of the wazuh-agent to the wazuh-manager.

The background of all this is to be able to restart the TCP connection established between the wazuh-agent and the wazuh-manager without restarting the wazuh-agent service itself (which would imply restarting the daemons, provoking new unwanted scans), so that, if a load balancer is used, it can redirect that wazuh-agent to another wazuh-manager node in the cluster that has less load.

Reference issue: https://github.com/wazuh/wazuh/issues/7896