OSSEC agent not connecting after Linux host restart

EHRETic commented 3 years ago

Hi there,

I have a consistent issue with several Linux hosts (one runs Fedora 33 Server, the others CentOS 8 Server). I have no problems with any of my Windows hosts. My OSSEC server is AlienVault OSSIM. The agent (3.6.0) was installed as a service with this command:

sudo systemctl enable --now ossec-hids.service

Every time a host is rebooted, I get the following error message, and the agent appears disconnected on the server side (it never tries again):

2021/02/02 09:16:38 ossec-execd: INFO: Started (pid: 1045).
2021/02/02 09:16:38 ossec-agentd: INFO: Using notify time: 600 and max time to reconnect: 1800
2021/02/02 09:16:38 going daemon
2021/02/02 09:16:38 starting imsg stuff
2021/02/02 09:16:38 Creating socketpair()
2021/02/02 09:16:38 agentd imsg_init()
2021/02/02 09:16:38 os_dns imsg_init()
2021/02/02 09:16:38 ossec-agentd(1410): INFO: Reading authentication keys file.
2021/02/02 09:16:38 ossec-agentd: INFO: Assigning counter for agent myservername: '5111:700'.
2021/02/02 09:16:38 ossec-agentd: INFO: Assigning sender counter: 6:8034
2021/02/02 09:16:38 ossec-agentd: INFO: Started (pid: 1049).
2021/02/02 09:16:38 ossec-agentd: INFO: Server 1: 192.168.X.Y
2021/02/02 09:16:38 ossec-agentd: INFO: Trying to connect to server 192.168.X.Y, port 1514.
2021/02/02 09:16:38 ossec-agentd(1216): ERROR: Unable to connect to '192.168.X.Y'.
2021/02/02 09:16:38 ossec-logcollector: Remote commands are not accepted from the manager. Ignoring it on the agent.conf

But if I restart the service manually, it connects successfully and I get this in the logs:

2021/02/02 09:27:07 ossec-execd: INFO: Started (pid: 3518).
2021/02/02 09:27:07 going daemon
2021/02/02 09:27:07 ossec-logcollector: Remote commands are not accepted from the manager. Ignoring it on the agent.conf
2021/02/02 09:27:07 ossec-logcollector(1202): ERROR: Configuration error at '/var/ossec/etc/shared/agent.conf'. Exiting.
2021/02/02 09:27:07 starting imsg stuff
2021/02/02 09:27:07 Creating socketpair()
2021/02/02 09:27:07 agentd imsg_init()
2021/02/02 09:27:07 ossec-agentd(1410): INFO: Reading authentication keys file.
2021/02/02 09:27:07 ossec-agentd: INFO: Assigning counter for agent myservername: '5111:700'.
2021/02/02 09:27:07 ossec-agentd: INFO: Assigning sender counter: 6:8034
2021/02/02 09:27:07 ossec-agentd: INFO: Started (pid: 3522).
2021/02/02 09:27:07 ossec-agentd: INFO: Server 1: 192.168.X.Y
2021/02/02 09:27:07 ossec-agentd: INFO: Trying to connect to server 192.168.X.Y, port 1514.
2021/02/02 09:27:07 os_dns imsg_init()
2021/02/02 09:27:07 INFO: Connected to 192.168.149.18 at address 192.168.X.Y, port 1514
2021/02/02 09:27:07 ossec-agentd: DEBUG: agt->sock: 14
2021/02/02 09:27:07 ossec-syscheckd(1756): ERROR: Duplicated directory given: '/etc'.
2021/02/02 09:27:07 ossec-syscheckd(1756): ERROR: Duplicated directory given: '/bin'.
2021/02/02 09:27:08 ossec-agentd(4102): INFO: Connected to server 192.168.X.Y, port 1514.
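
For anyone checking several hosts, the two cases above can be told apart from the log alone. This is only a rough sketch: the function name agent_connected is made up, and it assumes the agent logs exactly the "Trying to connect to server" and "INFO: Connected to server" messages shown above.

```shell
#!/bin/sh
# Report whether the most recent connection attempt in an OSSEC agent log
# was followed by a successful connection.
# Exit status: 0 = connected, 1 = not connected (or no attempt logged).
agent_connected() {
    awk '
        /Trying to connect to server/ { connected = 0 }  # a new attempt resets the state
        /INFO: Connected to server/   { connected = 1 }  # success logged after that attempt
        END { exit (connected ? 0 : 1) }
    ' "$1"
}

# Example (agent log path as used on these hosts):
#   agent_connected /var/ossec/logs/ossec.log || echo "agent did not connect"
```

Run right after a reboot, it should return 1 on an affected host and 0 after a manual service restart.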

The behavior is consistent on all hosts. My config file is the default one, as I'm new to monitoring Linux hosts; the only modification I made is the section that points to my server.

Setting the debug log level to 2 didn't show more info in the logs, so I'm open to any help! Thanks in advance 😉

ddpbsd commented 3 years ago

Is your AlienVault OSSIM using the same version of OSSEC? Check the /var/ossec/logs/ossec.log file on both the server and the agents for extra log messages. You might have to run the ossec-remoted process in debug mode.

EHRETic commented 3 years ago

Is your AlienVault OSSIM using the same version of OSSEC? Check the /var/ossec/logs/ossec.log file on both the server and the agents for extra log messages. You might have to run the ossec-remoted process in debug mode.

No, it doesn't. It seems to run 2.9.1 (I used the command ossec-analysisd -V), and it's "embedded", meaning you can't really touch it, I think. I also looked at the logs on the server side, but they weren't really helpful; I'll have to try debug mode too.

But from what I understand so far, there is nothing wrong on the server side, because if I restart the service on the client once it has booted, everything works fine. It seems like the network is not "ready" when the HIDS service starts, or something like that.

EHRETic commented 3 years ago

Could not see anything on server side... Anyone? Any clue?

Is there any way to change dependencies to start the service later? I'd like to try that. 😉

bigtrucker89 commented 3 years ago

Can't say for sure if a 3.6.0 agent is supported by something as old as a 2.9.1 server, but assuming it would work, what happens when you restart the agent after the network is plumbed? If it works then, you're right that the issue is the agent starting before the network is available on the agent system.

EHRETic commented 3 years ago

Can't say for sure if a 3.6.0 agent is supported by something as old as a 2.9.1 server, but assuming it would work, what happens when you restart the agent after the network is plumbed? If it works then, you're right that the issue is the agent starting before the network is available on the agent system.

That is exactly what is happening, whenever I restart the service manually (because the service is started after reboot), it works straight away and I see new events right after in SIEM. That is also why I'm wondering how I can defer the service startup, which at this point would be an acceptable solution.

I would probably know if I was a Linux expert, but I'm not! 😁

ddpbsd commented 3 years ago

You can add something like ExecStartPre=/bin/sleep 30 to the [Service] section of the appropriate systemd service files. But ossec uses some target file to pull them all together and I don't see a way to add a delay to this file. Did you run ossec-remoted in debug mode (/var/ossec/bin/ossec-remoted -d)? It might provide more information. Have you been able to upgrade it to the same version you're running on your agents?
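
One possible way to add that line without editing the unit file itself is a systemd drop-in. This is a sketch only, not tested against the OSSEC packaging: the function name, the file name 10-start-delay.conf and the default directory are assumptions, and it relies on systemd merging drop-ins from /etc/systemd/system/<unit>.d/ into the unit.

```shell
#!/bin/sh
# Write a systemd drop-in that delays the agent start by a fixed number of seconds.
# $1 = drop-in directory (assumed default below), $2 = delay in seconds (default 30).
write_delay_dropin() {
    dir=${1:-/etc/systemd/system/ossec-hids.service.d}
    delay=${2:-30}
    mkdir -p "$dir" || return 1
    cat > "$dir/10-start-delay.conf" <<EOF
[Service]
ExecStartPre=/bin/sleep $delay
EOF
}

# Example (as root), followed by a reload so systemd picks up the drop-in:
#   write_delay_dropin && systemctl daemon-reload
```

Running systemctl cat ossec-hids.service afterwards should list the drop-in below the main unit if it was picked up.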

bigtrucker89 commented 3 years ago

So in nearly every case the network should already be plumbed before the ossec agent is started. Are you using a GUI desktop with your Linux system?

If so, on many distros where the desktop GUI is enabled, the network isn't started until after you log into the GUI, whereas Linux/UNIX normally starts the network way before that. Services like the ossec agent assume the network is started on boot (which is true 99% of the time, otherwise servers wouldn't work very well). So if that's your case, that's your problem: configure the network to start on boot and not on login.

Otherwise, your system has somehow been configured to start the network after services like ossec are started, or the ossec agent has somehow been configured to start way too early. In that case, Dan's solution should work, but I'd check the start order of your services if your network is set to be plumbed on boot.
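
A rough way to check that start order is to ask systemd directly. The wrapper name show_unit_ordering is made up for illustration; the unit name is assumed to be ossec-hids.service, as in the enable command at the top of this thread.

```shell
#!/bin/sh
# Print where a unit is loaded from and what it is ordered against.
# $1 = unit name (defaults to the service name used in this thread).
show_unit_ordering() {
    unit=${1:-ossec-hids.service}
    systemctl show -p FragmentPath -p SourcePath -p After -p Before -p Wants "$unit"
}

# Example:
#   show_unit_ordering
#   systemd-analyze critical-chain ossec-hids.service   # time-ordered chain leading to the unit
```

If After= lists no network-related target, the agent is free to start before the network is up.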

EHRETic commented 3 years ago

So in nearly every case the network should already be plumbed before the ossec agent is started. Are you using a GUI desktop with your Linux system?

Nope, I'm a purist Linux N0ob! I use either Fedora Server or CentOS 8 Server without a GUI! 😁 (Well, I'm cheating a little: I use Cockpit, but all admin is still done on the command line.)

@ddpbsd I'll check your solution tomorrow, thanks for the help! 😉

EHRETic commented 3 years ago

You can add something like ExecStartPre=/bin/sleep 30 to the [Service] section of the appropriate systemd service files. But ossec uses some target file to pull them all together and I don't see a way to add a delay to this file.

Well, this was definitely part of the solution, but only "the last bit"... 😁

It definitely has something to do with how the service is installed. By default, the service "configuration" file is located in the /run/systemd/generator.late folder.

In this location, the file is handled automatically and you can't just edit it to change the behaviour. The service also starts very early in the boot process (I read a lot of confusing things about this).

So here is what I tried: I moved this file to the default location (/usr/lib/systemd/system/) and edited it to add the ExecStartPre line and an [Install] section, which is needed to enable the service.

The ossec-hids.service file now looks like this:

[Unit]
Documentation=man:systemd-sysv-generator(8)
SourcePath=/etc/rc.d/init.d/ossec-hids
Description=SYSV: OSSEC-HIDS is an Open Source Host-based Intrusion Detection System.
Before=multi-user.target
Before=multi-user.target
Before=multi-user.target
Before=graphical.target

[Service]
Type=forking
Restart=no
TimeoutSec=5min
IgnoreSIGPIPE=no
KillMode=process
GuessMainPID=no
RemainAfterExit=yes
ExecStartPre=/bin/sleep 30
ExecStart=/etc/rc.d/init.d/ossec-hids start
ExecStop=/etc/rc.d/init.d/ossec-hids stop
ExecReload=/etc/rc.d/init.d/ossec-hids reload

[Install]
WantedBy=multi-user.target

With this, the service starts automatically after a delay, and it connects to the server without needing a manual restart.
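
A fixed 30-second sleep could probably be replaced by a bounded poll that returns as soon as the server is reachable. A minimal sketch, assuming that "a route to the server exists" is a good enough readiness signal; the function name wait_until and the SERVER_IP variable are placeholders, not anything OSSEC provides.

```shell
#!/bin/sh
# Retry a command once per second until it succeeds or the timeout expires.
# $1 = timeout in seconds, remaining arguments = command to run.
wait_until() {
    timeout=$1
    shift
    elapsed=0
    until "$@" >/dev/null 2>&1; do
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Example: wait up to 60 s for a route to the OSSEC server (SERVER_IP is a placeholder):
#   wait_until 60 ip route get "$SERVER_IP"
```

Saved as a small script, this could be called from ExecStartPre in place of /bin/sleep 30, so the delay is only as long as the network actually needs.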

So now, how do we fix that "forever"?

PS: I repeat, I'm not a Linux specialist and I hope this doesn't open a security Pandora's box... 😉

atomicturtle commented 3 years ago

Wouldn't something like this work?

After=network.target

EHRETic commented 3 years ago

Wouldn't something like this work?

After=network.target

I'll try that 😉 It might be cleaner than waiting.

EHRETic commented 3 years ago

Just FYI, when the service is installed "by default" (and enabled), it looks like this in Cockpit (note the "static" in the status):

When I move the service file to the systemd folder, it looks like this:

EHRETic commented 3 years ago

After=network.target

So unfortunately, this didn't work; it still seems to be too early in the boot process.

The following configuration seems to work... I kept the Before (removed the duplicates though) and added the same After and Wants dependencies as NGINX (I use it a lot):

[Unit]
Documentation=man:systemd-sysv-generator(8)
SourcePath=/etc/rc.d/init.d/ossec-hids
Description=SYSV: OSSEC-HIDS is an Open Source Host-based Intrusion Detection System.
Before=multi-user.target
Before=graphical.target
After=network-online.target remote-fs.target nss-lookup.target
Wants=network-online.target

[Service]
Type=forking
Restart=no
TimeoutSec=5min
IgnoreSIGPIPE=no
KillMode=process
GuessMainPID=no
RemainAfterExit=yes
ExecStart=/etc/rc.d/init.d/ossec-hids start
ExecStop=/etc/rc.d/init.d/ossec-hids stop
ExecReload=/etc/rc.d/init.d/ossec-hids reload

[Install]
WantedBy=multi-user.target
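
The same ordering could probably also be expressed as a drop-in, leaving the generated unit where it is. This is an untested sketch: the function name, the file name 20-network-online.conf and the default directory are assumptions. Note that network-online.target only delays anything if a wait-online service is enabled; with NetworkManager that is NetworkManager-wait-online.service.

```shell
#!/bin/sh
# Write a drop-in that orders the agent after network-online.target.
# $1 = drop-in directory (assumed default below).
write_network_dropin() {
    dir=${1:-/etc/systemd/system/ossec-hids.service.d}
    mkdir -p "$dir" || return 1
    cat > "$dir/20-network-online.conf" <<'EOF'
[Unit]
After=network-online.target remote-fs.target nss-lookup.target
Wants=network-online.target
EOF
}

# Example (as root):
#   write_network_dropin && systemctl daemon-reload
#   systemctl is-enabled NetworkManager-wait-online.service
```

Unlike moving the file to /usr/lib/systemd/system/, a drop-in under /etc is less likely to be overwritten by a package update.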

ossec/ossec-hids #1946