elastic / elastic-agent

Elastic Agent - single, unified way to add monitoring for logs, metrics, and other types of data to a host.

Fleet installation script fails to detect error in service start #117

Open psanz-estc opened 3 years ago

psanz-estc commented 3 years ago

Description: elastic-agent install fails to detect a failed service start and reports a misleading message, "Installation was successful and Elastic Agent is running.", even though the service was unable to start (e.g. because another process is already bound to port 6789).

The script should at least report that the agent was installed but that there was a problem starting the service.
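
The kind of check requested here could look roughly like the sketch below. It is not how elastic-agent is implemented; it only illustrates the idea of confirming that the unit stays up before printing a success message. It assumes a systemd host and the unit name elastic-agent.service seen in the logs of this report; the function names and the number of polls are hypothetical. Because systemd keeps restarting the failing unit, the sketch requires several consecutive "active" results instead of trusting a single poll.

```python
import subprocess
import time
from typing import Callable


def unit_state(unit: str = "elastic-agent.service") -> str:
    """Return the state printed by `systemctl is-active` (e.g. 'active', 'failed')."""
    result = subprocess.run(
        ["systemctl", "is-active", unit],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit code just means "not active"
    )
    return result.stdout.strip()


def service_stays_active(
    get_state: Callable[[], str] = unit_state,
    checks: int = 5,        # hypothetical value, tune as needed
    interval: float = 1.0,  # seconds between polls
) -> bool:
    """Return True only if the unit reports 'active' on every poll."""
    for i in range(checks):
        if get_state() != "active":
            return False
        if i < checks - 1:
            time.sleep(interval)
    return True
```

An installer could call service_stays_active() after starting the service and, when it returns False, print something like "Elastic Agent was installed, but the service failed to start" and point the user at journalctl -u elastic-agent.service.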

How to reproduce the bug

1) A process is already listening on port 6789, e.g.:

# netstat -natop | grep 6789
tcp6       0      0 :::6789                 :::*       LISTEN      1891/docker-proxy    off (0.00/0/0)
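
The same condition can be checked programmatically before installing. The sketch below simply tries to bind the address and treats "address already in use" as a conflict. The function name is hypothetical, and the default host and port are the ones that appear in the journal output of this report (127.0.0.1:6789).

```python
import errno
import socket


def port_is_free(host: str = "127.0.0.1", port: int = 6789) -> bool:
    """Return True if host:port can be bound, False if it is already in use."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return False
            raise  # any other error (e.g. permissions) is not a port conflict
        return True
```

The socket is closed again as soon as the check finishes, so another process could still take the port between the check and the actual service start. The check is a hint for the user, not a guarantee.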

2) Run the elastic-agent install command from the CLI:

ubuntu@server:~$ sudo ./elastic-agent install -f --kibana-url=https://<URL> --enrollment-token=<token>
The Elastic Agent is currently in BETA and should not be used in production

2020-12-03T16:43:31.069+0100    DEBUG   kibana/client.go:170    Request method: POST, path: /api/fleet/agents/enroll
Successfully enrolled the Elastic Agent.
Installation was successful and Elastic Agent is running.

The installation script reports "Installation was successful and Elastic Agent is running." but the agent never appears as enrolled in the Kibana Fleet UI.

3) The output of journalctl -u elastic-agent.service shows that the process could not start because the address was already in use:

#  journalctl -u elastic-agent.service
-- Logs begin at Sat 2020-08-29 18:15:02 CEST, end at Thu 2020-12-03 16:43:51 CET. --
nov 17 16:25:53 server systemd[1]: Stopped Elastic Agent is a unified agent to observe, monitor and protect your system..
nov 17 16:25:53 server systemd[1]: Started Elastic Agent is a unified agent to observe, monitor and protect your system..
nov 17 16:25:53 server elastic-agent[1514327]: starting GRPC listener: listen tcp 127.0.0.1:6789: bind: address already in use
nov 17 16:25:53 server systemd[1]: elastic-agent.service: Main process exited, code=exited, status=1/FAILURE
nov 17 16:25:53 server systemd[1]: elastic-agent.service: Failed with result 'exit-code'.
nov 17 16:27:53 server systemd[1]: elastic-agent.service: Scheduled restart job, restart counter is at 2.
nov 17 16:27:53 server systemd[1]: Stopped Elastic Agent is a unified agent to observe, monitor and protect your system..
nov 17 16:27:53 server systemd[1]: Started Elastic Agent is a unified agent to observe, monitor and protect your system..
nov 17 16:27:54 server elastic-agent[1514463]: starting GRPC listener: listen tcp 127.0.0.1:6789: bind: address already in use
nov 17 16:27:54 server systemd[1]: elastic-agent.service: Main process exited, code=exited, status=1/FAILURE
nov 17 16:27:54 server systemd[1]: elastic-agent.service: Failed with result 'exit-code'.
...
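
To surface this cause to the user, the journal lines could be scanned for the bind error. The sketch below is only an illustration: the function name is hypothetical and the pattern is derived from the single message format shown in the log above, so other versions may word it differently.

```python
import re
from typing import Iterable, Optional

# Matches e.g. "listen tcp 127.0.0.1:6789: bind: address already in use"
BIND_ERROR = re.compile(r"listen tcp (\S+): bind: address already in use")


def find_bind_conflict(journal_lines: Iterable[str]) -> Optional[str]:
    """Return the first address that failed to bind, or None if there is none."""
    for line in journal_lines:
        match = BIND_ERROR.search(line)
        if match:
            return match.group(1)
    return None
```

Fed the lines printed by journalctl -u elastic-agent.service, this returns the conflicting address, which could then be included in an error message.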

Workaround: change the default port in /opt/Elastic/Agent/elastic-agent.yml from 6789 to a free port, e.g. 16789:

fleet:
  enabled: true
agent.grpc:
  address: localhost
  port: 16789

Then start the service and check that it is up:

# sudo systemctl start elastic-agent.service
# 
# sudo  journalctl -u elastic-agent.service -f
-- Logs begin at Sat 2020-08-29 18:15:02 CEST. --
dic 03 16:53:20 server systemd[1]: Started Elastic Agent is a unified agent to observe, monitor and protect your system..

elasticmachine commented 3 years ago

Pinging @elastic/ingest-management (Team:Ingest Management)

elasticmachine commented 2 years ago

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)