DataDog / datadog-agent

Main repository for Datadog Agent
https://docs.datadoghq.com/
Apache License 2.0

Datadog-agent v7 is not deployed as a service on CentOS 6 #5173

Open l1x opened 4 years ago

l1x commented 4 years ago

Output of the info page (if this is a bug)

Error: Could not start Service[datadog-agent]: Execution of '/sbin/service datadog-agent start' returned 1: datadog-agent: unrecognized service
Error: /Stage[main]/Datadog_agent::Redhat/Service[datadog-agent]/ensure: change from 'stopped' to 'running' failed: Could not start Service[datadog-agent]: Execution of '/sbin/service datadog-agent start' returned 1: datadog-agent: unrecognized service

Describe what happened:

After yum installing datadog-agent it is not recognized as a service and cannot be started.

https://s3.amazonaws.com/yum.datadoghq.com/stable/7/x86_64/datadog-agent-7.18.0-1.x86_64.rpm

Describe what you expected:

Properly starting agent.

Steps to reproduce the issue:

yum install https://s3.amazonaws.com/yum.datadoghq.com/stable/7/x86_64/datadog-agent-7.18.0-1.x86_64.rpm && service datadog-agent restart 

Additional environment details (Operating System, Cloud provider, etc):

CentOS 6

ogaca-dd commented 4 years ago

Hey @l1x ,

I reproduced your error but can you try using sudo restart datadog-agent instead of service datadog-agent restart?

l1x commented 4 years ago

@ogaca-dd you do not have an init script anymore and the only start/stop (service related) content that you have in the package is systemd service files. CentOS 6 does not have systemd and you do not provide any alternative system management service file as far as I can see. Is it not the case?

KSerrania commented 4 years ago

Hi @l1x,

For Agent 6 and 7 on CentOS, we support systemd and upstart (the latter is available and installed by default on CentOS 6). The upstart job definition files are installed in /etc/init:

$ rpm -ql datadog-agent | grep /etc/init/datadog-agent
/etc/init/datadog-agent-process.conf
/etc/init/datadog-agent-sysprobe.conf
/etc/init/datadog-agent-trace.conf
/etc/init/datadog-agent.conf

To start/stop/restart the Agent service on CentOS 6, you thus need to run sudo start/stop/restart datadog-agent, or alternatively sudo initctl start/stop/restart datadog-agent, instead of using the service command.
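As a rough illustration of the point above, the sketch below maps an init system to the command that manages the Agent. The function names `agent_ctl_cmd` and `detect_init_system` and the labels `upstart` and `systemd` are hypothetical, chosen for this example only. The detection logic is an assumption (it checks for `/run/systemd/system` first, then for an `initctl` binary) and may need adjusting on hosts that ship both.

```shell
#!/bin/sh
# Hypothetical helper: print the command that manages the main Agent
# service for a given init system. The labels "upstart" and "systemd"
# are names chosen for this sketch, not values defined by the Agent.
agent_ctl_cmd() {
    init_system=$1   # "upstart" or "systemd"
    action=$2        # start, stop or restart
    case "$init_system" in
        upstart) echo "sudo initctl $action datadog-agent" ;;
        systemd) echo "sudo systemctl $action datadog-agent" ;;
        *)
            echo "unsupported init system: $init_system" >&2
            return 1
            ;;
    esac
}

# Rough detection, assuming that /run/systemd/system exists only when
# systemd is the running init, and that initctl implies upstart otherwise.
detect_init_system() {
    if [ -d /run/systemd/system ]; then
        echo systemd
    elif command -v initctl >/dev/null 2>&1; then
        echo upstart
    else
        echo unknown
    fi
}
```

On a CentOS 6 host, `agent_ctl_cmd "$(detect_init_system)" restart` would be expected to print the `initctl` form rather than a `service` invocation.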

l1x commented 4 years ago

$> start datadog-agent
restart: Unable to connect to Upstart: Failed to connect to socket /com/ubuntu/upstart: Connection refused

KSerrania commented 4 years ago

Hi @l1x,

Could you give us some more details about your environment, to help us reproduce your issue? In particular:

Thanks!

jmcnatt commented 4 years ago

I am also seeing this issue in my environment.

initctl is able to successfully start/stop the agent process, but the datadog-datadog_agent Puppet module relies on using service and chkconfig to manage the service.

Info: /Stage[main]/Datadog_agent/File[/etc/datadog-agent/datadog.yaml]: Scheduling refresh of Service[datadog-agent]
Info: /Stage[main]/Datadog_agent/File[/etc/datadog-agent/datadog.yaml]: Scheduling refresh of Service[datadog-agent]
Error: Could not enable datadog-agent: Execution of '/sbin/chkconfig --add datadog-agent' returned 1: error reading information on service datadog-agent: No such file or directory
Error: /Stage[main]/Datadog_agent::Redhat/Service[datadog-agent]/enable: change from 'false' to 'true' failed: Could not enable datadog-agent: Execution of '/sbin/chkconfig --add datadog-agent' returned 1: error reading information on service datadog-agent: No such file or directory
Error: /Stage[main]/Datadog_agent::Redhat/Service[datadog-agent]: Failed to call refresh: Could not stop Service[datadog-agent]: Execution of '/sbin/service datadog-agent stop' returned 1: datadog-agent: unrecognized service
Error: /Stage[main]/Datadog_agent::Redhat/Service[datadog-agent]: Could not stop Service[datadog-agent]: Execution of '/sbin/service datadog-agent stop' returned 1: datadog-agent: unrecognized service
Info: Class[Datadog_agent::Redhat]: Unscheduling all events on Class[Datadog_agent::Redhat]

sadcomusa1 commented 4 years ago

I am getting a similar problem on CentOS 6.10. After the datadog-agent (version 6 or 7) is installed, it launches datadog-agent-sysprobe stop/pre-start, a process that never stops.

The datadog-agent service is not listed by the chkconfig --list command. When I run sudo start datadog-agent, it tells me "job is already running: datadog-agent"...

I have to use a second terminal session to that server in order to run sudo stop datadog-agent and stop that sysprobe from running. As soon as I start it again, sysprobe continues...

That doesn't seem right. Please, help.

Slava

KSerrania commented 4 years ago

Hi @jmcnatt,

Thanks for the report!

I see you're using Puppet on a CentOS 6-like system. Have you tried setting the service provider to upstart:

class{ 'datadog_agent':
  service_provider => 'upstart'
}

as mentioned in the puppet module's installation documentation?

KSerrania commented 4 years ago

Hi @sadcomusa1,

Thanks for the report!

When the Agent is installed, four Agent services are defined: datadog-agent, datadog-agent-process, datadog-agent-trace and datadog-agent-sysprobe. These four are upstart services, and won't show up in chkconfig --list (you can do initctl list to see them instead).

The datadog-agent service is the main one, and will start the other three processes when started.

Thus, in a standard setting, the datadog-agent service is started, which in turn starts the three other services. datadog-agent-process and datadog-agent-trace will then be running, but datadog-agent-sysprobe will be stopped in the pre-start section if the /etc/datadog-agent/system-probe.yaml configuration file does not exist (see the logic here).

In your case, the status datadog-agent-sysprobe stop/pre-start indicates that the sysprobe service is indeed stopped. If it was running, you should see a system-probe process in your process list. Could you give the output of ps aux | grep "system-probe" to check if that is indeed the case?
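To make the status check above easier to script, here is a minimal sketch that extracts the goal/state pair from a status line of the form `job goal/state` or `job goal/state, process PID`, which is the format upstart's `initctl` prints. The function names `job_state` and `is_job_running` are hypothetical helpers introduced for this example.

```shell
#!/bin/sh
# Hypothetical helper: extract "goal/state" from an upstart status line
# such as "datadog-agent start/running, process 1234".
job_state() {
    line=$1
    rest=${line#* }     # drop the job name and the space after it
    rest=${rest%%,*}    # drop ", process PID" when present
    echo "$rest"
}

# Succeeds only when the job reports start/running.
is_job_running() {
    case "$(job_state "$1")" in
        start/running) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `is_job_running "$(initctl status datadog-agent-sysprobe)"` would fail for a job reporting `stop/pre-start`. Confirming with `ps aux | grep "system-probe"`, as suggested above, remains the more direct check.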

sadcomusa1 commented 4 years ago

Hello Kylian, I have attached a screenshot that shows the output of the ps aux command. You can see that I have two connections to that server: one via the VMware console and another via an SSH session. In the VMware console view I see the command running even though I am not logged into it.

Slava

sadcomusa1 commented 4 years ago

@KSerrania

In case I did not send the right output earlier:

[Screenshot: system-probe]

sadcomusa1 commented 4 years ago

And this process/job runs non-stop; it even shows in the console with no one logged in (this is CentOS 6.10).

[Screenshot: sysprobe-never-stops]

jmcnatt commented 4 years ago

Hey @KSerrania - I just wanted to confirm that adding service_provider => 'upstart' resolved the issue. In my case, I am using hiera and added datadog_agent::service_provider: 'upstart'