DataDog / datadog-agent

Main repository for Datadog Agent
https://docs.datadoghq.com/
Apache License 2.0

[BUG] Agent does not log anything when it fails to start due to configuration error #23009

Open timmc-edx opened 7 months ago

timmc-edx commented 7 months ago

Agent Environment: 7.51.0 on Ubuntu in a Docker container

Describe what happened: I made a syntax error in /etc/datadog-agent/datadog.yaml (accidentally typed "." on the first line, which was otherwise blank) and couldn't figure out why the agent was failing to start.

Describe what you expected: I would expect there to be error logs in /var/log/datadog/agent.log indicating a config error, but there were no new logs there at all.

sgnn7 commented 7 months ago

Hi @timmc-edx,

Were your host's service logs, agent configcheck, and agent diagnose not adequate in troubleshooting this?

PS: More directly to your suggestion: since log_file is itself configured within that YAML file, it's a bit of a catch-22 to implement such a feature without some tricky considerations.
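One way around the catch-22 described above is to write the parse error to a fixed default log path whenever the config cannot be read. The following is only a minimal sketch of that idea, not how the Agent works: parse_config is a toy stand-in for a real YAML parser, load_with_fallback_log is a hypothetical helper, and the default path is simply the one mentioned in this thread.

```python
# Hypothetical sketch: none of these names are part of the Datadog Agent.
# The default path is the log location mentioned in this thread.
DEFAULT_LOG_FILE = "/var/log/datadog/agent.log"


def parse_config(text):
    """Toy stand-in for a YAML parser: accepts only flat 'key: value' lines."""
    config = {}
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = stripped.partition(":")
        if not sep or not key.strip():
            raise ValueError(
                f"line {number}: expected 'key: value', got {stripped!r}"
            )
        config[key.strip()] = value.strip()
    return config


def load_with_fallback_log(text, default_log_file=DEFAULT_LOG_FILE):
    """Parse the config; on failure, append the error to the default log file."""
    try:
        return parse_config(text)
    except ValueError as err:
        # log_file cannot be trusted (or even read) at this point,
        # so fall back to the built-in default location.
        with open(default_log_file, "a", encoding="utf-8") as log:
            log.write(f"ERROR unable to load config: {err}\n")
        return None
```

With this shape, a stray "." on the first line leaves a "line 1" error in the default log file even though log_file was never successfully read.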

timmc-edx commented 7 months ago

I didn't try those, but it does appear that they would have helped. Maybe one or both of those could be listed on the agent troubleshooting page? (That's where I looked for guidance.)

timmc-edx commented 7 months ago

(Oddly enough, I can't find any system log files that cover service start/stop.)

sgnn7 commented 7 months ago

@timmc-edx I'll take a look at the troubleshooting page to see if we can improve it. As for service log files, depending on the platform (I'm assuming Linux and VM/host install) you should be able to do: sudo journalctl -u datadog-agent -f

timmc-edx commented 7 months ago

Thanks!

But yeah, there's no journalctl on this system -- I'm using the documented service command to start the agent, rather than systemd (which doesn't seem to be available here).

claco commented 4 months ago

Same problem-ish just this morning. Fresh install, and all was well after setting:

dogstatsd_non_local_traffic: true

Then I set apm_non_local_traffic: true, and the service failed to start.

datadog-agent diagnose just says:

Error: unable to load Datadog config file: While parsing config: yaml: line 1285: did not find expected key

That would be the line for apm_non_local_traffic. I do get the same message in journalctl for that service as well, but there is indeed nothing in /var/log/datadog/agent.log.

In this case, I understand the error now. It really means "the indent level is wrong, make sure to uncomment the # apm_config: line too", but yeah, I had to dig a bit to get to the error.
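The mistake described above (an indented key left uncommented while its parent line is still commented out) can be caught with a rough pre-check. This is a heuristic sketch only: find_orphaned_keys is a hypothetical helper, not an Agent command, and the sample config is reconstructed from the description in this comment rather than copied from the actual file.

```python
def find_orphaned_keys(text):
    """Return (line_number, line) pairs for indented keys with no open parent.

    Heuristic only: a parent counts as "open" when the most recent
    uncommented line at column 0 ends with ':' (a mapping with nested keys).
    """
    orphans = []
    parent_open = False
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments do not change nesting
        if line[0] in " \t":
            if not parent_open:
                orphans.append((number, stripped))
        else:
            parent_open = stripped.endswith(":")
    return orphans


# Reconstructed example: the parent key is still commented out.
broken = (
    "dogstatsd_non_local_traffic: true\n"
    "# apm_config:\n"
    "  apm_non_local_traffic: true\n"
)
# Same file with the parent key uncommented.
fixed = (
    "dogstatsd_non_local_traffic: true\n"
    "apm_config:\n"
    "  apm_non_local_traffic: true\n"
)

print(find_orphaned_keys(broken))
print(find_orphaned_keys(fixed))
```

The check flags the indented apm_non_local_traffic line in the first sample and nothing in the second. It does not replace datadog-agent diagnose; it only points at the likely cause faster than a bare line number does.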