DataDog / datadog-process-agent

Datadog Process Agent
https://datadoghq.com

5.17 Segfault - invalid memory address or nil pointer dereference in Docker #51

Closed 13rac1 closed 6 years ago

13rac1 commented 7 years ago

Just attempted an upgrade from 5.16 to 5.17. We install and run dd-agent with Ansible, and the agent service task keeps reporting changed.

TASK: [datadog-agent | Ensure datadog-agent is running] *********************** 
changed: [server-name]

Check the service:

$ sudo service datadog-agent status
datadog-agent:collector          RUNNING   pid 8444, uptime 2:01:02
datadog-agent:dogstatsd          RUNNING   pid 8434, uptime 2:01:02
datadog-agent:forwarder          RUNNING   pid 8433, uptime 2:01:02
datadog-agent:go-metro           EXITED    Aug 29 05:09 PM
datadog-agent:jmxfetch           EXITED    Aug 29 05:09 PM
datadog-agent:process-agent      FATAL     Exited too quickly (process log may have details)
datadog-agent:trace-agent        RUNNING   pid 8432, uptime 2:01:02
Datadog Agent (supervisor) is NOT running all child processes

Run process-agent manually:

$ /opt/datadog-agent/bin/process-agent
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x7b247b]

goroutine 58 [running]:
github.com/DataDog/datadog-process-agent/util/docker.findDockerNetworks(0xc4201360c0, 0x40, 0x2070, 0x0, 0x40, 0xc420082dc0, 0xc42041c3c0)
    /home/jenkins/workspace/process-agent-build-ddagent/go/src/github.com/DataDog/datadog-process-agent/util/docker/docker.go:541 +0xbb
github.com/DataDog/datadog-process-agent/util/docker.(*dockerUtil).dockerContainers(0xc4203b0080, 0xae, 0x100, 0xc4203f8d80, 0x0, 0x0)
    /home/jenkins/workspace/process-agent-build-ddagent/go/src/github.com/DataDog/datadog-process-agent/util/docker/docker.go:329 +0x7bf
github.com/DataDog/datadog-process-agent/util/docker.(*dockerUtil).containers(0xc4203b0080, 0x9, 0x4072efae147ae148, 0x4055b51eb851eb85, 0x40f1296b33333333, 0x4007c28f5c28f5c3)
    /home/jenkins/workspace/process-agent-build-ddagent/go/src/github.com/DataDog/datadog-process-agent/util/docker/docker.go:381 +0x99b
github.com/DataDog/datadog-process-agent/util/docker.AllContainers(0xed137b600, 0xc420492000, 0x1, 0x1, 0x0)
    /home/jenkins/workspace/process-agent-build-ddagent/go/src/github.com/DataDog/datadog-process-agent/util/docker/docker.go:220 +0x7c
github.com/DataDog/datadog-process-agent/checks.(*ContainerCheck).Run(0x1709aa0, 0xc42039e160, 0x4d658222, 0x0, 0x0, 0x0, 0x0, 0x0)
    /home/jenkins/workspace/process-agent-build-ddagent/go/src/github.com/DataDog/datadog-process-agent/checks/container.go:50 +0x96
main.(*Collector).runCheck(0xc42040c200, 0x16cec00, 0x1709aa0)
    /home/jenkins/workspace/process-agent-build-ddagent/go/src/github.com/DataDog/datadog-process-agent/agent/collector.go:97 +0x6a
main.(*Collector).run.func2(0xc42040c200, 0xc420416900, 0x16cec00, 0x1709aa0)
    /home/jenkins/workspace/process-agent-build-ddagent/go/src/github.com/DataDog/datadog-process-agent/agent/collector.go:130 +0x347
created by main.(*Collector).run
    /home/jenkins/workspace/process-agent-build-ddagent/go/src/github.com/DataDog/datadog-process-agent/agent/collector.go:153 +0x1ac
conorbranagan commented 7 years ago

@eosrei It looks like this might be due to the version of Docker you're running, in which the network settings are unavailable.

As a short-term workaround, can you try disabling network collection altogether? This can be done by adding the following to your datadog.conf:

[process.config]
collect_docker_network = false

Or by setting the environment variable DD_COLLECT_DOCKER_NETWORK=false.
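For anyone applying the workaround above by hand, a minimal sequence might look like the following. This assumes the default Agent 5 config path /etc/dd-agent/datadog.conf and the sysvinit service name used earlier in this thread; adjust both if your install differs.

```shell
# Append the workaround stanza to the Agent 5 config
# (path assumed; adjust for your install)
printf '\n[process.config]\ncollect_docker_network = false\n' | \
  sudo tee -a /etc/dd-agent/datadog.conf

# Restart so the process-agent picks up the new setting
sudo service datadog-agent restart
```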

If you have a chance, could you pass along your Docker version so we can verify our fix?

conorbranagan commented 6 years ago

@eosrei Are you still seeing this issue in 5.17.2?

13rac1 commented 6 years ago

I'll run a new test. Thank you!

13rac1 commented 6 years ago

@conorbranagan It's working correctly now. That legacy system is running Docker 1.9. I've updated it and all servers are reporting to DataDog correctly. Thank you!