DataDog / datadog-agent

Main repository for Datadog Agent
https://docs.datadoghq.com/
Apache License 2.0

Agent conflicts with Docker Registry on port 5000 #4229

Open alexeyzimarev opened 5 years ago

alexeyzimarev commented 5 years ago

Describe what happened:

After installing the agent on our GitLab server, we encountered issues with GitLab Container Registry. The issue didn't appear immediately after the agent was installed, but after a scheduled server update followed by a restart.

Suddenly, all attempts to push to the container registry started to fail with a 404 error. This significantly disrupted the work of our development team, because the registry had been working and stopped out of the blue.

After checking GitLab omnibus logs, I found out that the Docker registry keeps restarting, reporting that 127.0.0.1:5000 is already in use. Then I found out that the agent is listening on that port.
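A minimal sketch of how the conflict can be confirmed from the host, assuming Python 3 is available there. `is_port_in_use` is a hypothetical helper name, not part of the Agent or GitLab, and it only reports whether something accepts connections on the port, not which process owns it.

```python
import socket


def is_port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 5000 is the port reported as "already in use" in the registry logs.
    print("port 5000 in use:", is_port_in_use(5000))
```

To find out which process holds the port, a tool such as `ss` or `lsof` is still needed.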

I had to change the agent settings so it doesn't use port 5000 and I don't really know what consequences it would have on the agent itself.

Describe what you expected:

Since the GitLab integration is one of Datadog's standard integrations, I did not expect the agent to come into direct conflict with a widely used GitLab component such as the container registry.

Steps to reproduce the issue:

Install GitLab using Omnibus and enable the container registry. Check that everything works as expected. Install the Datadog agent on the same machine, reboot the machine, and observe that the GitLab container registry keeps restarting because port 5000 is already in use.

Additional environment details (Operating System, Cloud provider, etc):

Ubuntu 18.04

OneCricketeer commented 4 years ago

That is the go-expvar server port. It can be changed without conflict

https://docs.datadoghq.com/integrations/go_expvar/

olivielpeau commented 4 years ago

Indeed, the port (default: 5000) on which the Agent listens locally can be changed with the expvar_port option of the Agent config (see https://github.com/DataDog/datadog-agent/blob/07f5937dcf6d21dd938ba4683332adc5677e92f0/pkg/config/config_template.yaml#L213-L216). If it's set to a value other than the default, the Agent will continue to work: all the parts of the Agent that use this port will pick up the change.

The only exception is if you've set up the go_expvar Agent integration to monitor the Agent itself (on localhost:5000). If you change the expvar port of the Agent, and want the go_expvar check to continue monitoring the Agent, you'll have to update the port it connects to accordingly, as documented in https://docs.datadoghq.com/integrations/go_expvar/ (and mentioned by @cricket007 ).
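A rough sketch of how one might verify that the Agent's expvar server answers on the new port after the change. It assumes the Agent serves Go's standard expvar path `/debug/vars` on `expvar_port`; that path is an assumption here, not something confirmed in this thread. `fetch_expvar` is a hypothetical helper name, and 6062 is only an example port.

```python
import json
import urllib.request


def fetch_expvar(port: int, host: str = "127.0.0.1", timeout: float = 2.0) -> dict:
    """Fetch and parse the JSON document served at /debug/vars (assumed path)."""
    url = f"http://{host}:{port}/debug/vars"
    # The endpoint is local, so bypass any HTTP proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    try:
        data = fetch_expvar(6062)  # example value for expvar_port
        print("expvar keys:", sorted(data))
    except OSError as exc:  # URLError and connection errors are OSError subclasses
        print("expvar endpoint not reachable:", exc)
```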

We are aware that there is a chance that port 5000 is used by other processes on some hosts, so we're considering changing the Agent's expvar port to a different value (less likely to conflict with other processes) in a future version of the Agent. We'll keep this card updated when that happens.

colinmollenhour commented 4 years ago

Port 5000 also conflicts with GitLab omnibus container.

pablocompagni-contractorvp commented 3 years ago

I'm running into this issue with an ECS Fargate container. Does anyone know which env var changes the DD agent port from 5000 to something else? Thanks.

sgnn7 commented 3 years ago

@pablocompagni-contractorvp You can see it in one of the earlier posts but the setting is expvar_port in datadog.yaml. Alternatively, you may be able to use DD_EXPVAR_PORT env var to set this (though I did not test this).

archy-bold commented 3 years ago

Can confirm setting DD_EXPVAR_PORT works for me.

MGough commented 2 years ago

I encountered this issue and it took a while to work out what was going on. I was following this documentation: https://docs.datadoghq.com/integrations/ecs_fargate/?tab=fluentbitandfirelens

The issues were:

Mentioning port 5000 in those docs would've saved me a lot of pain!

Setting DD_EXPVAR_PORT worked for me too, although my first attempt 5001 proved to be already in use.
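Since the first replacement port can itself be taken, a small sketch of picking a free port before setting DD_EXPVAR_PORT may help. `first_free_port` is a hypothetical helper name and the candidate list is arbitrary. The check is only a point-in-time test and has to run in the same network namespace as the Agent to be meaningful.

```python
import socket
from typing import Iterable, Optional


def first_free_port(candidates: Iterable[int], host: str = "127.0.0.1") -> Optional[int]:
    """Return the first candidate port that can currently be bound, or None."""
    for port in candidates:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                # Port is taken (or not permitted); try the next candidate.
                continue
            # Bind succeeded; the socket is closed again when the with block exits.
            return port
    return None


if __name__ == "__main__":
    port = first_free_port([5001, 5002, 6062])
    print("DD_EXPVAR_PORT candidate:", port)
```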

boomshadow commented 2 years ago

This was quite the unexpected thing to wake up to this morning. We did a GitLab scheduled maintenance reboot overnight, and then woke up to all our CI pipelines failing. It was a mad scramble for a few hours to figure out that our DD agent was port binding before the Docker registry :(

Thanks for the information here! I was able to get this resolved with your help. I do hope that Datadog considers changing the default port to something far less common.

A handy trick is to set the expvar_port to 0 (zero). The kernel will give it a random, guaranteed-to-be unassigned port.
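To illustrate the general mechanism behind port 0 (this is plain socket behavior, not Agent code): binding to port 0 asks the operating system to choose an unused port, which can then be read back from the socket. `bind_ephemeral` is a hypothetical helper name.

```python
import socket
from typing import Tuple


def bind_ephemeral(host: str = "127.0.0.1") -> Tuple[socket.socket, int]:
    """Bind a listening TCP socket to an OS-chosen port and return both."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))  # port 0 means "let the OS pick a free port"
    sock.listen(1)
    return sock, sock.getsockname()[1]


if __name__ == "__main__":
    sock, port = bind_ephemeral()
    print("OS assigned port:", port)
    sock.close()
```

The trade-off is that the port can differ on every start, so anything that needs to connect to it has to discover the current value first.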

htims1989 commented 2 years ago

Just ran into this while following: https://www.datadoghq.com/blog/deploy-dotnet-core-aws-fargate/

Especially awkward since ASP.NET Core containers default to port 5000!

willogden commented 2 years ago

Also ran into this. For those also needing port 5001, this is used by the agent CLI service and can be changed using the DD_CMD_PORT environment variable.

baarney commented 11 months ago

Also conflicts with the default nginx config on AWS Elastic Beanstalk.

By default, Elastic Beanstalk configures the proxy to forward requests to your application on port 5000.

https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/nodejs-platform-proxy.html

sgnn7 commented 11 months ago

Hi @baarney (and others in the thread), as mentioned in our older comment, we are aware that there is a chance that port 5000 is used by other processes on some hosts, and we are still considering changing the Agent's default expvar port to a different value in a future version of the Agent. For context, one of the bigger concerns with changing that value is ensuring that users of the expvar/go_expvar functionality do not get unexpected changes that break their currently working workflows.

rasatlas commented 8 months ago

On Ubuntu 20.04, go to the /etc/datadog-agent/datadog.yaml file, uncomment the line containing expvar_port, and change its value to a port you don't use.

expvar_port: 6062

Stop the datadog-agent service before making the change, and restart it once the value has been changed.

This has solved my issue.