DataDog / dd-trace-js

JavaScript APM Tracer
https://docs.datadoghq.com/tracing/
Other
645 stars 305 forks source link

Add support for Unix Domain Sockets to DogStatsD client #993

Open maxenglander opened 4 years ago

maxenglander commented 4 years ago

The DogStatsD client does not appear to support Unix Domain Sockets.

It would be great if this client could be extended to support Unix Domain Sockets.

Alternately, it would be good if dd-trace-js allowed the user to supply a custom DogStatsD client. Applications may use both dd-trace-js alongside a DogStatsD client such as hot-shots. Rather than have to configure hot-shots to use UDS, and to also configure dd-trace-js internal DogStatsD client to use UDS, it would be nice if we could just do:

DogStatsD = require('hot-shots');
dogstatsd = new DogStatsD({ protocol: 'uds' });
tracer = require('dd-trace-js').init({ dogstatsd });
rochdev commented 4 years ago

DogStatsD is an implementation detail of the tracer, so unfortunately it's not really possible to accept a custom client. This is especially true since we plan to switch from DogStatsD to using the existing endpoint on the agent. The good news is that this will not only improve the onboarding experience but it will also mean that UDS will just work.

maxenglander commented 4 years ago

Thanks @rochdev! If there are plans to switch from DogStatsD to the existing agent endpoint, should this ticket just be closed? Or is there any work that could be done with this ticket in the mean time?

maxenglander commented 4 years ago

At a minimum, it might be good to allow for dogstatsd.host to be an API option for init(), so that users can take advantage of the socat workaround suggested in DD docs.

rochdev commented 4 years ago

If there are plans to switch from DogStatsD to the existing agent endpoint, should this ticket just be closed?

I think it should stay open since the issue is not solved.

At a minimum, it might be good to allow for dogstatsd.host to be an API option for init(), so that users can take advantage of the socat workaround suggested in DD docs.

That makes sense. Since there is already an option for the port, adding an option for the host as well would definitely work as a workaround until we get to the endpoint change.

maxenglander commented 4 years ago

Thanks @rochdev I'll submit a PR to make specifying DogStatsD host and option.

Do you have a rough idea for timeline for being able to send APM metrics through trace endpoint? Is that something that would/could be handled just with changes to dd-trace-js and also changes to the Golang datadog-agent, or does it also require changes to DataDog backend components?

rochdev commented 4 years ago

Do you have a rough idea for timeline for being able to send APM metrics through trace endpoint?

No timeline right now unfortunately.

Is that something that would/could be handled just with changes to dd-trace-js and also changes to the Golang datadog-agent, or does it also require changes to DataDog backend components?

This should only require changes in the tracer and the agent. We have already started experimenting in branches but nothing production-ready yet.

ghost commented 4 years ago

I see that there is platform folder with nodejs support for metrics. Would be nice if it was exposed so we can send out custom gauge-s to DD

rochdev commented 4 years ago

@dusan-dragon We implemented custom support internally to have more control on our specific use case for runtime metrics, but for any other custom metrics you can use any DogStatsD library such as hot-shots.

ghost commented 2 years ago

I think consumer lag for kafka is pretty standard metric. Currently you have plugin for kafkajs but you do not support consumer lag metric

janeklb commented 1 year ago

Hi @rochdev - could you please provide an update on this ticket (ie. will any of the current efforts you're working on solve this?).

janeklb commented 1 year ago

At a minimum, it might be good to allow for dogstatsd.host to be an API option for init(), so that users can take advantage of the socat workaround suggested in DD docs.

@maxenglander out of curiosity, how are you running socat / how is it helping you?

rochdev commented 1 year ago

@janeklb If you configure dd-trace to use UDS, it will now also use it for runtime metrics as long as you have a recent enough version of both the library and the agent.

janeklb commented 1 year ago

if you configure dd-trace to use UDS,

I've got dd-trace configured to use UDS for tracing (using DD_TRACE_AGENT_URL=unix:///var/run/datadog/apm.socket), but when I use that I don't get runtime metrics reported. Should they be?. Is there something else that needs to be set? Afaiu statsd datagrams should be sent to the dsd.socket (not apm.socket), but I don't see how that could be configured. The dogstatsd client https://github.com/DataDog/dd-trace-js/blob/master/packages/dd-trace/src/dogstatsd.js#L13-L31 doesn't seem to have any references to UDS.

In this discussion thread https://github.com/DataDog/dd-trace-js/discussions/2470#discussioncomment-3973852 you mentioned

This change allows us to support additional features like reporting metrics over UDS.

But reading through the PR i don't see any mention of UDS either

FWIW the agent is running as a daemonset in an EKS cluster, using the image gcr.io/datadoghq/agent:7-jmx

rochdev commented 1 year ago

but when I use that I don't get runtime metrics reported

Was it working before without UDS? Or is this the first time you setup the feature? It's still necessary to follow the steps in the setup docs for both the library and the agent before runtime metrics can be sent.

But reading through the PR i don't see any mention of UDS either

The PR adds support for reporting runtime metrics through HTTP with the existing HTTP client in the library, which already supports sending HTTP through UDS, so it doesn't explicitly add support for UDS.

FWIW the agent is running as a daemonset in an EKS cluster, using the image gcr.io/datadoghq/agent:7-jmx

Can you confirm the exact version that is installed? It should be at a minimum 7.40.

janeklb commented 1 year ago

Was it working before without UDS?

Yes, it was working fine before (we had DD_AGENT_HOST configured correctly, and the agent was listening on 8125/udp and 8126)

The PR adds support for reporting runtime metrics through HTTP with the existing HTTP client in the library, which already supports sending HTTP through UDS, so it doesn't explicitly add support for UDS.

thanks for clarifying

Can you confirm the exact version that is installed? It should be at a minimum 7.40.

I can't say for sure -- the image hash is 8d4cf9a5bd7e1520170904081279eb5a2bf1bfd1caabb04426e3017c06e45461 - docker hub lists it as 14 days old.. so probably

janeklb commented 1 year ago

It may be worth noting that we have hot-shots working using protocol: "uds" (though we are occasionally seeing the "congestion" error).

Also, the agent's statsd container is reporting:

datadog-q8mmp agent 2022-11-24 00:03:48 UTC | CORE | WARN | (pkg/dogstatsd/listeners/uds_common.go:211 in Listen) | dogstatsd-uds: error processing origin, data will not be tagged : matched PID for the process is 0, it belongs probably to another namespace. Is the agent in host PID mode?

so we may need to set hostPID: true (as per this guide), but that seems like a different issue

rochdev commented 1 year ago

@janeklb Looks like there was a bug in the agent which should be fixed in the next release. I believe it should also be in the latest release candidate, which as of now is 7.41.0-rc.5

janeklb commented 1 year ago

Thanks @rochdev - does that mean my configuration above is sufficient?

rochdev commented 1 year ago

@janeklb Since you said it was working before with UDP, then yes the only change that should be needed (other than the URL pointing to UDS instead of HTTP) is upgrading the agent.

janeklb commented 1 year ago

@janeklb Since you said it was working before with UDP, then yes the only change that should be needed (other than the URL pointing to UDS instead of HTTP) is upgrading the agent.

Thanks - I'll give it a try when the agent is updated