open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0
2.92k stars 2.28k forks source link

[receiver/statsd] add attribute for source IP #15290

Closed seankhliao closed 1 year ago

seankhliao commented 1 year ago

Is your feature request related to a problem? Please describe.

We want to enrich the metrics data we receive over the statsd protocol via the k8sattributes processor. This requires having something to identify the source of metrics, either IP address, name, or uid.

Using the source IP address seems ideal as it doesn't require cooperation from the process producing the metrics.

Describe the solution you'd like

An attribute containing the source IP of the packet each metric came from.

Describe alternatives you've considered

Getting producers to include self identifying data: not always easy/possible Run collector as sidecar: resource expensive and hard to coordinate well (eg k8s Jobs)

Additional context

No response

github-actions[bot] commented 1 year ago

Pinging code owners: @jmacd @dmitryax. See Adding Labels via Comments if you do not have permissions to add labels yourself.

seankhliao commented 1 year ago

Thinking about doing this, do I need to make the attribute the peer address configurable? net.sock.peer.addr seems to be the semconv name for the address, but there's the risk of overwriting an existing attribute.

dmitryax commented 1 year ago

@seankhliao Maybe instead of introducing another attribute, we can follow the same approach as done for tracing where the source ip is passed in a client request context. Please take a look into that

seankhliao commented 1 year ago

Once parsed, the statsd receiver mixes everything into a few maps for aggregation. The metrics are only passed on the next consumer on flush intervals disassociated with the incoming requests. I don't really see a place to store the connection context, or a way to group the metrics by context before passing to the next consumer.

dmitryax commented 1 year ago

Ok makes sense, it probably should be set as a resource attribute in that case, maybe configured in settings. Not sure about using net.sock.peer.addr since it's a span attribute. Should it be a resource attribute? Does statsd metrics reporter send metrics about itself or not necessary?

seankhliao commented 1 year ago

The statsd reporters I've seen are usually unaware of their own identity. Resource attribute would make sense except that I don't think there's a place to store resource attributes? unless we add a new member to https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/statsdreceiver/protocol/statsd_parser.go#L119 Per metric attribute will probably work if combined with the groupbyattribute processor. Didn't notice net.sock.peer.addr is only for spans, a configurable key sounds like the best option then.

jsirianni commented 1 year ago

I would love to have this. Right now I am trying to combine statsd with Kubernetes Attributes Processor, which can use a pod ip address to lookup additional metadata, like pod name and such. This would be useful when using statsd in a kubernetes environment.

jsirianni commented 1 year ago

Additionally, it would be nice if the resource key was configurable. the K8s attributes processor can look for k8s.pod.ip, which would mean renaming net.sock.peer.addr (or whatever ends up being implemented. Not a big deal.

github-actions[bot] commented 1 year ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

seankhliao commented 1 year ago

I had a go at implementing it with a configurable key in #18276

Though it uses an entire address (ip:port for UDP), so users might need to run another processor to split out a plain IP if downstream (eg k8sattributesprocessor) needs that.

jmacd commented 1 year ago

https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/18276#issuecomment-1467111166

github-actions[bot] commented 1 year ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

codeboten commented 1 year ago

This was addressed by https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/18276