grafana / alloy

OpenTelemetry Collector distribution with programmable pipelines
https://grafana.com/oss/alloy
Apache License 2.0

Error: "missing pod namespace label" when using discovery.kubernetes.node #431

Open SamsonChungStackAdapt opened 1 year ago

SamsonChungStackAdapt commented 1 year ago

What's wrong?

When I use the discovery.kubernetes component with role = "node" as the target source for loki.source.kubernetes, I get the following error:

err="missing pod namespace label"

Seems like it is calling: https://github.com/grafana/agent/blob/5ef22ad8b83cec324ebc5a607d6c0b93fac54a67/component/loki/source/kubernetes/kubernetes.go#L179 and failing with this error message: https://github.com/grafana/agent/blob/5ef22ad8b83cec324ebc5a607d6c0b93fac54a67/component/loki/source/kubernetes/kubetail/target.go#L190

Steps to reproduce

Use the discovery.kubernetes component with role = "node" and pass its targets to loki.source.kubernetes.

System information

No response

Software version

No response

Configuration

discovery.kubernetes "nodes" {
  role = "node"
}

discovery.relabel "node_relabel" {
  targets = discovery.kubernetes.nodes.targets

  rule {
    action        = "replace"
    source_labels = ["__meta_kubernetes_node_name"]
    target_label  = "k8s_node_name"
  }
}

loki.source.kubernetes "nodes" {
  targets    = discovery.relabel.node_relabel.output
  forward_to = [loki.write.remote.receiver]
}

Logs

ts=2023-06-08T15:09:02.09288915Z component=loki.source.kubernetes.nodes level=error msg="failed to process input target" target="{__address__=
...
}" err="missing pod namespace label"

rfratto commented 1 year ago

I think this is a documentation problem. loki.source.kubernetes is intended to be used to collect Pod logs, not Node logs; it's not currently supported to chain role = "node" to it.
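
For comparison, a minimal sketch of the supported pod-based setup, reusing the component names from the configuration above. It assumes loki.write.remote is defined elsewhere, as in the original config; the "pods" labels are arbitrary.

```river
// Discover pods rather than nodes, so every target carries the
// pod namespace and name labels that loki.source.kubernetes expects.
discovery.kubernetes "pods" {
  role = "pod"
}

// Tail container logs for the discovered pods through the Kubernetes API.
loki.source.kubernetes "pods" {
  targets    = discovery.kubernetes.pods.targets
  forward_to = [loki.write.remote.receiver]
}
```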

SamsonChungStackAdapt commented 1 year ago

Oh is this page not accurate? https://grafana.com/docs/agent/latest/flow/reference/components/discovery.kubernetes/

How can I collect information on nodes, services, etc?

rfratto commented 1 year ago

role = "node" is primarily combined with discovery.relabel to collect metrics from nodes, such as cAdvisor metrics. I have an example config which does exactly that.
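
As a rough illustration of that pattern (a sketch only, not necessarily the config referred to above), node targets can be relabeled so that cAdvisor metrics are scraped through the API server proxy. The component labels, the prometheus.remote_write.default receiver, and the in-cluster service account paths are assumptions for illustration.

```river
discovery.kubernetes "nodes" {
  role = "node"
}

discovery.relabel "cadvisor" {
  targets = discovery.kubernetes.nodes.targets

  // Send every scrape to the API server instead of the kubelet directly.
  rule {
    action       = "replace"
    target_label = "__address__"
    replacement  = "kubernetes.default.svc:443"
  }

  // Build the per-node cAdvisor proxy path from the node name.
  rule {
    action        = "replace"
    source_labels = ["__meta_kubernetes_node_name"]
    regex         = "(.+)"
    target_label  = "__metrics_path__"
    replacement   = "/api/v1/nodes/$1/proxy/metrics/cadvisor"
  }
}

prometheus.scrape "cadvisor" {
  targets    = discovery.relabel.cadvisor.output
  // Hypothetical receiver; assumes a prometheus.remote_write "default" component exists.
  forward_to = [prometheus.remote_write.default.receiver]
  scheme     = "https"

  // Assumes the collector runs in-cluster with a mounted service account.
  bearer_token_file = "/var/run/secrets/kubernetes.io/serviceaccount/token"

  tls_config {
    ca_file = "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"
  }
}
```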

It's a similar situation with role = "service"; non-pod roles are typically more useful for metric collection.

But the documentation for loki.source.kubernetes should be updated to make it clearer that it's only for pods, and can't collect logs from nodes.

SamsonChungStackAdapt commented 1 year ago

I see, thanks for clarifying and providing an example!

Eve832 commented 1 year ago

Not clear who should make these edits. @clayton-cornell and @BeverlyJaneJ can you please decide and triage?