elastic / apm

Elastic Application Performance Monitoring - resources and general issue tracking for Elastic APM.
https://www.elastic.co/apm
Apache License 2.0

Enhance our k8s pod detection #881

Open Mpdreamz opened 1 month ago

Mpdreamz commented 1 month ago

Today, agents support sending k8s data if the user explicitly configures pods to be injected with KUBERNETES_NAMESPACE, KUBERNETES_POD_NAME, KUBERNETES_POD_UID, and KUBERNETES_NODE_NAME: https://www.elastic.co/guide/en/observability/current/apm-api-metadata.html#apm-api-kubernetes-data and https://github.com/elastic/apm/issues/21#issue-383877963

Whilst discussing whether we can improve our host.name detection here, I noticed that the following environment variables should be exposed to all pods natively:

KUBERNETES_SERVICE_HOST, KUBERNETES_SERVICE_PORT

@trentm also pointed out that kubectl itself relies on these to detect that it is running inside a pod: https://kubernetes.io/docs/reference/kubectl/#in-cluster-authentication-and-namespace-overrides

The suggestion here is for agents to start reporting these, and to extend the APM intake protocol so that this information is actually ingested into Elasticsearch.

Lastly, we currently set host.name to the kubernetes.host_name that we detect from KUBERNETES_NODE_NAME: https://github.com/elastic/apm-data/blob/main/model/modelprocessor/hostname.go#L39

However, we don't record the inverse case: we are running under k8s, but KUBERNETES_NODE_NAME was not explicitly configured. I believe we need to record this flag in Elasticsearch so that the Hosts View can filter this data out correctly to fix: https://github.com/elastic/observability-dev/issues/3321.

Sending and recording KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT would allow us to detect and record that flag.
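
A minimal sketch of what the agent-side check could look like, assuming the only signal is the presence of the two in-cluster variables named above. The function names `running_in_pod` and `node_name_missing` are hypothetical and not part of any existing agent API.

```python
import os
from typing import Mapping, Optional

# Variables that are expected to be present in every pod's containers.
IN_CLUSTER_VARS = ("KUBERNETES_SERVICE_HOST", "KUBERNETES_SERVICE_PORT")


def running_in_pod(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when both in-cluster variables are set and non-empty."""
    env = os.environ if env is None else env
    return all(env.get(name) for name in IN_CLUSTER_VARS)


def node_name_missing(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True for the 'inverse' case: in a pod, but
    KUBERNETES_NODE_NAME was not explicitly configured."""
    env = os.environ if env is None else env
    return running_in_pod(env) and not env.get("KUBERNETES_NODE_NAME")
```

Calling `node_name_missing()` with no argument reads the real process environment; passing a plain dict makes the check easy to exercise in tests.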

xrmx commented 1 month ago

> Today, agents support sending k8s data if the user explicitly configures pods to be injected with KUBERNETES_NAMESPACE, KUBERNETES_POD_NAME, KUBERNETES_POD_UID, and KUBERNETES_NODE_NAME: https://www.elastic.co/guide/en/observability/current/apm-api-metadata.html#apm-api-kubernetes-data and #21 (comment)

> Whilst discussing whether we can improve our host.name detection here, I noticed that the following environment variables should be exposed to all pods natively:

> KUBERNETES_SERVICE_HOST, KUBERNETES_SERVICE_PORT

> @trentm also pointed out that kubectl itself relies on these to detect that it is running inside a pod: https://kubernetes.io/docs/reference/kubectl/#in-cluster-authentication-and-namespace-overrides

> The suggestion here is for agents to start reporting these, and to extend the APM intake protocol so that this information is actually ingested into Elasticsearch.

Do we care about the values of KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT enough that we need to store them? Can we do anything with them other than assume we are running in a k8s pod?

> Lastly, we currently set host.name to the kubernetes.host_name that we detect from KUBERNETES_NODE_NAME: https://github.com/elastic/apm-data/blob/main/model/modelprocessor/hostname.go#L39

> However, we don't record the inverse case: we are running under k8s, but KUBERNETES_NODE_NAME was not explicitly configured. I believe we need to record this flag in Elasticsearch so that the Hosts View can filter this data out correctly to fix: elastic/observability-dev#3321.

> Sending and recording KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT would allow us to detect and record that flag.

Do you want to detect that flag on the apm agent or on the apm server side?

Mpdreamz commented 1 month ago

> Do we care about the values of KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT enough that we need to store them? Can we do anything with them other than assume we are running in a k8s pod?

I believe so, KUBERNETES_SERVICE_HOST at least, since it's the IP of the host and might be used to correlate based on IP?

> Do you want to detect that flag on the apm agent or on the apm server side?

This flag can be computed centrally in apm-data and stored in Elasticsearch so it may be queried later by the Hosts UI.

xrmx commented 1 month ago

> > Do we care about the values of KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT enough that we need to store them? Can we do anything with them other than assume we are running in a k8s pod?

> I believe so, KUBERNETES_SERVICE_HOST at least, since it's the IP of the host and might be used to correlate based on IP?

What do we mean by host? This looks like an internal IP managed by Kubernetes that points at the Kubernetes API rather than at the host machine. Is that ok?

Mpdreamz commented 1 month ago

Ahh I stand corrected, you are right!

In which case it makes sense that we emit only kubernetes.pod_detected: true to apm-server.
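
To make that concrete, here is a rough sketch of how an agent might build the kubernetes section of its metadata under this proposal. The nested layout follows my understanding of the existing intake metadata (namespace, pod.name, pod.uid, node.name) and should be treated as an assumption; `pod_detected` is the new, not yet specified field, and `kubernetes_metadata` is a hypothetical helper name. The values of KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT are deliberately not sent.

```python
from typing import Any, Dict, Mapping


def kubernetes_metadata(env: Mapping[str, str]) -> Dict[str, Any]:
    """Build the kubernetes metadata section from environment variables."""
    k8s: Dict[str, Any] = {}

    # Explicitly configured values, sent only when present.
    if env.get("KUBERNETES_NAMESPACE"):
        k8s["namespace"] = env["KUBERNETES_NAMESPACE"]
    pod = {
        key: env[var]
        for key, var in (("name", "KUBERNETES_POD_NAME"), ("uid", "KUBERNETES_POD_UID"))
        if env.get(var)
    }
    if pod:
        k8s["pod"] = pod
    if env.get("KUBERNETES_NODE_NAME"):
        k8s["node"] = {"name": env["KUBERNETES_NODE_NAME"]}

    # Proposed flag: record only that we are in a pod, not the API address.
    if env.get("KUBERNETES_SERVICE_HOST") and env.get("KUBERNETES_SERVICE_PORT"):
        k8s["pod_detected"] = True

    return k8s
```

With this shape, apm-server could treat "pod_detected is true and node.name is absent" as the case the Hosts View needs to filter out.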