open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0
2.87k stars, 2.24k forks

[processor/k8sattributes] Facility to add a list of service names (delimiter-separated) in the K8s attributes processor for all signals (logs, traces, metrics) #32729

Open developer1622 opened 4 months ago

developer1622 commented 4 months ago

Component(s)

processor/k8sattributes

Is your feature request related to a problem? Please describe.

I am unsure if this is a valid request, if there might already be a solution, or if it does not make sense. Please bear with me.

I know that the metadata below is added to signals by the k8s attributes processor when it is enabled in the config.yaml of the OTel Collector.

 -> k8s.pod.uid: Str(68e1d4ec-75a3-47f6-93cf-d20496761b4c)
 -> k8s.pod.name: Str(coredns-864597b5fd-5v8d6)
 -> k8s.node.name: Str(single-node-microk8s-cluster)
 -> k8s.namespace.name: Str(kube-system)
 -> container.id: Str(f47bb4e5903c306f19aeab011d4d2324a208f5d9b758c4d442c0e9a9b04b3cf2)
 -> k8s.container.name: Str(coredns)
 -> container.image.name: Str(docker.io/coredns/coredns)
 -> container.image.tag: Str(1.10.1)
 -> k8s.deployment.name: Str(coredns)
 -> k8s.cluster.uid: Str(65f7ed90-e3ad-4f6f-9e0f-fe4b4a4a23da)
 -> k8s.cluster.name: Str(chat-gpt)
 -> k8s.pod.start_time: Str(2024-03-27T11:55:20Z)

Can we add the list of K8s service names to all Signals of OpenTelemetry?

So, basically, something like the following:

 **-> k8s.service.names: Str(service-name1,service-name2)**

 Thank you.

Describe the solution you'd like

I am not sure of the solution; this is just off the top of my head:

Here, https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/4700e4b262fed6676a6af6191ee3765e64c3c3b1/processor/k8sattributesprocessor/internal/kube/client.go#L514

If k8s.service.name is enabled:

Fetch all the K8s Services in the Pod's namespace and iterate over them to find those whose selectors match the Pod's labels.

Finally, add all matching Services as a single comma-separated value, e.g. tags["k8s.service.name"] = "tags/map,value,part"

Thank you.

Describe alternatives you've considered

NA

Additional context

For correlation, we might also need context from service names; that's why this proposal.

A way to add the service names in all OTel signals(logs, traces, metrics). Thank you.

When I use the K8s attributes processor, I want to have k8s.service.names in all my logs, traces, and metrics. Thank you for reading.

github-actions[bot] commented 4 months ago

Pinging code owners:

%s See Adding Labels via Comments if you do not have permissions to add labels yourself.

TylerHelmuth commented 3 months ago

I don't think there is any reason we couldn't add service names to telemetry, but we should flesh out which service names should be added. Given data from a specific pod, should it add all services the pod uses?

developer1622 commented 3 months ago

@TylerHelmuth Thank you for your response

Here is my understanding; I might be wrong or inaccurate.

From the Pod spec alone, we cannot know which Services handle this Pod, right?

We will have to iterate over all the Services in the namespace where the Pod is running, select the Services whose selectors match the Pod's labels, and join their names into a comma-separated value.


listOfServicesForPod := []
listOfSvcs := FetchAllServices(ctx, currentPodNamespace)
for each service in listOfSvcs:
    if the service's selector is matched by (is a subset of) the current Pod's labels:
        append service.name to listOfServicesForPod

At the end:
k8s.service.name = join(listOfServicesForPod, ",")

Thank you.

Or is there any other way with cache.SharedInformer of K8s API?
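As a rough illustration of the watch-based alternative: an informer delivers add, update, and delete notifications, and the processor would keep a local cache that those notifications maintain, so no List call is needed per Pod. The sketch below models only that cache side with the standard library. The `Service` and `ServiceCache` types and their methods are hypothetical stand-ins, not the processor's or client-go's API; in a real implementation the three methods would be called from the informer's event handlers.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Service is a hypothetical, trimmed-down stand-in for a Kubernetes Service.
type Service struct {
	Name      string
	Namespace string
	Selector  map[string]string
}

// ServiceCache is a hypothetical in-memory index of Services by namespace.
// Informer event handlers would call Upsert on add/update and Delete on delete.
type ServiceCache struct {
	mu          sync.RWMutex
	byNamespace map[string]map[string]Service
}

// NewServiceCache returns an empty cache.
func NewServiceCache() *ServiceCache {
	return &ServiceCache{byNamespace: make(map[string]map[string]Service)}
}

// Upsert stores or replaces a Service, keyed by namespace and name.
func (c *ServiceCache) Upsert(svc Service) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.byNamespace[svc.Namespace] == nil {
		c.byNamespace[svc.Namespace] = make(map[string]Service)
	}
	c.byNamespace[svc.Namespace][svc.Name] = svc
}

// Delete removes a Service; deleting an unknown Service is a no-op.
func (c *ServiceCache) Delete(namespace, name string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.byNamespace[namespace], name)
	if len(c.byNamespace[namespace]) == 0 {
		delete(c.byNamespace, namespace)
	}
}

// InNamespace returns the cached Services of one namespace, sorted by name.
func (c *ServiceCache) InNamespace(namespace string) []Service {
	c.mu.RLock()
	defer c.mu.RUnlock()
	out := make([]Service, 0, len(c.byNamespace[namespace]))
	for _, svc := range c.byNamespace[namespace] {
		out = append(out, svc)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

func main() {
	cache := NewServiceCache()
	cache.Upsert(Service{Name: "web", Namespace: "shop", Selector: map[string]string{"app": "web"}})
	cache.Upsert(Service{Name: "api", Namespace: "shop", Selector: map[string]string{"app": "api"}})
	cache.Delete("shop", "web")
	for _, svc := range cache.InNamespace("shop") {
		fmt.Println(svc.Namespace + "/" + svc.Name)
	}
}
```

With a cache like this, the per-Pod lookup only reads local memory; the trade-off is that the processor holds every Service of the watched namespaces in memory.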

TylerHelmuth commented 3 months ago

Services from the namespace on which the Pod is running

Is a Service limited to only selecting pods in its namespace?

developer1622 commented 3 months ago

I am not entirely sure, but as far as I know, a Service can only select Pods (via its selector labels) from its own namespace. Thank you.

jinja2 commented 3 months ago

If the service has selectors set, then, yes, the endpoint controller will only pick pods in the same namespace whose labels match the service's selectors. But k8s services can also be created without selectors, e.g. the default/kubernetes service, or by manually creating an endpoint object whose target is in another namespace. For these types of services, we'll have to check whether the pod IP shows up in any of the endpoints/endpointslices and then get the service for those endpoints. This could increase memory utilization, since endpoints contain the address/port of the targets/pods backing the service.
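To make the endpoints-based lookup concrete, here is a minimal sketch of a reverse index from pod IP to the services backing it. The `Endpoints` type and `buildIPIndex` function are hypothetical simplifications for illustration, not Kubernetes API types; a real implementation would populate the index from watched Endpoints or EndpointSlice objects.

```go
package main

import (
	"fmt"
	"sort"
)

// Endpoints is a hypothetical simplification of a Kubernetes Endpoints or
// EndpointSlice object: the owning service plus the IPs backing it.
type Endpoints struct {
	ServiceName string
	Namespace   string
	Addresses   []string
}

// buildIPIndex maps each backing IP to the sorted "namespace/name" keys of
// the services whose endpoints list that IP.
func buildIPIndex(eps []Endpoints) map[string][]string {
	index := make(map[string][]string)
	for _, ep := range eps {
		for _, ip := range ep.Addresses {
			index[ip] = append(index[ip], ep.Namespace+"/"+ep.ServiceName)
		}
	}
	// Sort each list in place so lookups return a stable order.
	for ip := range index {
		sort.Strings(index[ip])
	}
	return index
}

func main() {
	index := buildIPIndex([]Endpoints{
		{ServiceName: "web", Namespace: "shop", Addresses: []string{"10.1.0.5", "10.1.0.6"}},
		{ServiceName: "web-external", Namespace: "edge", Addresses: []string{"10.1.0.5"}},
	})
	// Look up the services for the IP of the pod that produced the telemetry.
	fmt.Println(index["10.1.0.5"])
	fmt.Println(index["10.1.0.6"])
}
```

The index holds one entry per backing address, which is where the extra memory cost comes from in clusters with many endpoints.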

To keep this simple, I think supporting standard services with selectors set should meet most users' requirements. The implementation for this can be, as suggested above, to list/watch service objects, and then check if incoming telemetry's pod has labels matching the selectors of services in the pod's ns.

developer1622 commented 3 months ago

Hi @TylerHelmuth and @jinja2, thank you very much for the response.

To keep this simple, I think supporting standard services with selectors set should meet most users' requirements. The implementation for this can be, as suggested above, to list/watch service objects, and then check if incoming telemetry's pod has labels matching the selectors of services in the pod's ns.

I have tried two ways to achieve this functionality: to add "k8s.service.name" and "k8s.service.uid".

1) First (listing all the time)

https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/df40aae48a3a3c5c7aefeaf9edc3533d463399e4/processor/k8sattributesprocessor/internal/kube/client.go#L521

At the end of func (c *WatchClient) extractPodAttributes(pod *api_v1.Pod) map[string]string, fetch the list of Services on the fly and iterate over them to find which Services handle this Pod, like below:

      // utility function
      // Check if the Pod's labels satisfy the Service's selector.
      // A Pod can have many labels; a Service that manages the Pod must have
      // every one of its selector entries present in those labels.
      func labelsMatchSelector(labels map[string]string, selector map[string]string) bool {
          // A Service without a selector does not select Pods by label.
          if len(selector) == 0 {
              return false
          }
          for key, value := range selector {
              if labels[key] != value {
                  return false
              }
          }
          return true
      }

    // Now, let's find which Services handle the above Pod.
    services, err := c.kc.CoreV1().Services(pod.Namespace).List(context.Background(), meta_v1.ListOptions{})
    if err != nil {
        c.logger.Sugar().Debugf("***** unable to list services from namespace: %s due to error: %s", pod.Namespace, err.Error())
    }

    // to add k8s.service.name
    var commaSeparatedListOfSvcs string // comma-separated list of Service names
    // Skip the loop when the List call failed, because services is nil in that case.
    if err == nil && (c.Rules.ServiceID || c.Rules.ServiceName) {
        for _, svc := range services.Items {
            // Check each Service's selector to see if it matches the Pod's labels
            if labelsMatchSelector(pod.Labels, svc.Spec.Selector) {
                if len(commaSeparatedListOfSvcs) > 0 {
                    commaSeparatedListOfSvcs += ","
                }
                commaSeparatedListOfSvcs += svc.Name
            }
        }
        tags["k8s.service.name"] = commaSeparatedListOfSvcs
    }
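For anyone who wants to try the matching logic without a cluster, here is a self-contained sketch of the same idea. The `Service` struct and the function names are hypothetical stand-ins for the client-go types used in the snippet above. Treating an empty selector as "matches nothing" and sorting the names for a stable attribute value are assumptions of this sketch, not behavior taken from the draft PR.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Service is a hypothetical, trimmed-down stand-in for a Kubernetes Service.
type Service struct {
	Name      string
	Namespace string
	Selector  map[string]string
}

// selectorMatches reports whether every key/value pair of selector is present
// in labels. An empty selector matches nothing (assumption of this sketch).
func selectorMatches(labels, selector map[string]string) bool {
	if len(selector) == 0 {
		return false
	}
	for key, want := range selector {
		if got, ok := labels[key]; !ok || got != want {
			return false
		}
	}
	return true
}

// serviceNamesForPod returns the sorted, comma-separated names of the Services
// in podNamespace whose selector matches podLabels.
func serviceNamesForPod(podNamespace string, podLabels map[string]string, services []Service) string {
	var names []string
	for _, svc := range services {
		if svc.Namespace == podNamespace && selectorMatches(podLabels, svc.Selector) {
			names = append(names, svc.Name)
		}
	}
	sort.Strings(names)
	return strings.Join(names, ",")
}

func main() {
	services := []Service{
		{Name: "web", Namespace: "shop", Selector: map[string]string{"app": "web"}},
		{Name: "web-canary", Namespace: "shop", Selector: map[string]string{"app": "web", "track": "canary"}},
		{Name: "external-db", Namespace: "shop"}, // no selector, never matches
		{Name: "web", Namespace: "other", Selector: map[string]string{"app": "web"}},
	}
	podLabels := map[string]string{"app": "web", "track": "canary", "pod-template-hash": "abc"}
	fmt.Println(serviceNamesForPod("shop", podLabels, services)) // prints "web,web-canary"
}
```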

2) Second (use a watch)

Please find the draft PR, which adds a field in the WatchClient object and modifies all the affected files.

https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/32862

Please have a look at this, and respond if possible. Thank you.

fatsheep9146 commented 3 months ago

@developer1622 I wonder, what do you want to do with this k8s.service.name info? I ask because, technically, a pod can be selected by more than one service, so k8s.service.name may be a slice of names, and it may be very long in some cases.

developer1622 commented 3 months ago

@developer1622 I wonder, what do you want to do with this k8s.service.name info? I ask because, technically, a pod can be selected by more than one service, so k8s.service.name may be a slice of names, and it may be very long in some cases.

Hi, @fatsheep9146. Thank you for the response; I need this info for correlation, to build computation around multiple attributes, including k8s.service.name.

Below is my understanding; if this is not possible, please say so and suggest an alternative. Thanks.

I did create a draft PR (the looping could be improved with goroutines). Could you please have a look at it? Is that the correct way? As the services form a slice (a Pod can be backed by multiple Services), we can emit a comma-separated service list. Thanks.

developer1622 commented 3 months ago

@fatsheep9146, @TylerHelmuth and @jinja2, if possible, could you please look at this and respond on the possibility of adding k8s.service.name and k8s.service.uid? Thanks.

TylerHelmuth commented 3 months ago

I am interested in the idea and interested in what the implementation would look like

developer1622 commented 3 months ago

I am interested in the idea and interested in what the implementation would look like

@TylerHelmuth, thank you very much for the response. I will work on it and create a PR from my side.

kpattaswamy commented 1 month ago

+1 to this feature, it would be great to see it integrated with the k8sattributes processor