open-telemetry / opentelemetry-go-instrumentation

OpenTelemetry Auto Instrumentation using eBPF
https://opentelemetry.io
Apache License 2.0

When trying Go auto instrumentation I got "process not found yet" #520

Open msherif1234 opened 8 months ago

msherif1234 commented 8 months ago

Describe the bug

I am not sure how to make my Go app visible to the instrumentation pod.

Environment

Running on an OpenShift (OCP) cluster.

To Reproduce

Steps to reproduce the behavior:

  1. Install cert-manager:

    kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.2/cert-manager.yaml

  2. Deploy the OpenTelemetry operator:

    kubectl apply -f https://github.com/open-telemetry/opentelemetry-operator/releases/latest/download/opentelemetry-operator.yaml

  3. Create the OpenTelemetry collector object:

    apiVersion: opentelemetry.io/v1alpha1
    kind: OpenTelemetryCollector
    metadata:
      name: demo
      namespace: default
    spec:
      config: |
        receivers:
          otlp:
            protocols:
              grpc:
              http:
        processors:
          memory_limiter:
            check_interval: 1s
            limit_percentage: 75
            spike_limit_percentage: 15
          batch:
            send_batch_size: 10000
            timeout: 10s

        exporters:
          # NOTE: Prior to v0.86.0 use logging instead of debug.
          debug:

        service:
          pipelines:
            traces:
              receivers: [otlp]
              processors: [memory_limiter, batch]
              exporters: [debug]
            metrics:
              receivers: [otlp]
              processors: [memory_limiter, batch]
              exporters: [debug]
            logs:
              receivers: [otlp]
              processors: [memory_limiter, batch]
              exporters: [debug]
      mode: daemonset

  4. Create the Instrumentation object:

    kubectl apply -f - <<EOF
    apiVersion: opentelemetry.io/v1alpha1
    kind: Instrumentation
    metadata:
      name: demo-instrumentation
    spec:
      exporter:
        endpoint: http://demo-collector:4317
      propagators:
        - tracecontext
        - baggage
      sampler:
        type: parentbased_traceidratio
        argument: "1"
    EOF

  5. Use PR https://github.com/netobserv/network-observability-operator/pull/500 to hack the netobserv operator and enable auto instrumentation. For now, OTEL_EXPORTER_OTLP_ENDPOINT has to be set manually to match the collector service IP. Then build the image with make image-build, push it with make image-push, and deploy the operator:

    USER=username VERSION="main-amd64" make deploy

  6. Create the netobserv flow collector:

    oc create -f config/samples/flows_v1beta2_flowcollector.yaml

  7. The netobserv agent pods should now be running with two containers, the new one being the instrumentation sidecar:

    oc get pods -n netobserv-privileged
    NAME                         READY   STATUS    RESTARTS   AGE
    netobserv-ebpf-agent-2msml   2/2     Running   0          24m
    netobserv-ebpf-agent-7grl5   2/2     Running   0          24m
    netobserv-ebpf-agent-8pgwj   2/2     Running   0          24m
    netobserv-ebpf-agent-n9s6q   2/2     Running   0          24m
    netobserv-ebpf-agent-trq4b   2/2     Running   0          24m
    netobserv-ebpf-agent-whqxs   2/2     Running   0          24m

Expected behavior

I expected the instrumentation container to find the app binary and start emitting some form of metrics, but I am getting:

{"level":"info","ts":1700489757.6685278,"logger":"Instrumentation.Analyzer","caller":"process/discover.go:73","msg":"process not found yet, trying again soon","exe_path":"/netobserv-ebpf-agent"}

Additional context

I followed the instructions at https://opentelemetry.io/docs/kubernetes/operator/automatic/

pellared commented 8 months ago

Can you double-check if the Go instrumentation and application containers share the process namespace?

Reference:

msherif1234 commented 8 months ago

This is what I see in the container logs:

{"level":"info","ts":1700514825.3263397,"logger":"Instrumentation.Controller","caller":"opentelemetry/controller.go:54","msg":"got event","attrs":[{"Key":"net.peer.port","Value":{"Type":"STRING","Value":"2055"}},{"Key":"rpc.system","Value":{"Type":"STRING","Value":"grpc"}},{"Key":"rpc.service","Value":{"Type":"STRING","Value":"/pbflow.Collector/Send"}},{"Key":"net.peer.name","Value":{"Type":"STRING","Value":"10.0.128.4"}}]}
2023/11/20 21:13:45 traces export: Post "https://localhost:4318/v1/traces": dial tcp [::1]:4318: connect: connection refused

where 10.0.128.4 is the pod IP.
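The failing request above goes to localhost:4318 over OTLP/HTTP, which suggests the exporter in the sidecar did not pick up a collector endpoint. The sketch below shows the endpoint precedence described in the OpenTelemetry OTLP exporter specification, assuming the sidecar's exporter follows it. The function name resolveTracesURL is hypothetical, and the fallback URL is copied from the log line above rather than from the specification. Note also that the Instrumentation object in step 4 points at port 4317, which is conventionally the OTLP/gRPC port, while 4318 is the OTLP/HTTP one.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// defaultTracesURL is the URL seen in the log when nothing is configured
// (an assumption taken from that log line, not from the specification).
const defaultTracesURL = "https://localhost:4318/v1/traces"

// resolveTracesURL applies the OTLP/HTTP precedence from the exporter spec:
//  1. OTEL_EXPORTER_OTLP_TRACES_ENDPOINT is used as-is.
//  2. OTEL_EXPORTER_OTLP_ENDPOINT is a base URL; "/v1/traces" is appended.
//  3. Otherwise the default is used.
//
// getenv is injected so the logic can be exercised without touching the
// real environment.
func resolveTracesURL(getenv func(string) string) string {
	if v := getenv("OTEL_EXPORTER_OTLP_TRACES_ENDPOINT"); v != "" {
		return v
	}
	if v := getenv("OTEL_EXPORTER_OTLP_ENDPOINT"); v != "" {
		return strings.TrimRight(v, "/") + "/v1/traces"
	}
	return defaultTracesURL
}

func main() {
	fmt.Println("traces would be exported to:", resolveTracesURL(os.Getenv))
}
```

Running this inside the instrumentation container (or with the same environment variables) shows which URL a spec-following exporter would use. If it prints the localhost default, the endpoint variable is not reaching the sidecar.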

lel-war commented 6 months ago

Hi everyone! I am having the same issue when instrumenting Go using the operator:

{"level":"info","ts":1705690630.9377563,"logger":"Instrumentation.Analyzer","caller":"process/discover.go:73","msg":"process not found yet, trying again soon","exe_path":"/app"}

I am using the following autoinstrumentation library:

ghcr.io/open-telemetry/opentelemetry-go-instrumentation/autoinstrumentation-go:v0.10.1-alpha

I can confirm the pods have the config:

shareProcessNamespace: true

I can also confirm that the container gets injected with the following attribute:

securityContext:
  privileged: true
  runAsUser: 0

Am I missing something? Thanks in advance!

RonFed commented 6 months ago

@lel-war Are you using OTEL_GO_AUTO_TARGET_EXE or instrumentation.opentelemetry.io/otel-go-auto-target-exe? Is the full path of your Go executable /app (as passed to the instrumentation, according to the log you attached)?

lel-war commented 6 months ago

Hi @RonFed thanks for the quick response. To answer your question I am using the following annotation:

instrumentation.opentelemetry.io/otel-go-auto-target-exe: /app

The value "/app" is just a stand-in for the real application path; in reality it looks more like /home/user/app. So to answer your question, yes!

Morsicus commented 3 weeks ago

Hello there!

Out of curiosity, did you find any solution?

I'm having the same issue.

I created a debug/ephemeral container in order to verify the path of the executable and it seems to be the correct one.

Am I missing something? Do you have any idea?
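One way to check what the instrumentation container can actually see is to walk /proc from inside it and compare each process's exe link with the configured target. The sketch below assumes a Linux /proc layout and that an exact match on the /proc/<pid>/exe link is the relevant test; it is a diagnostic aid, not the discovery code from process/discover.go. The names findPIDs and readlink are hypothetical, and reading the target from OTEL_GO_AUTO_TARGET_EXE when no argument is given is a convenience of this sketch.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

// findPIDs returns the PIDs whose executable link equals target.
// names are the entry names of a proc-style directory; readlink resolves
// the exe link for one PID. Both are injected so the matching logic can be
// exercised without a real /proc.
func findPIDs(names []string, readlink func(pid int) (string, error), target string) []int {
	var pids []int
	for _, name := range names {
		pid, err := strconv.Atoi(name)
		if err != nil {
			continue // not a process directory (for example "self")
		}
		exe, err := readlink(pid)
		if err != nil {
			continue // process exited or the link is not readable
		}
		if exe == target {
			pids = append(pids, pid)
		}
	}
	return pids
}

func main() {
	// Target path: first argument, else the OTEL_GO_AUTO_TARGET_EXE variable.
	target := os.Getenv("OTEL_GO_AUTO_TARGET_EXE")
	if len(os.Args) > 1 {
		target = os.Args[1]
	}
	if target == "" {
		fmt.Println("no target path given")
		return
	}

	entries, err := os.ReadDir("/proc")
	if err != nil {
		fmt.Println("cannot read /proc:", err)
		return
	}
	names := make([]string, 0, len(entries))
	for _, e := range entries {
		names = append(names, e.Name())
	}

	pids := findPIDs(names, func(pid int) (string, error) {
		return os.Readlink(filepath.Join("/proc", strconv.Itoa(pid), "exe"))
	}, target)

	if len(pids) == 0 {
		fmt.Println("no visible process has exe path", target)
		return
	}
	fmt.Println("matching PIDs:", pids)
}
```

If the result is empty when run in the sidecar but not when run in the application container, the two containers likely do not share a process namespace. Keep in mind that /proc/<pid>/exe reports the resolved path of the binary, so a target path that is itself a symlink would not match here.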