open-telemetry / opentelemetry-operator

Kubernetes Operator for OpenTelemetry Collector
Apache License 2.0

auto-instrumentation: can't preserve ownership of initcontainer volume #2616

Open leofvo opened 9 months ago

leofvo commented 9 months ago

Component(s)

instrumentation

What happened?

Description

I am using Python auto-instrumentation in a Kubernetes cluster. I have installed the operator and the Instrumentation CRD (all up to date). I have deployed about six services, all using FastAPI, and one of them isn't sending metrics or traces to my collector. That service uses the same collector, the same operator, the same namespace, and the same Instrumentation CR as the others.

Steps to Reproduce

Expected Result

The init container starts and finishes without any issues.

Actual Result

When my pod starts, the init container launches and logs error messages like this one: cp: can't preserve ownership of '/otel-auto-instrumentation-python/./zipp-3.17.0.dist-info/top_level.txt': Operation not permitted

I can't get traces or metrics exported to my collector.

Here is a screenshot of the init container logs:

[image]

Kubernetes Version

v1.28.5-eks-5e0fdde

Operator version

v0.93.0

Collector version

v0.90.1

Environment information

app.kubernetes.io/component: controller-manager
app.kubernetes.io/instance: opentelemetry-operator
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: opentelemetry-operator
app.kubernetes.io/version: 0.93.0
argocd.argoproj.io/instance: observability
helm.sh/chart: opentelemetry-operator-0.47.0

Log output

No response

Additional context

No response

leofvo commented 9 months ago

Update: I tried raising the volume size limit and I still have the issue.

It doesn't seem to be related to the available storage limit.

illrill commented 9 months ago

Can you post the application's Deployment manifest?

I've faced the same issue with the Node.js instrumentation, and I was able to pinpoint it to the container setting runAsNonRoot: true (this produces the preserve-ownership errors) versus runAsNonRoot: false (everything works).

leofvo commented 9 months ago

I had runAsNonRoot: true. I removed it and now my traces are collected, but I still get the error message in my init container logs. Thanks for the help!

smallc2009 commented 8 months ago

This issue is also related to mine, #2726. It looks like these messages don't affect trace collection (see #1013), but we still need a proper way to avoid them.
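For anyone wondering how a "can't preserve ownership" message can show up while the files still get copied, here is a minimal Python sketch. It assumes the init container uses an ownership-preserving copy (something like cp -a), which this thread does not show. The names copy_like_cp_a and deny_chown are hypothetical helpers made up for illustration, not operator code. The sketch copies the file content first, then tries to restore the owner, and turns a refused chown into a warning instead of a failure.

```python
import os
import shutil
import tempfile


def copy_like_cp_a(src, dst, chown=None):
    """Copy src to dst, then try to preserve the owner (hypothetical helper).

    Returns a list of warning strings; an empty list means the owner was kept.
    """
    if chown is None:
        chown = os.chown  # real system call by default (Unix only)
    warnings = []
    shutil.copyfile(src, dst)  # copying the data needs no special privilege
    st = os.stat(src)
    try:
        # Changing the owner to another user normally requires root.
        chown(dst, st.st_uid, st.st_gid)
    except PermissionError:
        warnings.append(
            f"can't preserve ownership of '{dst}': Operation not permitted"
        )
    return warnings


def deny_chown(path, uid, gid):
    """Stand-in for chown as seen by a non-root user (assumption for the demo)."""
    raise PermissionError(1, "Operation not permitted")  # errno 1 is EPERM


def demo():
    """Copy one small file with chown denied; return (copied_text, warnings)."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "top_level.txt")
        dst = os.path.join(tmp, "copy.txt")
        with open(src, "w") as f:
            f.write("zipp\n")
        warnings = copy_like_cp_a(src, dst, chown=deny_chown)
        with open(dst) as f:
            return f.read(), warnings


if __name__ == "__main__":
    copied, warnings = demo()
    print(repr(copied))
    for w in warnings:
        print("warning:", w)
```

In this sketch the copied content is intact and only the ownership step is refused, which would be consistent with traces still being collected despite the log noise. Whether the real init container behaves the same way depends on the actual copy command and the pod's security context.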

iblancasa commented 8 months ago

I would say #2695 should fix this issue