open-telemetry / opentelemetry-operator

Kubernetes Operator for OpenTelemetry Collector
Apache License 2.0
1.2k stars 438 forks

Block pod creation on error in opentelemetry-auto-instrumentation #1955

Open adecchi-2inno opened 1 year ago

adecchi-2inno commented 1 year ago

I installed opentelemetry-operator and configured opentelemetry-auto-instrumentations with the correct annotations. Sometimes some applications did not send telemetry. While debugging, I could find that the opentelemetry-auto-instrumentations sidecar was not injected, or sometimes opentelemetry-auto-instrumentations got an error, but in both cases the application started.

Is it possible to set some configuration in the operator's values.yaml so that the application does not start if the opentelemetry-auto-instrumentations container is not running, or if the container is missing as a sidecar?

TylerHelmuth commented 1 year ago

I could find that the opentelemetry-auto-instrumentations sidecar was not injected

Couple questions:

opentelemetry-auto-instrumentations got an error

Can you share the error? If the initContainer failed to start then the pod should not have started either.

adecchi-2inno commented 1 year ago

@TylerHelmuth Thank you for taking the time to read this.

  1. We are using auto-instrumentation with Java and NodeJS applications.
  2. I had two scenarios: in the first, the operator was down; in the second, the operator was running and idle.

The initContainer was not injected in the PODs when the operator was down.

When the operator was running, the initContainer ran with exit code 0, but I got the following errors:

  1. reason="ContainerCannotRun"
  2. reason="DeadlineExceeded"
  3. reason="Error"
  4. reason="OOMKilled"
  5. reason="Evicted"
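
For reference, reasons like the ones listed above can be read from the pod's status. The sketch below is a minimal illustration, assuming the pod is available as a dict (for example parsed from `kubectl get pod <name> -o json`). It collects the pod-level `status.reason` and any non-successful `terminated` entries under `status.initContainerStatuses`, including `lastState`, which holds the previous attempt of a restarted container. The function name `init_container_failures` is hypothetical. An earlier failed attempt followed by a successful retry would be one possible way for exit code 0 and a reason such as OOMKilled to appear together, but the thread does not confirm that this is what happened here.

```python
def init_container_failures(pod):
    """Return (source, reason, exit_code) tuples for failures visible in a pod's status.

    `pod` is a pod object as a plain dict, e.g. parsed from `kubectl get pod -o json`.
    """
    status = pod.get("status") or {}
    findings = []

    # Pod-level reasons such as "Evicted" are reported on status.reason.
    if status.get("reason"):
        findings.append(("<pod>", status["reason"], None))

    for cs in status.get("initContainerStatuses") or []:
        # "state" is the current attempt, "lastState" the previous one (if any).
        for key in ("state", "lastState"):
            term = (cs.get(key) or {}).get("terminated")
            if not term:
                continue
            failed = (
                term.get("exitCode", 0) != 0
                or term.get("reason", "Completed") != "Completed"
            )
            if failed:
                findings.append((cs["name"], term.get("reason"), term.get("exitCode")))
    return findings
```

An empty result means no failure is recorded in the status. It does not show whether the init container was injected in the first place.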

My intention is that the application does not start when the initContainer hits any error while running, or when the initContainer is not added to the pod for any reason. I am not sure, but I think it is related to the allowed field: https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#response But I am not sure, because the documentation shows that it is enabled by default.
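
To make the role of the `allowed` field concrete, here is a minimal sketch of the decision a separate validating admission webhook could make. It is not something the operator provides. Validating webhooks run after mutating ones, so the pod they see already contains any injected init container. The function names are hypothetical, and the init container name prefix `opentelemetry-auto-instrumentation` is an assumption that should be checked against the operator version in use. Only the Java and NodeJS annotations mentioned in this thread are considered. Note that `allowed` only matters when a webhook actually answers. What happens when a webhook cannot be reached is governed by the `failurePolicy` field (`Ignore` or `Fail`) of the webhook configuration.

```python
# Annotations that request init-container based injection (Java and NodeJS only).
INJECT_ANNOTATIONS = (
    "instrumentation.opentelemetry.io/inject-java",
    "instrumentation.opentelemetry.io/inject-nodejs",
)
# Assumption: injected init containers have a name starting with this prefix.
INIT_NAME_PREFIX = "opentelemetry-auto-instrumentation"


def wants_injection(pod):
    """True if the pod carries an inject annotation whose value is not "false"."""
    annotations = (pod.get("metadata") or {}).get("annotations") or {}
    return any(
        str(annotations[key]).lower() != "false"
        for key in INJECT_ANNOTATIONS
        if key in annotations
    )


def has_injected_init_container(pod):
    """True if any init container name matches the assumed prefix."""
    inits = (pod.get("spec") or {}).get("initContainers") or []
    return any(c.get("name", "").startswith(INIT_NAME_PREFIX) for c in inits)


def review(admission_review):
    """Build an AdmissionReview response that denies annotated but uninstrumented pods."""
    request = admission_review["request"]
    pod = request["object"]
    allowed = (not wants_injection(pod)) or has_injected_init_container(pod)
    response = {"uid": request["uid"], "allowed": allowed}
    if not allowed:
        response["status"] = {
            "code": 403,
            "message": "auto-instrumentation requested but no init container was injected",
        }
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

A check like this only rejects pods at creation time. It does not react to an init container that is injected and then fails at runtime.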

frzifus commented 1 year ago

The initContainer was not injected in the PODs when the operator was down.

Do you mean initContainer or SidecarContainer?

adecchi-2inno commented 1 year ago

I am using it in a Java application, so it is an initContainer.

dakr0013 commented 3 months ago

Same problem here. I use Java auto-instrumentation. Sometimes the otel operator is down. If a pod is deployed while the operator is down, it does not get the initContainer injected and the application starts anyway. Now I have an application running without metrics or logs available.

Can I somehow make sure that a pod will fail to start if it has the auto instrumentation annotation but does not have instrumentation injected?

Or is there a way to configure the operator to restart deployments if it detects that there is an auto-instrumentation annotation but no actual instrumentation (in the case of Java, an init container) injected?
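
As an illustration of the detection half of this question, the sketch below scans a pod list and reports pods that carry an inject annotation but have no matching init container. It is a standalone check, not an operator feature. The function names are hypothetical, and the init container name prefix is an assumption to verify against the operator version in use. The input is expected to be the JSON text produced by `kubectl get pods --all-namespaces -o json`.

```python
import json

# Annotations that request init-container based injection (Java and NodeJS only).
INJECT_ANNOTATIONS = (
    "instrumentation.opentelemetry.io/inject-java",
    "instrumentation.opentelemetry.io/inject-nodejs",
)
# Assumption: injected init containers have a name starting with this prefix.
INIT_NAME_PREFIX = "opentelemetry-auto-instrumentation"


def is_uninstrumented(pod):
    """True if injection was requested via annotation but no init container is present."""
    annotations = (pod.get("metadata") or {}).get("annotations") or {}
    requested = any(
        str(annotations[key]).lower() != "false"
        for key in INJECT_ANNOTATIONS
        if key in annotations
    )
    inits = (pod.get("spec") or {}).get("initContainers") or []
    injected = any(c.get("name", "").startswith(INIT_NAME_PREFIX) for c in inits)
    return requested and not injected


def find_uninstrumented(pod_list_json):
    """Return sorted "namespace/name" strings for pods that were not instrumented."""
    pod_list = json.loads(pod_list_json)
    found = []
    for pod in pod_list.get("items", []):
        if is_uninstrumented(pod):
            meta = pod["metadata"]
            found.append(f'{meta.get("namespace", "default")}/{meta["name"]}')
    return sorted(found)
```

Pods reported this way pass through admission again when they are recreated, for example with `kubectl rollout restart deployment/<name>` for pods managed by a Deployment, so they can pick up the init container if the operator is available at that point.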