Open santoshkashyap opened 1 year ago
Is the hibernation shutting down all pods? If so, I would say that the OTel operator starts after the application pods. (The OTel operator also uses a mutating admission webhook to install the auto-instrumentation.) Is there a way you could control the starting order of the pods, e.g. by giving the infra/OTel operator pods a higher priority?
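One way to do this could be a Kubernetes `PriorityClass` referenced from the operator's pod template. This is only a sketch; the class name and priority value are illustrative, not something from this thread:

```yaml
# Hypothetical PriorityClass; name and value are illustrative.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: otel-operator-priority
value: 1000000
globalDefault: false
description: "Schedule the OTel operator before application workloads."
```

It would then be referenced via `spec.template.spec.priorityClassName: otel-operator-priority` in the operator's Deployment. Note that priority only affects scheduling order and preemption, not readiness order.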
Thanks for the pointer. I will verify this and update you again tomorrow after another hibernation.
> Is the hibernation shutting down all pods? If that is the case I would say that the OTEL operator starts after the application pods.

Yes, on our cluster this seems to be the case: the OTel operator starts after the application pods. To mitigate this issue, I created a PriorityClass and assigned it to the `opentelemetry-operator-controller-manager` pod. I will update again tomorrow, after cluster hibernation, on whether this approach works.
@santoshkashyap any news on this ticket?
Can we close it?
As a fix maybe we could set the priority class by default?
Unfortunately, this still doesn't seem to work. I assigned a higher priority to the operator; the application pods still have no priority class assigned, so they default to `0`. With this setup, I still see application pods in the `Running` state while the OTel operator is still in `ContainerCreating`. It also seems that even though the OTel operator is scheduled early, it takes some time to complete container creation.
A workaround we are discussing is some kind of CronJob that runs daily and does a `kubectl rollout restart` of the application deployments after resumption. Meanwhile, if there is anything I can try, please let me know.
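Such a workaround could be sketched as a CronJob like the one below. The names, namespace, image, and schedule are all illustrative (not from this thread), and the referenced ServiceAccount would need RBAC permission to get and patch the target Deployments:

```yaml
# Hypothetical CronJob that restarts an application Deployment shortly after
# the cluster resumes from hibernation. Names and schedule are illustrative.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: restart-after-resume
spec:
  schedule: "30 6 * * 1-5"   # shortly after the cluster wakes up
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: deployment-restarter
          restartPolicy: OnFailure
          containers:
            - name: kubectl
              image: bitnami/kubectl:latest
              command:
                - /bin/sh
                - -c
                - kubectl rollout restart deployment/my-app -n my-namespace
```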
@santoshkashyap is this still a problem? We've refactored how reconciliation works, which I think should help with this.
Hi @jaronoff97, yes, this issue still exists in version 0.96.0. Feel free to ping me if you need any assistance in resolving this issue.
@M1lk4fr3553r do you have an easy way to reproduce this? I run the operator locally on a kind cluster with auto-instrumentation, and it idles and wakes up fine.
I have created this chart to show the issue. Once you deploy the chart, you will notice that the pod created from `deployment-to-instrument` has not been injected. This occurs because the pod is created before the operator pod is ready to inject into other pods (this behavior is documented here). Ideally, there should be a way for the operator to restart pods that should have been injected but weren't.
@M1lk4fr3553r this is a limitation of our current webhook configuration. Right now we only get the injection events on pod creation (see here) and I'm not sure the best way to get around that. The Istio operator functions the same way, I wonder if they have a way of solving this issue... I'll ask around and see if there's anything we can do here.
Having the operator delete arbitrary Pods sounds like a dangerous capability that I'd rather not add unless we have no other choice.
If you'd like your Pods to wait until the operator starts and is able to inject instrumentation, you can set the webhook `failurePolicy` to `Fail`. The Pod will then be rejected by the API server, and its controller will keep retrying until it succeeds.

This is a dangerous setting: by default it will reject ALL Pods, the operator itself included. If you go down this path, please make sure to also set an `objectSelector` on the webhook to exempt your critical system services.
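Applied to the operator's pod-mutating webhook, that combination could look roughly like the fragment below. The webhook name and label selector are illustrative assumptions, not values confirmed in this thread; the operator Helm chart exposes equivalent settings:

```yaml
# Sketch of a MutatingWebhookConfiguration fragment; names are illustrative.
webhooks:
  - name: mpod.example.io
    failurePolicy: Fail      # reject Pods while the operator is unavailable...
    objectSelector:          # ...but never block the operator or system Pods
      matchExpressions:
        - key: app.kubernetes.io/name
          operator: NotIn
          values:
            - opentelemetry-operator
```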
> Having the operator delete arbitrary Pods sounds like a dangerous capability that I'd rather not add unless we have no other choice.

I would not simply delete the pods; I was thinking of `rollout restart`ing the deployment. That way a new pod spins up first, and there should be no risk of downtime. Also, in any case, this should be an option that is off by default, since I doubt that anyone shuts down their production cluster every day. For development and integration clusters, however, it does not seem uncommon to shut them down outside working hours to save money.
I would suggest trying out the webhook settings first, since that seems like a more idiomatic solution to your problem. If you want a rolling restart of your Deployments/StatefulSets/DaemonSets, you can always create a Job with a simple Go program (or even a bash script) that waits until the operator is ready, and then takes care of the restarts. You then have control over what exactly happens to your workloads and in which order.
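A one-shot Job along those lines could be sketched as follows. The workload names, namespace, and image are illustrative assumptions, and the ServiceAccount would need permission to get the operator's rollout status and to patch the target Deployments:

```yaml
# Hypothetical one-shot Job: wait until the operator is rolled out, then
# rolling-restart the workloads that need re-injection. Names are illustrative.
apiVersion: batch/v1
kind: Job
metadata:
  name: reinject-after-resume
spec:
  template:
    spec:
      serviceAccountName: deployment-restarter
      restartPolicy: OnFailure
      containers:
        - name: kubectl
          image: bitnami/kubectl:latest
          command:
            - /bin/sh
            - -c
            - |
              # Block until the operator deployment is fully rolled out.
              kubectl rollout status deployment/opentelemetry-operator-controller-manager \
                -n opentelemetry-operator-system --timeout=10m
              # Then restart workloads in whatever order you control.
              kubectl rollout restart deployment/my-app -n my-namespace
```

This keeps the restart logic outside the operator, so you retain full control over which workloads are restarted and in what order.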
To my knowledge, https://github.com/Azure/AKS/issues/4002 currently prevents setting the `objectSelector` correctly on AKS through the Helm chart, which means that right now there is no reliable way to use auto-instrumentation and sidecar injection with the operator Helm chart on AKS. Also, I think the default settings in the Helm chart should prevent this issue, since it isn't obvious at first.
The way Dapr handles this (periodically checking for, and deleting, pods that are missing their injected sidecars) may not be perfect, but it works for me.
We have our K8s development clusters set to hibernate every day at the end of regular working hours; the cluster becomes active again the next day. We have set up the opentelemetry-operator on our cluster and configured the OpenTelemetry Collector as a DaemonSet, with the corresponding annotations on the pods (Java/NodeJS).

For Java:

For NodeJS:
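The annotation snippets from the original report were not preserved here; the operator's annotation-based injection typically looks like the following sketch (values assumed, not taken from this thread):

```yaml
# Hypothetical pod template annotations; the originals were not preserved.
metadata:
  annotations:
    # Java workloads:
    instrumentation.opentelemetry.io/inject-java: "true"
    # NodeJS workloads:
    instrumentation.opentelemetry.io/inject-nodejs: "true"
```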
With this setup, everything works fine. For example, for Java apps the Java agent is volume-mounted automatically; the agent instruments the application and ships traces to an OpenTelemetry Collector pod (created by the operator from an OTel CR). Finally, the collector pod ships the traces to our observability backend service. However, when the workload resumes the next day after hibernation, everything seems to be lost (see screenshot below), and I am not sure why. There is not much information in the application logs, the OpenTelemetry daemon pod logs, or even the `opentelemetry-operator-controller-manager` pod in the `opentelemetry-operator-system` namespace.

Container spec before hibernation:

After resumption from hibernation: the OpenTelemetry setup is lost.

Thanks in advance!!!