Closed: andresm53 closed this issue 9 months ago
Hi @jaronoff97, I'm interested in this issue. If `UpgradeStrategy` is expected in the `InstrumentationSpec`, I can help with this, using upgrade logic similar to what is already implemented for `OpenTelemetryCollector`.
> The problem is that during OpenTelemetry Operator automatic upgrades, the cpu limits are added back again
I didn't find a code occurrence where the CPU limit is defaulted when it is nil. The limits are defaulted only when they are missing for both CPU and memory: https://github.com/open-telemetry/opentelemetry-operator/blob/f2b5e3c6c9bbc626af32101b2a69c22c6e7c732e/apis/v1alpha1/instrumentation_webhook.go#L106
Maybe you can try to set the CPU limit to zero; the operator should not override it. If it is set to 0, it means no resource limit for the container. In particular, if you set limits without specifying requests, Kubernetes defaults the requests to the same values as the limits.
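A sketch of that suggestion (whether a zero cpu limit really behaves as "no limit" is the commenter's claim; the other values are taken from the config quoted later in the thread):

```yaml
resources:
  limits:
    cpu: "0"       # suggested workaround: a set (non-nil) value the webhook won't default over
    memory: 64Mi
  requests:
    cpu: 50m
    memory: 64Mi
```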
@changexd sure, if you want to take a look here that would be great. I think Pavol's suggestion is accurate; maybe start by testing that?
@andresm53 Could you provide more detailed steps to reproduce this issue?
I've tried several ways to reproduce the issue, but it all leads to what @pavolloffay mentioned: the instrumentation resource itself is only defaulted when its `resources.limits` is nil, meaning that after the upgrade nothing in the instrumentation changes with this configuration:
```yaml
resources:
  limits:
    memory: 64Mi
  requests:
    cpu: 50m
    memory: 64Mi
```
@changexd I just tried again, and I wasn't able to repro the issue either. What I did was (this is OpenShift 4.12):

1. Install the OpenTelemetry Operator v0.83.
2. Configure the instrumentation resource like you did:

   ```yaml
   resources:
     limits:
       memory: 64Mi
     requests:
       cpu: 50m
       memory: 64Mi
   ```

3. Upgrade the Operator to 0.84, then 0.85, and so on up to 0.90.
The cpu limit was not re-configured during or after the upgrade process.
So, going back to my original problem: something else must have re-configured the cpu limits, and it now seems it wasn't the Operator upgrade. I can't think of anything else in my environment, besides the Operator, that could have re-configured that...
Just in case, I tried again, and (again) I wasn't able to repro this. Sorry about that. I think I can close this issue as "not reproducible", if you agree.
Since this seems to be non-reproducible on both sides, we can probably close this issue. Feel free to open new issues in the future if you encounter any. Thank you : )
@changexd @andresm53 thank you both for working on this! :bow:
Component(s)
operator, instrumentation
Describe the issue you're reporting
Hi All,
My instrumentation resource looks like this:
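A sketch of the shape involved, matching the config quoted elsewhere in the thread, with the cpu limit (its value is an assumption) still present:

```yaml
resources:
  limits:
    cpu: 500m      # assumed value; this cpu limit is what the issue is about
    memory: 64Mi
  requests:
    cpu: 50m
    memory: 64Mi
```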
Those resource limits are used by the init container, and because it has cpu limits, my pod metrics chart in OpenShift looks like this:
Problem is, my application container doesn't have cpu limits, but as you can see in the screenshot above, OpenShift thinks it has (because the init container does) and reports the pod limit metric incorrectly. (Actually, it's not OpenShift's fault; it's just reporting the `kube_pod_resource_limit` metric.)

To fix this, I can just remove the cpu limit from the instrumentation resource:
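With the cpu limit removed, the block matches the config quoted elsewhere in this thread:

```yaml
resources:
  limits:
    memory: 64Mi
  requests:
    cpu: 50m
    memory: 64Mi
```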
The problem is that during OpenTelemetry Operator automatic upgrades, the cpu limits are added back again. Is there a way to prevent this? I tried setting `.Spec.UpgradeStrategy` to `none` in the instrumentation resource, but it doesn't support it.

Note, this issue doesn't affect my pod performance; as I mentioned, the app container doesn't have cpu limits. It's more of a cosmetic issue: it affects the OpenShift cpu metrics charts.
Thanks.