Closed zdtsw closed 1 month ago
Is this related to https://issues.redhat.com/browse/RHOAIENG-9806, as some preliminary stop-gap?
some data from testing in a large cluster psi-04
Wouldn't data from small clusters be more relevant for the actual value we aim for?
Do we need for example CPU requests at all? What functionality will suffer when the Operator becomes CPU-starved?
And vice versa -- do we need to lower the limit at all?
I would take it step by step: first see whether this gets the "large" cluster working, then fine-tune further to find the lower bound for the "small" cluster.
"Do we need for example CPU requests at all?" I am not sure I understand this question. Do you mean not setting requests.cpu at all? Then the operator pod would be the first to get throttled or evicted. Is this what we want?
Keeping a high "limit" (what we have now) would not do much harm, I would say, but it impacts k8s node selection. Of course, if we are talking about SNO, I guess that consideration does not apply: a lower or higher "limit" makes no difference.
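To make the requests question concrete, here is a minimal Python sketch of the QoS classes as the Kubernetes documentation describes them. It is only an illustration: `qos_class` is a hypothetical helper, not part of the operator or of any Kubernetes client, and the quantities in the usage lines are placeholders rather than the values proposed in this PR. It also compares quantities as plain strings, so "500m" and "0.5" would count as different here.

```python
from typing import Dict, List

# One container's "resources" stanza, e.g. {"requests": {...}, "limits": {...}}
Resources = Dict[str, Dict[str, str]]


def qos_class(containers: List[Resources]) -> str:
    """Return the QoS class a pod with these containers would be assigned."""
    any_set = False    # does any container set cpu/memory requests or limits?
    all_equal = True   # does every container have requests == limits for both?
    for c in containers:
        limits = c.get("limits", {})
        # A request that is not set defaults to the limit, when a limit exists.
        requests = {**limits, **c.get("requests", {})}
        for res in ("cpu", "memory"):
            if res in requests or res in limits:
                any_set = True
            if res not in limits or requests[res] != limits[res]:
                all_equal = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if all_equal else "Burstable"


# Placeholder values, for illustration only.
no_resources = [{}]
requests_below_limits = [{
    "requests": {"cpu": "100m", "memory": "256Mi"},
    "limits": {"cpu": "500m", "memory": "4Gi"},
}]
limits_only = [{"limits": {"cpu": "500m", "memory": "4Gi"}}]

print(qos_class(no_resources))
print(qos_class(requests_below_limits))
print(qos_class(limits_only))
```

As far as I understand it, dropping requests while keeping limits does not make the pod BestEffort, because the requests then default to the limits. Only a pod with neither requests nor limits lands in BestEffort, which is the class most exposed to eviction under node pressure.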
I agree with @adelton to use data from small clusters to set defaults. The linked Jira issue has data from the PSAP team.
TBH, when I started this PR, I did not know about this Jira ticket. It mainly came from some tests we did for another case. Then I recalled we had an old issue about improving resource utilization, so I submitted this PR after we finalized certain tests.
One thing on my mind after reading your comments: for ticket https://issues.redhat.com/browse/RHOAIENG-9806, should we use the same data from the perf test in ODH? I would assume this data was collected from a downstream build. We can use it to set the downstream values, but how do you feel about using the same values in ODH (if it is not just for the sake of keeping the code in sync)?
For the benefit of the folks who might not have access to the internal information, it might be useful to get the data from an ODH installation and share it here or in some other public place, so that the reasons for the numerical changes are documented. I would assume the numbers from ODH and downstream don't differ much, so whether we can use and publish the numbers we got for downstream really depends on whether they are considered internal-only or not.
Is this also related to https://issues.redhat.com/browse/RHOAIENG-494?
I don't think so, it is more related to https://issues.redhat.com/browse/RHOAIENG-9806
And specifically https://issues.redhat.com/browse/RHOAIENG-10889, it seems.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: adelton
related to https://issues.redhat.com/browse/RHOAIENG-9806