Closed irfanurrehman closed 1 year ago
This was difficult to find. Great catch, @irfanurrehman.
We append a pointer to the diffList
slice; the pointer stays the same, but the object it points to changes as the iteration proceeds. As a result, newPendingInvalidPods
returns a list that contains our clone pods, and we kill them by mistake.
@irfanurrehman btw, could you share how you check the api-server
audit log? I didn't see it in your comments on that JIRA. Thanks
I used this doc from openshift -> https://docs.openshift.com/container-platform/4.10/security/audit-log-view.html
Just a generic comment: can we change the log level from 5 to 4? I believe 4 is usually the highest logging level we ask customers to set for debugging purposes.
thanks for the suggestion @ading1977. Updated!
Intent
This fixes a bug in finding and deleting the pending pods that are created when the parent ReplicaSet is updated to use a dummy scheduler as part of a pod move. Fixes https://jsw.ibm.com/browse/TRB-42713
This also adds a few more logs to make the pod move steps clearer.
Background
This is a subtle bug caused by how Go treats a loop variable. Some explanation follows. The issue is in the existing loop that builds diffList: a range loop that declares pod1 is equivalent to declaring pod1 once outside the loop and assigning to it at the start of each iteration.
In other words, pod1 is a single variable declared for the whole loop, and each iteration simply updates the value held in that same variable, rather than pod1 being a new variable for each loop iteration.
This means that each element (address) added to diffList (
diffList = append(diffList, &pod1)
) points to the same loop variable, so every element ends up pointing to the value received in the last loop iteration.
The fix is as updated in this PR.
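The loop-variable behavior described above can be reproduced in a small standalone program. This is only a sketch: the Pod type and the collectShared, collectPerElement, and names helpers are hypothetical and not taken from this PR, and the remedy shown (taking the address of the slice element) is one common way to avoid the problem rather than necessarily the change made here. Go 1.22 and later give each iteration its own loop variable, so the sketch declares the shared variable explicitly to show the old behavior on any Go version.

```go
package main

import "fmt"

// Pod is a hypothetical stand-in for the real pod type; only the name matters here.
type Pod struct {
	Name string
}

// collectShared reproduces the bug: one variable is reused for every
// iteration, so every stored pointer refers to the same address.
// Before Go 1.22, `for _, pod1 := range pods` behaved like this.
func collectShared(pods []Pod) []*Pod {
	var diffList []*Pod
	var pod1 Pod // single variable shared by all iterations
	for i := range pods {
		pod1 = pods[i]
		diffList = append(diffList, &pod1)
	}
	return diffList
}

// collectPerElement takes the address of each slice element instead,
// so every stored pointer refers to a distinct pod.
func collectPerElement(pods []Pod) []*Pod {
	var diffList []*Pod
	for i := range pods {
		diffList = append(diffList, &pods[i])
	}
	return diffList
}

// names dereferences each pointer and returns the pod names in order.
func names(list []*Pod) []string {
	out := make([]string, 0, len(list))
	for _, p := range list {
		out = append(out, p.Name)
	}
	return out
}

func main() {
	pods := []Pod{{Name: "a"}, {Name: "b"}, {Name: "c"}}
	fmt.Println(names(collectShared(pods)))     // prints [c c c]
	fmt.Println(names(collectPerElement(pods))) // prints [a b c]
}
```

With the shared variable, all three entries report the last pod, which matches the symptom of the wrong pods being picked up for deletion. Copying the value into a fresh variable inside the loop body before taking its address would work as well.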
Testing
Tested manually with a sequential execution of 6 pods with the fix; the issues seen earlier are no longer observed.
old pods
new pods after the move
watch on events
watch on pods
pending pods deletion logic can still find pending pods and delete them if they are left behind
Checklist
These are the items that must be done by the developer and by reviewers before the change is ready to merge. Please strike out any items that are not applicable, but don't delete them.
[ ] Unit tests added / updated
[ ] Integration tests added / updated
[ ] Product sweep run and passed
[ ] Developer wiki updated (and linked to this description)
Audience
(@ mention any review/... groups or people that should be aware of this merge request)