Closed lukeweber closed 2 years ago
Welcome @lukeweber!
It looks like this is your first PR to kubernetes-sigs/cluster-api-provider-nested. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.
You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.
You can also check if kubernetes-sigs/cluster-api-provider-nested has its own contribution guidelines.
You may want to refer to our testing guide if you run into trouble with your tests not passing.
If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!
Thank you, and welcome to Kubernetes. :smiley:
Hi @lukeweber. Thanks for your PR.
I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.
Once the patch is verified, the new status will be reflected by the ok-to-test label.
I understand the commands that are listed here.
Apologies for the excessive pushes. I was trying to get my verified commit to work correctly.
/ok-to-test
@Fei-Guo or @charleszheng44 have either of you run into this before? We ran into it in some testing and want to make sure this doesn't break anything you have been using.
I don't know the use case of this optional field. Do you mind telling us when it is used? Changing this field requires Pod recreation, so what is the difference between flipping a boolean and simply adding a secret volume?
Per the original issue #231: if someone creates a pod with an optional secret and the secret is never created, the pod below should still be valid and runnable. However, the current check assumes all secret volumes are service account tokens and therefore must be synced before the pod is scheduled in the super cluster, so the pod will never be synced to the super cluster by vc-syncer, which is incorrect.
Example pod
apiVersion: v1
kind: Pod
metadata:
  name: optional-secret-test
spec:
  containers:
  - image: busybox
    name: test
    command: ['sh', '-c', 'while true; do echo "alive" && sleep 3600; done']
  restartPolicy: Always
  volumes:
  - name: optional-password
    secret:
      optional: true
      secretName: optional-password
Yes, I understand the logic, but I don't quite understand the use case, and I am curious whether someone is really using this field. Is this from a conformance test?
I don't believe it's from a conformance test; it's more a standard part of the pod contract that we just don't support today with vc-syncer. I know I have used this field on pods before to support optional configurations.
If the flag is true and the secret does not exist, we should be OK. The problem is when the flag is true and the secret exists: your change allows a race where the user creates the secret and the Pod in order in the VC, but the syncer creates the Pod in the super cluster without the secret volume, because the syncer cannot enforce a mutual synchronization order for objects in the VC. In my opinion, I'd rather claim we don't support this field. What do you think?
I don't think that matches how it was implemented upstream prior to this PR. Right now that code only checks whether the ServiceAccount token Secret exists in the super cluster before syncing; all other secret volumes are ignored, since those can be handled as eventually consistent, except for a ServiceAccount token Secret, and a ServiceAccount token Secret should never be optional. See the linked code below for where we skip over non-SA token secrets.
I personally think this is a feature we should support, given it's scoped pretty specifically to optional non-SA secrets. Then again, I might be misunderstanding the code.
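For context, here is a rough Go sketch of the pre-PR behavior described above; it is not the actual vc-syncer code, and isSATokenSecret and secretExistsInSuper are hypothetical stand-ins for whatever the syncer really does. The point is only that non-SA Secret volumes are skipped, so the ServiceAccount token Secret is the only thing that blocks syncing:

```go
package sketch

import corev1 "k8s.io/api/core/v1"

// isSATokenSecret stands in for however the syncer decides that a Secret
// volume refers to a ServiceAccount token Secret.
func isSATokenSecret(name string) bool {
	return false // placeholder
}

// secretExistsInSuper stands in for a lookup against the super cluster.
func secretExistsInSuper(namespace, name string) bool {
	return false // placeholder
}

// readyToSync mirrors the behavior described above: only ServiceAccount
// token Secrets must exist before the pod is synced; all other Secret
// volumes are treated as eventually consistent and skipped.
func readyToSync(pod *corev1.Pod) bool {
	for _, vol := range pod.Spec.Volumes {
		if vol.Secret == nil {
			continue // not a Secret volume
		}
		if !isSATokenSecret(vol.Secret.SecretName) {
			continue // non-SA Secrets are not checked here
		}
		if !secretExistsInSuper(pod.Namespace, vol.Secret.SecretName) {
			return false // block syncing until the SA token Secret exists
		}
	}
	return true
}
```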
I agree we have this small race problem in general for non-SA secrets, or for any other volume source with an optional field set to true.
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: Fei-Guo, lukeweber
The full list of commands accepted by this bot can be found here.
The pull request process is described here
What this PR does / why we need it:
Remove a check for optional secrets in the pod syncer. The pod resource syncer checks that all secrets defined in the volumes of the pod spec exist in the virtual cluster; the intent is to ensure that service account tokens exist before the pod is scheduled. The minimal change is to only verify secrets that are marked as required: if a secret is optional, we remove it from the check.
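A minimal sketch of the shape of this change, assuming a hypothetical secretExists lookup (not the real syncer helper): the only behavioral difference is that optional Secret volumes are skipped when verifying existence.

```go
package sketch

import corev1 "k8s.io/api/core/v1"

// secretExists stands in for the syncer's existing Secret lookup.
func secretExists(namespace, name string) bool {
	return false // placeholder
}

// requiredSecretsPresent only blocks syncing on Secrets that are required;
// volumes whose Secret source is marked optional are skipped entirely.
func requiredSecretsPresent(pod *corev1.Pod) bool {
	for _, vol := range pod.Spec.Volumes {
		src := vol.Secret
		if src == nil {
			continue // not a Secret volume
		}
		if src.Optional != nil && *src.Optional {
			continue // optional Secret: do not require it before syncing
		}
		if !secretExists(pod.Namespace, src.SecretName) {
			return false
		}
	}
	return true
}
```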
We should confirm that ignoring optional secrets doesn't compromise any of the security assumptions of the current model, specifically that a service account token is properly synced before a pod is created.
We already enforce this behavior with the syncer optionally setting AutomountServiceAccountToken on the synced pod spec:
Note: The negative case is already covered by tests that verify that a missing ServiceAccount secret will cause the syncer to fail. I'm open to adding more tests if we feel it's useful, but my judgement was not to add them.
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #231