aramase opened 1 year ago
/kind bug
/triage accepted
/assign
/remove-sig auth
Copying discussion from Slack:
> Perhaps we can use a k8s-prefixed key in the `csi.NodePublishVolumeRequest.secrets` field (that is appended with the secrets from `nodePublishVolumeSecretRef`).
>
> The next challenge would be how to roll out this change because the CSI tokens feature is already GA.
> Perhaps we can use a k8s-prefixed key in the `csi.NodePublishVolumeRequest.secrets` field (that is appended with the secrets from `nodePublishVolumeSecretRef`).
I was thinking the same while proposing reusing the `secrets` field. This will work.
> The next challenge would be how to roll out this change because the CSI tokens feature is already GA.
Should we add a feature gate to start off that allows the current behavior of setting tokens in `VolumeContext`, and set that to false by default in N+2 releases?

- v1.29: `ServiceAccountTokenInVolumeContext=true` allows setting the token in both `VolumeContext` and the `secrets` field.
- v1.30: `ServiceAccountTokenInVolumeContext=false` disabled by default, but can be enabled to rely on legacy behavior.
- v1.31: `ServiceAccountTokenInVolumeContext=false` and feature gate deprecated, so tokens will only be populated in `secrets`.
Do drivers need to opt in to putting the token data in the `secrets` field, or does that get transparently merged to the same location so any driver resolving values would get the token? If opt-in is required, a toggle could be added to `TokenRequest` to let them do that.
> do drivers need to opt-into putting the token data in the secret field, or does that get transparently merged to the same location so any driver resolving values would get the token?
Drivers read the SA tokens from the `VolumeContext` map today as part of their implementation. With the feature gate, I'm thinking we can gradually roll out the change, giving the implementors time to switch from reading tokens from `VolumeContext` to `secrets`. I think in some release we would want to stop populating the tokens in `VolumeContext`.
I don't think time is a sufficient gate to changing the behavior there... I think you need an opt-in to relocate the information.
Today it is opt-in by setting `CSIDriver.TokenRequests`. Perhaps a new field can be added to `TokenRequest` to control which CSI field the token will be passed down through.
`TokenRequests` is a list of `TokenRequest`. If we define a field at the `TokenRequest` level, it'll be too granular. Should we define something at the `CSIDriver` spec level instead?
Yeah, that could work.
I agree that dropping the token from the `VolumeContext` needs to be an opt-in behavior. If the opt-in is done through the `CSIDriver` API, it will be permanently ugly for backwards-compatibility reasons. The alternative is a proper CSI feature (a new capability) that kubelet could query and plugins could opt in to that way. The upside to fixing this is relatively low, because plugins can work around the problem individually, and any plugin that is likely to opt in to better behavior could also just implement the workaround.
Another CSI plugin impacted by this issue: https://github.com/kubernetes/kubernetes/issues/124759.
If the `TokenRequests` field is set in the `CSIDriver` spec, the tokens are generated and funneled from kubelet to the driver as part of the volume attributes map. The key for the tokens in the map is `csi.storage.k8s.io/serviceAccount.tokens`. The service account tokens can be used for workload identity in cloud providers. Especially in the case of the Secrets Store CSI Driver, these service account tokens are used to access an external secrets store to get secrets and mount them in the volume.

As these tokens are sensitive information, it doesn't feel right to include them in the volume context map alongside the pod name, namespace, and service account name, which are non-sensitive information. There is a dedicated `secrets` field in the proto, today used for `nodePublishSecretRef`, that would be more appropriate for tokens.

To provide more context on "why this issue": we had a CVE in the Secrets Store CSI Driver (ref: CVE-2023-2878) where the service account tokens were logged as part of the gRPC request in the driver. We use the protosanitizer in the driver to sanitize the requests before logging, but given volume context is not really considered a "secret", the tokens were skipped and logged. We got around this by building custom logging logic in the driver to strip the tokens, but this is something every driver would need to do to be careful and not log the tokens. A change could be made in protosanitizer to sanitize this, but given volume context is a map, either the entire map needs to be marked a secret or sanitizing has to be done on a field-by-field basis. Although this prompted me to create this issue, I think volume context is not really "secrets", and the appropriate field for sensitive SA tokens would be the `secrets` field.
/sig storage